Beamforming-based Speech Enhancement based on Optimal Ratio Mask

Author：

Ji, Qiang | Bao, Changchun (鲍长春) | Cheng, Rui

Indexed by：

CPCI-S

Abstract：

Speech enhancement in noisy and reverberant environments remains a challenging task. Acoustic beamforming with the minimum variance distortionless response (MVDR) criterion has been shown to be effective in this setting. The crucial issue in MVDR-based speech enhancement is obtaining accurate estimates of the speech and noise spatial covariance matrices (SCMs). To this end, time-frequency masking, a reliable way to estimate the SCMs, can improve the performance of the MVDR beamformer in speech enhancement. In this paper, an optimal ratio mask-based method for MVDR beamforming is proposed. Specifically, convolutional neural networks (CNNs) are used in the proposed method; they operate on the magnitude and phase components of the short-time Fourier transform (STFT) of the microphone signals to estimate the optimal ratio masks, and these masks are used to compute the SCMs from which the MVDR beamformer is constructed. Experiments are conducted on simulated data. The results show that the proposed method is more robust than the reference methods under adverse acoustic conditions.
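
The mask-to-beamformer step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the CNN that estimates the optimal ratio masks is not reproduced, the masks are assumed to be given as arrays in [0, 1], and the MVDR weights use the common reference-microphone formulation w = (Phi_n^-1 Phi_s) u / trace(Phi_n^-1 Phi_s), which the record does not confirm as the paper's choice. The function names, the `ref_mic` parameter, and the diagonal loading constant are hypothetical.

```python
import numpy as np

def masked_scm(stft, mask):
    """Mask-weighted spatial covariance matrix for each frequency bin.

    stft: complex array, shape (channels, freqs, frames)
    mask: real array in [0, 1], shape (freqs, frames)
    returns: complex array, shape (freqs, channels, channels)
    """
    # Mask-weighted sum of outer products x x^H over frames.
    scm = np.einsum("ft,cft,dft->fcd", mask, stft, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-10)
    return scm / norm[:, None, None]

def mvdr_weights(scm_speech, scm_noise, ref_mic=0, diag_load=1e-6):
    """MVDR weights per frequency (assumed reference-microphone form).

    returns: complex array, shape (freqs, channels)
    """
    n_ch = scm_noise.shape[-1]
    # Diagonal loading (an assumption) keeps the noise SCM invertible.
    power = np.trace(scm_noise, axis1=-2, axis2=-1).real / n_ch
    load = diag_load * power + 1e-10
    scm_noise = scm_noise + load[:, None, None] * np.eye(n_ch)
    # Solve Phi_n X = Phi_s instead of forming an explicit inverse.
    numerator = np.linalg.solve(scm_noise, scm_speech)
    trace = np.trace(numerator, axis1=-2, axis2=-1)
    return numerator[..., ref_mic] / (trace[:, None] + 1e-10)

def apply_beamformer(weights, stft):
    """Enhanced STFT y(f, t) = w(f)^H x(f, t), shape (freqs, frames)."""
    return np.einsum("fc,cft->ft", weights.conj(), stft)
```

In use, `masked_scm` would be called twice on the multichannel STFT, once with a speech mask and once with a noise mask, and the two results passed to `mvdr_weights`; how the paper derives the noise mask from the estimated ratio masks is not stated in this record.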

Keyword：

time-frequency mask; speech enhancement; neural networks; beamforming

Author Affiliations：

[1] [Ji, Qiang] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
[2] [Bao, Changchun] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
[3] [Cheng, Rui] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Reprint Author's Address：

[Ji, Qiang] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Email：

jigiang@emails.bjut.edu.cn
baochch@bjut.edu.cn
chengrui@emails.bjut.edu.cn

Related Articles：

Masking Estimation with Phase Restoration of Clean Speech for Monaural Speech Enhancement
2019，INTERSPEECH 2019
Beamforming-based speech enhancement based on optimal ratio mask
2019，2019 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2019
Multi-channel Speech Enhancement with Multiple-target GANs
2020，10th IEEE International Conference on Signal Processing, Communications and Computing (IEEE ICSPCC)
Joint ideal ratio mask and generative adversarial networks for monaural speech enhancement
2018，14th IEEE International Conference on Signal Processing, ICSP 2018

Source ：

CONFERENCE PROCEEDINGS OF 2019 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2019)

Year： 2019

Language： English

Cited Count：

WoS CC Cited Count： 0

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

Faculty of Information Technology
