Indexed in:
Abstract:
Speech enhancement in noisy and reverberant environments remains a challenging task. Acoustic beamforming with the minimum variance distortionless response (MVDR) criterion has been shown to be effective in this setting. The crucial issue in MVDR-based speech enhancement is obtaining accurate estimates of the speech and noise spatial covariance matrices (SCMs). Time-frequency masking is a reliable way to estimate the SCMs and can therefore improve the performance of the MVDR beamformer in speech enhancement. In this paper, an optimal ratio mask-based method for MVDR beamforming is proposed. Specifically, the proposed method uses convolutional neural networks (CNNs) that operate on the magnitude and phase components of the short-time Fourier transform (STFT) of the microphone signals to estimate the optimal ratio masks, and these masks are used to compute the SCMs from which the MVDR beamformer is constructed. Experiments are conducted on simulated data. The results show that the proposed method is more robust than the reference methods under adverse acoustic conditions.
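The mask-to-beamformer step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the masks are taken as given arrays (in the paper they come from the CNNs), the function names `masked_scm`, `mvdr_weights`, and `apply_beamformer` are hypothetical, and the reference-microphone MVDR formulation with diagonal loading is an assumption, since the abstract does not state which MVDR variant is used.

```python
import numpy as np


def masked_scm(stft, mask, eps=1e-8):
    """Mask-weighted spatial covariance matrix per frequency bin.

    stft: complex array of shape (F, T, M) = (freq bins, frames, mics).
    mask: real array of shape (F, T) with values in [0, 1].
    Returns a complex array of shape (F, M, M).
    """
    weighted = stft * mask[..., None]
    # Sum over frames of mask * x x^H for every frequency bin.
    scm = np.einsum("ftm,ftn->fmn", weighted, stft.conj())
    return scm / (mask.sum(axis=1)[:, None, None] + eps)


def mvdr_weights(scm_speech, scm_noise, ref_mic=0, diag_load=1e-6):
    """MVDR weights w = (Phi_n^-1 Phi_s) u / trace(Phi_n^-1 Phi_s).

    u selects the reference microphone. Diagonal loading of the noise
    SCM is an assumption added here for numerical stability.
    Returns a complex array of shape (F, M).
    """
    n_mics = scm_noise.shape[-1]
    trace_n = np.trace(scm_noise, axis1=1, axis2=2).real
    loaded = scm_noise + (diag_load * trace_n / n_mics)[:, None, None] * np.eye(n_mics)
    numerator = np.linalg.solve(loaded, scm_speech)  # Phi_n^-1 Phi_s, shape (F, M, M)
    denom = np.trace(numerator, axis1=1, axis2=2)
    return numerator[:, :, ref_mic] / (denom[:, None] + 1e-12)


def apply_beamformer(weights, stft):
    """Beamformer output y(f, t) = w(f)^H x(f, t), shape (F, T)."""
    return np.einsum("fm,ftm->ft", weights.conj(), stft)
```

Given a multichannel STFT `X` of shape (F, T, M) and an estimated speech mask `m_s`, one would call `masked_scm(X, m_s)` and `masked_scm(X, m_n)` with a noise mask `m_n`, then pass both SCMs to `mvdr_weights`. Using `m_n = 1 - m_s` is a common simplification and an assumption here, not something stated in the abstract.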
Keywords:
Corresponding author:
Source:
CONFERENCE PROCEEDINGS OF 2019 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2019)
Year: 2019
Language: English
Affiliated department: