Multi-channel Speech Enhancement Based on the MVDR Beamformer and Postfilter

Author：

Wang, Dujuan | Bao, Changchun (鲍长春)

Indexed by：

CPCI-S

Abstract：

Deep neural network (DNN) based ideal ratio mask (IRM) estimation methods have yielded good performance in monaural speech enhancement. Meanwhile, these methods have also shown considerable potential for beamforming and multichannel speech enhancement. It is crucial for the minimum variance distortionless response (MVDR) beamformer to estimate the covariance matrices of the speech and noise accurately. The accuracy of the estimated time-frequency (T-F) mask has a significant impact on the estimation of the covariance matrices. So, in this paper, a complex real and imaginary ratio mask (CRIRM) based MVDR beamformer for speech enhancement using a residual network is proposed. First, the real and imaginary masks of speech and noise are estimated by a residual neural network. After that, estimates of the speech and noise are obtained by using the estimated masks. Finally, the covariance matrices of speech and noise are estimated and applied in the MVDR beamformer. In addition, to further reduce residual noise interference, the output of the MVDR beamformer is processed by an end-to-end monaural speech enhancement module. Experiments show that the proposed method better improves the quality and intelligibility of the enhanced speech.
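
The mask-to-beamformer steps described in the abstract can be illustrated for a single frequency bin with the minimal sketch below. It is not the authors' implementation: the masks are taken as given (the paper estimates them with a residual network), the end-to-end postfilter is omitted, and the trace-normalized MVDR formula is one commonly used variant that the abstract does not confirm. All function names are hypothetical, and only the Python standard library is used so that the example stays self-contained.

```python
# Sketch of mask-based MVDR for ONE frequency bin (standard library only).
# All names are illustrative assumptions; none are taken from the paper.
# Data layout: frames[t][c] is the complex STFT value at frame t, channel c.

def apply_complex_mask(mask, mixture):
    """Complex ratio mask: (Mr + j*Mi) * Y for every frame and channel."""
    return [[m * y for m, y in zip(m_row, y_row)]
            for m_row, y_row in zip(mask, mixture)]

def covariance(frames):
    """Spatial covariance Phi = (1/T) * sum_t x_t x_t^H."""
    n_frames, n_ch = len(frames), len(frames[0])
    return [[sum(x[i] * x[j].conjugate() for x in frames) / n_frames
             for j in range(n_ch)] for i in range(n_ch)]

def solve(a, b):
    """Solve A X = B (complex, square A) by Gauss-Jordan with pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mvdr_weights(phi_ss, phi_nn, ref=0):
    """Assumed variant: w = (Phi_nn^-1 Phi_ss) u_ref / trace(Phi_nn^-1 Phi_ss)."""
    g = solve(phi_nn, phi_ss)
    trace = sum(g[i][i] for i in range(len(g)))
    return [g[i][ref] / trace for i in range(len(g))]

def beamform(weights, frames):
    """Beamformer output w^H y_t for every frame."""
    return [sum(w.conjugate() * y for w, y in zip(weights, x)) for x in frames]
```

In this sketch, `apply_complex_mask` would be called once with the speech mask and once with the noise mask, the two results passed to `covariance`, and the resulting matrices passed to `mvdr_weights`. With a rank-one speech covariance, the weights leave the speech component at the reference microphone (`ref`) undistorted, which is the distortionless constraint the MVDR name refers to.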

Keyword：

residual neural network; postfilter; speech enhancement; beamforming; real and imaginary masks

Author Community：

[ 1 ] [Wang, Dujuan]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
[ 2 ] [Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China

Reprint Author's Address：

[Wang, Dujuan]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China

Email：

wangdujuan@emails.bjut.edu.cn | baochch@bjut.edu.cn

Related Keywords：

Multi-channel Speech Enhancement with Multiple-target GANs
2020，10th IEEE International Conference on Signal Processing, Communications and Computing (IEEE ICSPCC)
Beamforming-based Speech Enhancement based on Optimal Ratio Mask
2019，IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC)
GEV Beamforming with BAN Integrating LPS Estimation and Post-filtering
2020，10th IEEE International Conference on Signal Processing, Communications and Computing (IEEE ICSPCC)
Speech Enhancement Based on Binaural Sound Source Localization and Cosh Measure Wiener Filtering
2021，CIRCUITS SYSTEMS AND SIGNAL PROCESSING

Source ：

2020 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2020)

Year： 2020

Language： English

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count：

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

Affiliated Colleges：

Faculty of Information Technology
