Masking Estimation with Phase Restoration of Clean Speech for Monaural Speech Enhancement

Author：

Wang, Xianyun | Bao, Changchun

Indexed by：

CPCI-S EI Scopus

Abstract：

Deep neural networks (DNNs) have become a popular means of separating target speech from noisy speech because they learn the mapping between noisy speech and the training target well. In DNN-based methods, the time-frequency (T-F) mask commonly used as the training target has a significant impact on the performance of speech restoration. However, the T-F mask generally modifies only the magnitude spectrum of the noisy speech and leaves the phase spectrum unchanged during enhancement. Recent studies have shown that incorporating phase spectrum information into the T-F mask can effectively improve the perceptual quality of the enhanced speech. In this paper, we therefore present two T-F masks that simultaneously enhance the magnitude and phase of the speech spectrum, based on the assumption that the real and imaginary parts of the speech spectrum are uncorrelated, and use them as the training targets of the DNN model. Experimental results show that, compared with the reference methods, the proposed method improves speech quality under different signal-to-noise ratio (SNR) conditions.
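
The abstract does not give the definitions of the two proposed masks, so the sketch below is illustrative only and is not the paper's formulation. Assuming random complex arrays as stand-ins for STFT coefficients, it contrasts an oracle magnitude-ratio mask, which keeps the noisy phase, with hypothetical oracle masks applied separately to the real and imaginary parts, which can change both magnitude and phase. The function names, the `eps` guard, and the test data are all assumptions made for this example.

```python
import numpy as np

EPS = 1e-8  # assumed guard against division by zero


def magnitude_mask_enhance(noisy, clean):
    """Oracle magnitude-ratio mask: rescales |noisy| and keeps the noisy phase."""
    mask = np.abs(clean) / (np.abs(noisy) + EPS)
    return mask * noisy


def component_mask_enhance(noisy, clean):
    """Hypothetical oracle masks on the real and imaginary parts separately."""
    def safe_ratio(num, den):
        # Replace near-zero denominators so the ratio stays finite.
        den_safe = np.where(np.abs(den) < EPS, EPS, den)
        return num / den_safe

    mask_real = safe_ratio(clean.real, noisy.real)
    mask_imag = safe_ratio(clean.imag, noisy.imag)
    return mask_real * noisy.real + 1j * mask_imag * noisy.imag


# Synthetic stand-ins for STFT coefficients (frequency bins x frames).
rng = np.random.default_rng(0)
shape = (64, 20)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noisy = clean + noise

mag_est = magnitude_mask_enhance(noisy, clean)
comp_est = component_mask_enhance(noisy, clean)

# Mean squared error of each estimate against the clean coefficients.
mag_err = np.mean(np.abs(mag_est - clean) ** 2)
comp_err = np.mean(np.abs(comp_est - clean) ** 2)
print("magnitude-only mask error:", mag_err)
print("real/imaginary mask error:", comp_err)
```

Both masks here are computed from the clean signal, so they play the role of training targets rather than network outputs. The magnitude-only estimate matches the clean magnitude but inherits the noisy phase, which is the limitation the abstract describes.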

Keyword：

time-frequency mask Speech enhancement DNN Phase restoration

Author Community：

[ 1 ] [Wang, Xianyun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
[ 2 ] [Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Reprint Author's Address：

Email：

b201402001@emails.bjut.edu.cn | baochch@bjut.edu.cn

Related Keywords：

Beamforming-based Speech Enhancement based on Optimal Ratio Mask
2019，IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC)
Joint Ideal Ratio Mask and Generative Adversarial Networks for Monaural Speech Enhancement
2018，14th IEEE International Conference on Signal Processing (ICSP)
Joint ideal ratio mask and generative adversarial networks for monaural speech enhancement
2018，14th IEEE International Conference on Signal Processing, ICSP 2018
Phase Unwrapping Based Speech Enhancement
2019，Annual Summit and Conference of the Asia-Pacific-Signal-and-Information-Processing-Association (APSIPA ASC)

Source ：

INTERSPEECH 2019

ISSN： 2308-457X

Year： 2019

Page： 3188-3192

Cited Count：

WoS CC Cited Count：

SCOPUS Cited Count： 2

ESI Highly Cited Papers on the List： 0
