Abstract:
In this letter, a novel weighted mean square error (WMSE) is proposed to improve the DNN-based mask approximation method for speech enhancement, in which the weighting is a power of the noisy spectrum amplitude (NSA). Power exponents of 0 and 2 correspond, respectively, to ideal amplitude masking (IAM) without any clipping and to indirect mapping (IM) on the short-time spectral amplitude (STSA). Tests show that the exponent strongly affects the enhanced spectrum and the quality of the enhanced signal. The experimental results also show that, for phase-unaware masking, the best-performing weighting is the NSA with a power exponent of 1, which yields better restoration of the harmonic structure. Compared with MSE-based mask approximation methods, the objective function with the WMSE on the NSA (WMSE-NSA) improves the perceptual evaluation of speech quality (PESQ) score by 0.1 and the short-time objective intelligibility (STOI) score by 1.7% on average. © 1994-2012 IEEE.
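The weighting described in the abstract can be illustrated with a minimal sketch. It assumes the loss has the form mean(|Y|^p · (M̂ − |S|/|Y|)²), which is inferred from the abstract rather than taken from the letter itself. The function name `wmse_nsa`, its parameters, and the toy values are hypothetical. With p = 0 the loss reduces to a plain MSE on the unclipped amplitude mask, and with p = 2 it equals the MSE between the masked noisy amplitude and the clean amplitude (indirect mapping).

```python
def wmse_nsa(mask_est, clean_amp, noisy_amp, p=1.0, eps=1e-8):
    """Weighted MSE between estimated and ideal amplitude masks.

    Each time-frequency bin is weighted by noisy_amp ** p. This weight
    form is an assumption inferred from the abstract, not the letter's
    exact objective.
    """
    total = 0.0
    for m_hat, s, y in zip(mask_est, clean_amp, noisy_amp):
        iam = s / (y + eps)                 # ideal amplitude mask, no clipping
        total += (y ** p) * (m_hat - iam) ** 2
    return total / len(mask_est)


# Toy flattened time-frequency bins (made-up values for illustration).
noisy = [1.0, 2.0, 0.5, 4.0]   # noisy spectrum amplitude |Y|
clean = [0.8, 1.0, 0.1, 3.0]   # clean spectrum amplitude |S|
mask = [0.7, 0.6, 0.3, 0.7]    # mask estimated by a hypothetical DNN

loss_iam = wmse_nsa(mask, clean, noisy, p=0.0)  # plain MSE on the mask
loss_nsa = wmse_nsa(mask, clean, noisy, p=1.0)  # exponent 1 (WMSE-NSA setting)
loss_im = wmse_nsa(mask, clean, noisy, p=2.0)   # matches MSE on amplitudes

# Indirect mapping computed directly on the amplitudes, for comparison.
im_direct = sum(
    (m * y - s) ** 2 for m, s, y in zip(mask, clean, noisy)
) / len(mask)
```

In this sketch, raising p shifts the emphasis toward high-energy bins of the noisy spectrum. That is one plausible reading of why the choice of exponent affects the enhanced spectrum.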
Source:
IEEE Signal Processing Letters
ISSN: 1070-9908
Year: 2021
Volume: 28
Page: 618-622
3.900 (JCR@2022)
ESI HC Threshold:87
JCR Journal Grade:2
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count: 4
ESI Highly Cited Papers on the List: 0