Abstract:
A Gaussian Mixture Model (GMM) based speech enhancement method operating in the compressed domain is proposed for the ITU-T G.722.2 wideband speech codec. It takes full advantage of prior knowledge of the Immittance Spectral Frequencies (ISFs) of clean speech. First, a GMM is adopted to model the joint probability density of feature vectors composed of the ISFs of noisy speech and of clean speech, together with the corresponding gain scaling factor. Second, an optimal Bayesian estimate of the clean-speech feature parameters is obtained under the minimum mean square error (MMSE) criterion. For compatibility with the Discontinuous Transmission (DTX) mode, when a Silence Insertion Descriptor (SID) frame is received, the logarithmic energy is attenuated and the ISFs are left unchanged. Furthermore, if an erased frame is received, the bit stream is left unchanged and the proposed method is applied to the recovered parameters to update the memory. The evaluation is conducted according to ITU-T G.160. The results indicate that, compared with the reference method, the proposed method achieves a larger noise level reduction and better objective speech quality, while the SNR improvement remains acceptable.
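The MMSE step described above can be sketched with the standard conditional-mean result for a joint GMM. The notation is an assumption made for illustration and is not taken from the paper: $\mathbf{y}$ denotes the observed noisy-speech features, $\mathbf{x}$ the clean-speech features to be estimated, $K$ the number of mixture components, and $w_k$, $\boldsymbol{\mu}_k$, $\boldsymbol{\Sigma}_k$ the component weights, means, and covariances. The exact partition of the feature vector and any codec-specific constraints used by the authors may differ.

```latex
% Joint GMM over the stacked vector z = [y; x] (assumed partition)
p(\mathbf{z}) = \sum_{k=1}^{K} w_k \,
  \mathcal{N}\!\left(\mathbf{z};\, \boldsymbol{\mu}_k,\, \boldsymbol{\Sigma}_k\right),
\qquad
\boldsymbol{\mu}_k =
  \begin{bmatrix} \boldsymbol{\mu}_k^{y} \\ \boldsymbol{\mu}_k^{x} \end{bmatrix},
\quad
\boldsymbol{\Sigma}_k =
  \begin{bmatrix}
    \boldsymbol{\Sigma}_k^{yy} & \boldsymbol{\Sigma}_k^{yx} \\
    \boldsymbol{\Sigma}_k^{xy} & \boldsymbol{\Sigma}_k^{xx}
  \end{bmatrix}

% Posterior probability of component k given the noisy observation
\gamma_k(\mathbf{y}) =
  \frac{w_k \, \mathcal{N}\!\left(\mathbf{y};\, \boldsymbol{\mu}_k^{y},\, \boldsymbol{\Sigma}_k^{yy}\right)}
       {\sum_{j=1}^{K} w_j \, \mathcal{N}\!\left(\mathbf{y};\, \boldsymbol{\mu}_j^{y},\, \boldsymbol{\Sigma}_j^{yy}\right)}

% MMSE estimate: posterior-weighted sum of per-component linear regressions
\hat{\mathbf{x}}_{\mathrm{MMSE}} = E[\mathbf{x} \mid \mathbf{y}]
  = \sum_{k=1}^{K} \gamma_k(\mathbf{y})
    \left[ \boldsymbol{\mu}_k^{x}
      + \boldsymbol{\Sigma}_k^{xy} \left(\boldsymbol{\Sigma}_k^{yy}\right)^{-1}
        \left(\mathbf{y} - \boldsymbol{\mu}_k^{y}\right) \right]
```

Under this reading, each mixture component contributes a linear regression from noisy to clean features, and the posterior weights $\gamma_k(\mathbf{y})$ select the components that best explain the received frame.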