HMM-Based Speech Enhancement Using Vector Taylor Series and Parallel Modeling in Mel-Frequency Domain

Author：

Gao, Zhen-zhen | Bao, Chang-chun (Scholars：鲍长春) | Bao, Feng | Jia, Mao-shen

Indexed by：

CPCI-S

Abstract：

Speech enhancement based on the hidden Markov model (HMM) and the minimum mean square error (MMSE) criterion in the Mel-frequency domain is generally regarded as weighted-sum filtering of the noisy speech. The filter weights are often estimated from the HMM of the noisy speech, and estimating the filters usually requires an inverse operation from the Mel-frequency domain to the spectral domain, which often causes spectral distortion. To obtain a more accurate HMM of the noisy speech, the vector Taylor series (VTS) is used to estimate the mean vectors and covariance matrices of the noisy-speech HMM. To reduce the distortion caused by the inverse operation, a parallel Mel-frequency and log-magnitude (PMLM) modeling approach is proposed. In PMLM, the HMMs of clean speech and noise are trained by modeling simultaneously in the Mel-frequency domain and the log-magnitude (LOG-MAG) domain. Experimental results show that, compared with the reference methods, the proposed method achieves better performance across different noise environments and input SNRs.
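
The VTS step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature definition or the expansion order, so the sketch assumes log-Mel filterbank energies, diagonal covariances, no channel term, and the commonly used additive-noise mismatch function y = x + log(1 + exp(n - x)) expanded to first order around the clean-speech and noise means. The function name `vts_combine` and its arguments are hypothetical.

```python
import math


def vts_combine(mu_x, var_x, mu_n, var_n):
    """First-order VTS estimate of noisy-speech Gaussian parameters.

    Assumptions (not taken from the paper): log-Mel features, diagonal
    covariances, and the mismatch function y = x + log(1 + exp(n - x)).
    All arguments are per-channel lists of equal length.
    """
    mu_y, var_y = [], []
    for mx, vx, mn, vn in zip(mu_x, var_x, mu_n, var_n):
        d = mn - mx
        # Numerically stable log(1 + exp(d)).
        softplus = max(d, 0.0) + math.log1p(math.exp(-abs(d)))
        # Jacobian dy/dx at the expansion point: 1 / (1 + exp(d)).
        g = math.exp(-softplus)
        # Zeroth-order term gives the noisy mean.
        mu_y.append(mx + softplus)
        # First-order term propagates the variances; dy/dn = 1 - g.
        var_y.append(g * g * vx + (1.0 - g) ** 2 * vn)
    return mu_y, var_y


# Hypothetical two-channel example: one clean-speech Gaussian, one noise Gaussian.
noisy_mean, noisy_var = vts_combine([1.0, 5.0], [2.0, 1.0], [1.0, 0.0], [4.0, 0.5])
```

In an HMM setting, such a combination would be applied to every pair of clean-speech and noise mixture components to build the noisy-speech model; when the noise mean lies far below the speech mean, the Jacobian approaches 1 and the noisy parameters approach the clean ones.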

Keyword：

speech enhancement; parallel Mel-frequency and log-magnitude modeling; vector Taylor series; HMM

Author Affiliations：

[ 1 ] [Gao, Zhen-zhen]Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
[ 2 ] [Bao, Chang-chun]Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
[ 3 ] [Bao, Feng]Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
[ 4 ] [Jia, Mao-shen]Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing, Peoples R China

Reprint Author's Address：

[Gao, Zhen-zhen]Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing, Peoples R China

Email：

happyzhen123@emails.bjut.edu.cn |
baochch@bjut.edu.cn |
baofeng@emails.bjut.edu.cn |
jiamaoshen@bjut.edu.cn

Related Articles：

A Data-Driven Speech Enhancement Method Based on Modeled Long-Range Temporal Dynamics
2015, 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015)
HMM-Based Cue Parameters Estimation for Speech Enhancement
2016, 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Sparse Hidden Markov Models for Speech Enhancement in Non-Stationary Noise Environments
2015, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
Compressed Domain Speech Enhancement based on the Joint Modification of Codebook Gains
2011, IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

Source ：

2014 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC)

Year： 2014

Page： 733-737

Language： English

Cited Count：

WoS CC Cited Count： 2

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

Faculty of Information Technology
