Speech enhancement methods based on binaural cue coding

Author：

Wang, Xianyun | Bao, Changchun (Scholars：鲍长春)

Indexed by：

EI Scopus SCIE

Abstract：

Following the encoding and decoding mechanism of binaural cue coding (BCC), this paper treats speech and noise as the left-channel and right-channel signals of the BCC framework, respectively. The speech signal is then estimated from noisy speech given the inter-channel level difference (ICLD) and inter-channel correlation (ICC) between speech and noise. Two kinds of inter-channel cues are used for speech restoration: exact cues, extracted from clean speech and noise, and pre-enhanced cues, extracted from pre-enhanced speech and estimated noise. The two kinds of cues are paired one by one to form a codebook. Once the pre-enhanced cues are extracted from noisy speech, the exact cues are estimated by a mapping between the pre-enhanced cues and the a priori codebook. The estimated exact cues are then used to obtain a time-frequency (T-F) mask that enhances the noisy speech based on BCC decoding. In addition, to further improve the accuracy of the cue-based T-F mask, a deep neural network (DNN)-based method is proposed to learn the mapping between input features of noisy speech and the T-F masks. Experimental results show that the codebook-driven method achieves better performance than conventional methods, and that the DNN-based method performs better than the codebook-driven method.
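
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the cue or mask formulas, so the sketch assumes the usual BCC-style definitions (ICLD as a power ratio in dB, ICC as a normalized cross-spectrum), a ratio mask that depends on the ICLD only (which holds when speech and noise are uncorrelated), and a nearest-neighbour codebook lookup. The function names inter_channel_cues, mask_from_icld and lookup_exact_cues are hypothetical.

```python
import numpy as np


def inter_channel_cues(speech_spec, noise_spec, eps=1e-12):
    """Return per-bin ICLD (dB) and ICC for a block of STFT frames.

    speech_spec, noise_spec: complex arrays of shape (frames, bins),
    playing the role of the "left" and "right" BCC channels.
    The definitions below are assumed, not taken from the paper.
    """
    p_s = np.mean(np.abs(speech_spec) ** 2, axis=0)  # speech power per bin
    p_n = np.mean(np.abs(noise_spec) ** 2, axis=0)   # noise power per bin
    icld = 10.0 * np.log10((p_s + eps) / (p_n + eps))
    cross = np.abs(np.mean(speech_spec * np.conj(noise_spec), axis=0))
    icc = cross / np.sqrt((p_s + eps) * (p_n + eps))
    return icld, icc


def mask_from_icld(icld_db):
    """Ratio mask P_s / (P_s + P_n) written in terms of the ICLD.

    Assumption: speech and noise are uncorrelated (ICC close to 0),
    so the noisy power is the sum of the two channel powers.
    """
    return 1.0 / (1.0 + 10.0 ** (-np.asarray(icld_db) / 10.0))


def lookup_exact_cues(pre_cues, codebook_pre, codebook_exact):
    """Map pre-enhanced cues to exact cues via the closest codebook entry.

    codebook_pre and codebook_exact hold paired entries row by row;
    a nearest-neighbour search stands in for the paper's mapping.
    """
    dist = np.sum((codebook_pre - pre_cues) ** 2, axis=1)
    return codebook_exact[np.argmin(dist)]
```

Under these assumptions, an ICLD of 0 dB (equal speech and noise power) gives a mask value of 0.5, and the mask approaches 1 as the ICLD grows. The DNN-based variant in the abstract would replace the codebook lookup with a learned regression from noisy-speech features to the mask.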

Keyword：

Codebook Binaural cue coding Monaural speech enhancement Deep neural network

Author Community：

[ 1 ] [Wang, Xianyun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
[ 2 ] [Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Reprint Author's Address：

鲍长春
[Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Email：

b201402001@emails.bjut.edu.cn

Related Keywords：

Codebook-Based Speech Enhancement Using Markov Process and Speech-presence Probability
2015, 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015)
Multi-channel Speech Enhancement Based on the MVDR Beamformer and Postfilter
2020, 10th IEEE International Conference on Signal Processing, Communications and Computing (IEEE ICSPCC)
Speech Enhancement based on Binaural Cues
2017, 9th Annual Summit and Conference of the Asia-Pacific Signal and Information Processing Association (APSIPA ASC)
Joint Ideal Ratio Mask and Generative Adversarial Networks for Monaural Speech Enhancement
2018, 14th IEEE International Conference on Signal Processing (ICSP)

Source ：

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING

ISSN： 1687-4722

Year： 2019

Issue： 1

Volume： 2019

2.400 (JCR@2022)

ESI Discipline： ENGINEERING;

ESI HC Threshold：136

JCR Journal Grade：3

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 1

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

Affiliated Colleges：

Faculty of Information Technology
