Multi-Channel Speech Coding Combining Spatial Information and Speex Codec

Author：

Ji, Qiang | Bao, Changchun (Scholars：鲍长春)

Indexed by：

CPCI-S EI Scopus

Abstract：

Multi-channel speech coding is an indispensable technology in the field of multi-input multi-output (MIMO) speech interaction. With the development of microphone array signal processing, multi-channel speech coding has received increasing attention. In this paper, a multi-channel speech coding method based on a linear microphone array is proposed, which combines the advantages of a source speech codec with the spatial information of the microphone array. At the encoder, the open-source Speex encoder is employed to encode the speech signal from the reference channel. The Inter-Channel Level Difference (ICLD) and Inter-Channel Time Difference (ICTD) are used as the spatial information of the speech source and are coded together. To account for human auditory characteristics, the ICLD and ICTD are extracted in each sub-band produced by a Gammatone filter. At the decoder, the decoded speech signal of the reference channel and the decoded ICLD and ICTD are used to reconstruct the speech signals of all channels. According to objective evaluation scores, the speech reconstructed with this approach shows higher perceptual quality than that of the classical methods. Moreover, the experimental results confirm that the proposed method can reduce the bit rate while preserving speech quality and spatial information as much as possible.
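
The two spatial cues named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`icld_db`, `ictd_samples`, `reconstruct`) are hypothetical, the Gammatone sub-band split and the Speex coding of the reference channel are omitted, and the inputs are assumed to be one frame of a single sub-band. The ICLD is taken here as an energy ratio in dB and the ICTD as the lag that maximizes the cross-correlation, which are common definitions but are assumptions about the paper's exact formulas.

```python
import numpy as np

def icld_db(ref, ch, eps=1e-12):
    """Level difference in dB: energy of `ch` relative to `ref`."""
    e_ref = np.sum(np.asarray(ref, dtype=float) ** 2)
    e_ch = np.sum(np.asarray(ch, dtype=float) ** 2)
    return 10.0 * np.log10((e_ch + eps) / (e_ref + eps))

def ictd_samples(ref, ch, max_lag):
    """Lag in samples that maximizes the cross-correlation.

    A positive value means `ch` is delayed relative to `ref`.
    """
    ref = np.asarray(ref, dtype=float)
    ch = np.asarray(ch, dtype=float)
    xcorr = np.correlate(ch, ref, mode="full")
    lags = np.arange(-(len(ref) - 1), len(ch))
    keep = np.abs(lags) <= max_lag          # restrict to plausible delays
    return int(lags[keep][np.argmax(xcorr[keep])])

def reconstruct(ref, icld, ictd):
    """Approximate a channel by delaying and scaling the reference frame."""
    ref = np.asarray(ref, dtype=float)
    gain = 10.0 ** (icld / 20.0)            # dB energy ratio -> amplitude gain
    out = np.zeros(len(ref))
    if ictd >= 0:
        out[ictd:] = ref[:len(ref) - ictd]  # delay, zero-padded at the start
    else:
        out[:ictd] = ref[-ictd:]            # advance, zero-padded at the end
    return gain * out
```

In a full system along the lines the abstract describes, these cues would be computed per sub-band and per frame, quantized, and transmitted alongside the coded reference channel; the sketch only shows the cue definitions and a simple delay-and-scale synthesis.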

Keyword：

Speex codec; Gammatone filter; microphone array; spatial perception cues; speech coding

Author Community：

[ 1 ] [Ji, Qiang]Beijing Univ Technol, Speech & Audio Signal Proc Lab, Fac Informat Technol, Beijing, Peoples R China
[ 2 ] [Bao, Changchun]Beijing Univ Technol, Speech & Audio Signal Proc Lab, Fac Informat Technol, Beijing, Peoples R China

Reprint Author's Address：

鲍长春
[Bao, Changchun]Beijing Univ Technol, Speech & Audio Signal Proc Lab, Fac Informat Technol, Beijing, Peoples R China

Email：

jigiang@emails.bjut.edu.cn |
baochch@bjut.edu.cn

Related Articles：

MASS: Microphone Array Speech Simulator in Room Acoustic Environment for Multi-Channel Speech Coding and Enhancement
2020，APPLIED SCIENCES-BASEL
A Review of Speech Coding
1998，通信学报
A New Parametric Coding Method Combined Linear Microphone Array Topology
2022，DCC 2022: 2022 DATA COMPRESSION CONFERENCE (DCC)
DNN-based Multi-Channel Speech Coding Employing Sound Localization
2022，DCC 2022: 2022 DATA COMPRESSION CONFERENCE (DCC)

Source ：

PROCEEDINGS OF 2020 IEEE 15TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2020)

ISSN： 2164-5221

Year： 2020

Page： 136-140

Language： English

Cited Count：

WoS CC Cited Count： 1

SCOPUS Cited Count： 4

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 4

Affiliated Colleges：

Faculty of Information Technology (信息学部)
