Abstract:
Multi-channel speech coding is an indispensable technology in the field of multi-input multi-output (MIMO) speech interaction. With the development of microphone array signal processing, multi-channel speech coding has received increasing attention. In this paper, a multi-channel speech coding method based on a linear microphone array is proposed, which combines the advantages of a source speech codec with the spatial information of the microphone array. At the encoder, the open-source Speex encoder is employed to encode the speech signal from the reference channel. The Inter-Channel Level Difference (ICLD) and Inter-Channel Time Difference (ICTD) are used as the spatial information of the speech source and are coded together. To account for the characteristics of human hearing, the ICLD and ICTD are extracted in each sub-band of a Gammatone filter bank. At the decoder, the decoded speech signal of the reference channel and the decoded ICLD and ICTD are used to reconstruct the speech signals of all channels. According to objective evaluation scores, speech reconstructed with this approach shows higher perceptual quality than speech reconstructed with classical methods. Moreover, the experimental results confirm that the proposed method can reduce the bit rate while preserving speech quality and spatial information as much as possible.
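The two spatial cues named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it works on the full band rather than on Gammatone sub-bands, it omits the Speex coding of the reference channel and any quantization of the cues, and the function names (`icld_db`, `ictd_samples`, `reconstruct`) are hypothetical. It assumes ICLD is taken as an energy ratio in dB and ICTD as the lag that maximises the cross-correlation, which are common definitions but are not confirmed by the abstract.

```python
import numpy as np


def icld_db(ref, ch, eps=1e-12):
    """Level difference in dB of `ch` relative to `ref` (assumed: energy ratio)."""
    return 10.0 * np.log10((np.sum(ch ** 2) + eps) / (np.sum(ref ** 2) + eps))


def ictd_samples(ref, ch, max_lag):
    """Lag in samples that maximises the cross-correlation of `ref` and `ch`.

    A positive result means `ch` is a delayed copy of `ref`.
    """
    n = len(ref)
    best_lag, best_val = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            val = np.dot(ref[:n - lag], ch[lag:])   # sum of ref[k] * ch[k + lag]
        else:
            val = np.dot(ref[-lag:], ch[:n + lag])
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def reconstruct(ref, icld, ictd):
    """Decoder-side sketch: rebuild a channel from the reference and the two cues."""
    gain = 10.0 ** (icld / 20.0)        # dB energy ratio -> amplitude gain
    out = np.zeros_like(ref)
    if ictd >= 0:
        out[ictd:] = ref[:len(ref) - ictd]
    else:
        out[:ictd] = ref[-ictd:]
    return gain * out


# Toy example: the second channel is the reference, attenuated and delayed.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1000)
ch = 0.5 * np.concatenate([np.zeros(5), ref[:-5]])

icld = icld_db(ref, ch)
ictd = ictd_samples(ref, ch, max_lag=20)
ch_hat = reconstruct(ref, icld, ictd)
```

In a sub-band version of this sketch, the same two functions would be applied to each filter-bank output separately, so that only the reference channel and one (ICLD, ICTD) pair per sub-band and channel need to be transmitted.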
Source:
PROCEEDINGS OF 2020 IEEE 15TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2020)
ISSN: 2164-5221
Year: 2020
Page: 136-140
Language: English
Cited Count:
WoS CC Cited Count: 1
SCOPUS Cited Count: 3
ESI Highly Cited Papers on the List: 0