Indexed in:
Abstract:
Multi-channel speech coding is an indispensable technology for multi-input multi-output (MIMO) speech interaction, and it has attracted growing attention as microphone array signal processing has advanced. This paper proposes a multi-channel speech coding method for linear microphone arrays that combines the advantages of a source speech codec with the spatial information captured by the array. At the encoder, the open-source Speex encoder encodes the speech signal from the reference channel. The Inter-Channel Level Difference (ICLD) and Inter-Channel Time Difference (ICTD) serve as the spatial information of the speech source and are coded together with it. To account for the characteristics of human hearing, the ICLD and ICTD are extracted in each sub-band of a Gammatone filter bank. At the decoder, the decoded reference-channel speech and the decoded ICLD and ICTD are used to reconstruct the speech signals of all channels. According to objective evaluation scores, speech reconstructed with this approach shows higher perceptual quality than classical methods. Moreover, the experimental results confirm that the proposed method can reduce the bit rate while largely preserving speech quality and spatial information.
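The role of the two spatial cues can be illustrated with a minimal sketch. It is not the paper's implementation: it works on the full band of a single frame instead of Gammatone sub-bands, it omits the Speex coding of the reference channel and any quantization of the cues, and the function names `icld_db`, `ictd_samples`, and `reconstruct` are hypothetical. It assumes the ICLD is an energy ratio in dB and the ICTD is the lag that maximizes the cross-correlation with the reference channel.

```python
import numpy as np


def icld_db(ref, ch, eps=1e-12):
    """Level difference in dB: energy of `ch` relative to `ref` (assumed definition)."""
    return 10.0 * np.log10((np.sum(ch ** 2) + eps) / (np.sum(ref ** 2) + eps))


def ictd_samples(ref, ch, max_lag):
    """Lag in samples that maximizes the cross-correlation.

    A positive value means `ch` is delayed relative to `ref`.
    """
    full = np.correlate(ch, ref, mode="full")  # zero lag sits at index len(ref) - 1
    zero = len(ref) - 1
    window = full[zero - max_lag: zero + max_lag + 1]
    lags = np.arange(-max_lag, max_lag + 1)
    return int(lags[np.argmax(window)])


def reconstruct(ref, level_db, delay):
    """Rebuild a channel from the reference by applying the delay and the gain."""
    gain = 10.0 ** (level_db / 20.0)  # dB energy ratio -> amplitude gain
    out = np.zeros_like(ref)
    if delay >= 0:
        out[delay:] = ref[:len(ref) - delay]
    else:
        out[:delay] = ref[-delay:]
    return gain * out


# Synthetic example: the second channel is the reference, delayed and attenuated.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
ch = np.zeros_like(ref)
ch[3:] = 0.5 * ref[:-3]

level = icld_db(ref, ch)
delay = ictd_samples(ref, ch, max_lag=16)
approx = reconstruct(ref, level, delay)
```

In this toy setup only two numbers per channel (a level and a delay) are needed in addition to the reference signal, which is the intuition behind the bit-rate saving described in the abstract.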
Keywords:
Corresponding author information: