
Authors:

Jia, Maoshen | Sun, Jundai | Zheng, Xiguang

Indexed in:

Scopus SCIE

Abstract:

This work proposes a method for separating multiple speech sources using inter-channel correlation and relaxed sparsity. A B-format microphone with four spatially located channels is adopted because the size of the microphone array allows the spatial parameter integrity of the original signal to be preserved. First, we measure the proportion of overlapped components among multiple sources and find that many time-frequency (TF) components overlap as the number of sources increases. Then, considering the relaxed sparsity of speech sources, we propose a dynamic threshold-based approach for separating sparse components, in which the threshold is determined by the inter-channel correlation among the recorded signals. After a statistical analysis of the number of active sources at each TF instant, we propose a form of relaxed sparsity called the half-K assumption, under which the number of active sources in a given TF bin does not exceed half the total number of simultaneously occurring sources. Under the half-K assumption, the non-sparse components are recovered by using the extracted sparse components as a guide, combined with vector decomposition and matrix factorization. Finally, the TF coefficients of each source are obtained by synthesizing the sparse and non-sparse components. The proposed method was evaluated with up to six simultaneous speech sources under both anechoic and reverberant conditions. Both objective and subjective evaluations showed that the perceptual quality of the speech separated by the proposed approach exceeds that of existing blind source separation (BSS) approaches. In addition, the method is robust to different speech signals and yields similar perceptual quality across all separated signals.
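
The overlap measurement and the half-K condition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the rule that a source counts as "active" when its magnitude lies within a fixed number of decibels of the strongest source in the bin, and the default of -20 dB are all assumptions made for illustration. The input is assumed to be the magnitude spectrograms of the individual sources before mixing.

```python
import numpy as np


def active_source_count(mags, rel_threshold_db=-20.0, floor=1e-8):
    """Count how many sources are active in each TF bin.

    mags: array of shape (K, F, T) holding the magnitude spectrogram of
    each of the K sources. A source is treated as active in a bin when
    its magnitude is within rel_threshold_db of the strongest source in
    that bin and above an absolute floor (assumed rule, not from the paper).
    """
    mags = np.asarray(mags, dtype=float)
    peak = mags.max(axis=0)                      # strongest source per bin
    ratio = 10.0 ** (rel_threshold_db / 20.0)    # dB to linear amplitude
    active = (mags >= ratio * peak) & (mags > floor)
    return active.sum(axis=0)                    # shape (F, T)


def sparsity_stats(mags, **kwargs):
    """Summarise overlap and half-K compliance over non-silent bins."""
    mags = np.asarray(mags, dtype=float)
    k = mags.shape[0]
    counts = active_source_count(mags, **kwargs)
    occupied = counts > 0                        # ignore silent bins
    if not occupied.any():
        return {"overlap_ratio": 0.0, "half_k_ratio": 1.0}
    c = counts[occupied]
    return {
        # share of non-silent bins in which more than one source is active
        "overlap_ratio": float((c > 1).mean()),
        # share of non-silent bins with at most K/2 active sources
        "half_k_ratio": float((c <= k / 2).mean()),
    }
```

Calling `sparsity_stats(mags)` on an array of shape (K, F, T) returns the two ratios as a dictionary. A `half_k_ratio` close to 1 would indicate that the half-K condition holds for nearly all non-silent bins under the assumed activity rule; the result depends on the chosen threshold.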

Keywords:

sparsity; multiple speech source separation; B-format microphone

Author affiliations:

  • [ 1 ] [Jia, Maoshen]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
  • [ 2 ] [Sun, Jundai]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China
  • [ 3 ] [Zheng, Xiguang]Univ Wollongong, Fac Engn & Informat Sci, Wollongong, NSW 2522, Australia
  • [ 4 ] [Jia, Maoshen]Beijing Univ Technol, 100 Pingleyuan, Beijing, Peoples R China
  • [ 5 ] [Sun, Jundai]Beijing Univ Technol, 100 Pingleyuan, Beijing, Peoples R China

Corresponding author:

  • [Jia, Maoshen]Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China; [Jia, Maoshen]Beijing Univ Technol, 100 Pingleyuan, Beijing, Peoples R China

Source:

APPLIED SCIENCES-BASEL

ISSN: 2076-3417

Year: 2018

Issue: 1

Volume: 8

2.700 (JCR@2022)

ESI subject: ENGINEERING

ESI highly cited threshold: 156

JCR quartile: 3

Citation counts:

WoS Core Collection citations: 1

Scopus citations: 2

ESI Highly Cited Papers listing: 0
