Author:

Jia, Maoshen | Wu, Yuxuan | Bao, Changchun (鲍长春) | Ritz, Christian

Indexed by:

EI Scopus SCIE

Abstract:

In this article, the direction of arrival (DOA) estimation of multiple speech sources in reverberant environments is investigated based on the recording of a soundfield microphone. First, the recordings are analyzed in the time-frequency (T-F) domain to detect both "points" (single T-F points) and "regions" (multiple, adjacent T-F points) corresponding to a single source with low reverberation (known as low-reverberant-single-source (LRSS) points). Then, an LRSS point detection algorithm is proposed based on a joint dominance measure and instantaneous single-source point (SSP) identification. Following this, initial DOA estimates obtained for the detected LRSS points are analyzed using a Gaussian Mixture Model (GMM) derived by the Expectation-Maximization (EM) algorithm to cluster components into sources or outliers using a rule-based method. Finally, the DOA of each actual source is obtained from the estimated source components. Experiments on both simulated data and data recorded in an actual acoustic chamber demonstrate that the proposed algorithm exhibits improved performance for DOA estimation in reverberant environments when compared to several existing approaches.
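
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the rule-based method, so the rule used here (keep components that are both heavy and narrow) and its thresholds `min_weight` and `max_std` are assumptions, as are the function names and the synthetic data. The sketch also treats angles as points on a line and ignores wraparound at 0/360 degrees.

```python
import math
import random


def fit_gmm_1d(samples, k, iters=200, min_var=1.0):
    """Fit a k-component 1-D Gaussian mixture to `samples` with plain EM."""
    ordered = sorted(samples)
    n = len(ordered)
    # Initialise the means at evenly spaced quantiles (deterministic start).
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    variances = [100.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            dens = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            if total == 0.0:
                resp.append([1.0 / k] * k)  # numerical underflow: share equally
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component received no mass; leave it unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, samples)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, samples)) / nj
            variances[j] = max(var, min_var)  # floor avoids collapsing components
    return weights, means, variances


def select_sources(weights, means, variances, min_weight=0.2, max_std=10.0):
    """Hypothetical rule: a component counts as a source if it is heavy and narrow."""
    kept = [
        m
        for w, m, v in zip(weights, means, variances)
        if w >= min_weight and math.sqrt(v) <= max_std
    ]
    return sorted(kept)


# Synthetic initial DOA estimates in degrees (assumed data, not from the paper):
# two sources near 60 and 150 degrees plus uniformly spread outliers.
rng = random.Random(0)
estimates = (
    [rng.gauss(60.0, 3.0) for _ in range(200)]
    + [rng.gauss(150.0, 3.0) for _ in range(200)]
    + [rng.uniform(0.0, 360.0) for _ in range(40)]
)
weights, means, variances = fit_gmm_1d(estimates, k=3)
sources = select_sources(weights, means, variances)
print([round(s, 1) for s in sources])
```

In this sketch the mixture is given one more component than the number of sources, so the extra component can absorb the widely scattered estimates and be rejected by the rule. A practical system would need a circular distribution or angle unwrapping for sources near 0/360 degrees.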

Keyword:

LRSS point; Reverberation; Reflection; reverberant environments; Speech processing; DOA estimation; Microphone arrays; Time-frequency analysis; Estimation; Direction-of-arrival estimation

Author Community:

  • [ 1 ] [Jia, Maoshen]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 2 ] [Wu, Yuxuan]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 3 ] [Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 4 ] [Ritz, Christian]Univ Wollongong, Sch Elect Comp & Telecommun Engn, Wollongong, NSW 2500, Australia

Reprint Author's Address:

  • 鲍长春

    [Jia, Maoshen]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China; [Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

ISSN: 2329-9290

Year: 2021

Volume: 29

Page: 379-392

Impact Factor: 5.400 (JCR@2022)

ESI Discipline: ENGINEERING

ESI HC Threshold: 87

JCR Journal Grade: 1

Cited Count:

WoS CC Cited Count: 17

SCOPUS Cited Count: 21

ESI Highly Cited Papers on the List: 0
