Author:

Yang, Yan | Bao, Changchun (Scholars: 鲍长春)

Indexed by:

EI Scopus SCIE

Abstract:

This paper proposes a novel convolutional auto-encoder (CAE) structure that takes the temporal correlation of speech features into account. In this structure, the historical output of the CAE is fed recurrently into a CAE stack; we name it the Recurrent Stack Convolutional Auto-Encoder (RS-CAE). In the training stage, the input feature maps of the RS-CAE comprise the log power spectrum (LPS) of the noisy speech and an additional feature map derived from the LPS of previously enhanced speech. In this way, the RS-CAE incorporates as much temporal correlation as possible. The training target is a concatenated vector of the auto-regressive (AR) model parameters of speech and noise. In the online stage, the LPS of the noisy speech and the LPS of previously enhanced speech together make up the input feature maps. The outputs of the RS-CAE are the AR model parameters of speech and noise, which are used to construct the AR-Wiener filter. Because the estimated AR model parameters are not completely accurate and some harmonics may be lost in the enhanced speech, a codebook-based harmonic recovery technique is proposed to reconstruct the harmonic structure of the enhanced speech. Test results confirm that the proposed method achieves better performance than some existing approaches.
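
The AR-Wiener filter mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common all-pole form, in which each AR model with coefficients a_1..a_p and excitation gain g has power spectrum g^2 / |1 + sum_k a_k e^{-jwk}|^2, and the Wiener gain is the speech spectrum divided by the sum of the speech and noise spectra. The function names, the gain parameters, and the bin count are hypothetical; the abstract does not say how the paper parameterizes the AR models or the filter.

```python
import cmath
import math


def ar_power_spectrum(ar_coeffs, gain, n_bins=257):
    """Power spectrum of an all-pole (AR) model on n_bins frequencies in [0, pi].

    Assumes A(z) = 1 + sum_k a_k z^{-k}, so P(w) = gain^2 / |A(e^{jw})|^2.
    """
    spectrum = []
    for k in range(n_bins):
        omega = math.pi * k / (n_bins - 1)
        a_of_omega = 1.0 + sum(
            a * cmath.exp(-1j * omega * (i + 1)) for i, a in enumerate(ar_coeffs)
        )
        # Floor the denominator to avoid division by zero near a pole.
        spectrum.append(gain ** 2 / max(abs(a_of_omega) ** 2, 1e-12))
    return spectrum


def ar_wiener_gain(speech_ar, speech_gain, noise_ar, noise_gain, n_bins=257):
    """Per-frequency Wiener gain H(w) = P_speech(w) / (P_speech(w) + P_noise(w))."""
    p_speech = ar_power_spectrum(speech_ar, speech_gain, n_bins)
    p_noise = ar_power_spectrum(noise_ar, noise_gain, n_bins)
    return [ps / (ps + pn) for ps, pn in zip(p_speech, p_noise)]


# Illustrative values only: a low-pass "speech" model and a weak white-noise model.
gains = ar_wiener_gain([-0.9], 1.0, [], 0.1)
```

In a system like the one described, the two AR parameter sets would come from the network output for each frame, and the resulting gain would be applied to the noisy spectrum of that frame.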

Keyword:

Speech enhancement; codebook-based harmonic recovery; RS-CAE; temporal correlation

Author Community:

  • [ 1 ] [Yang, Yan]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
  • [ 2 ] [Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Reprint Author's Address:

  • 鲍长春

    [Bao, Changchun]Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

ISSN: 2329-9290

Year: 2019

Issue: 11

Volume: 27

Page: 1752-1762

JCR@2022: 5.400

ESI Discipline: ENGINEERING

ESI HC Threshold: 136

JCR Journal Grade: 1

Cited Count:

WoS CC Cited Count: 6

SCOPUS Cited Count: 11

ESI Highly Cited Papers on the List: 0
