Abstract:
Deep learning techniques have recently driven significant progress in speech enhancement. In this paper, we propose a novel speech enhancement framework based on long short-term memory networks (LSTMs) and conditional generative adversarial networks (cGANs). The framework consists of a generator (G) and a discriminator (D). Both G and D are LSTMs, which makes our method better suited to the speech enhancement task than previous deep neural network-based methods. In this study, we first apply the framework to map the noisy log-power spectra (LPS) input to the LPS of clean speech. In addition, the framework is used to estimate the ideal Wiener filter given noisy cepstral input. Experimental results indicate that our strategy not only improves the quality and intelligibility of noisy speech, but is also competitive with other deep learning-based approaches. © 2018 IEEE.
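The two training targets named in the abstract, the clean-speech LPS and the ideal Wiener filter, can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the frame length and hop size, the Hann window, and the textbook Wiener gain S / (S + N) computed from separately available clean and noise signals. The paper's own feature extraction (including its cepstral input) and network code may differ.

```python
import numpy as np

def stft_power(x, frame_len=512, hop=256):
    """Power spectrogram of a 1-D signal, shape (frames, frame_len // 2 + 1).

    Frame length, hop size and the Hann window are illustrative choices.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def log_power_spectrum(x, eps=1e-10, **kw):
    """Log-power spectrum (LPS); eps avoids log(0) on silent bins."""
    return np.log(stft_power(x, **kw) + eps)

def ideal_wiener_gain(clean, noise, eps=1e-10, **kw):
    """Textbook ideal Wiener gain per time-frequency bin: S / (S + N)."""
    ps = stft_power(clean, **kw)
    pn = stft_power(noise, **kw)
    return ps / (ps + pn + eps)

# Synthetic example: a 440 Hz tone as "clean speech" plus white noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noise = 0.3 * rng.standard_normal(t.shape)
noisy = clean + noise

noisy_lps = log_power_spectrum(noisy)   # network input in the LPS setting
clean_lps = log_power_spectrum(clean)   # regression target in the LPS setting
gain = ideal_wiener_gain(clean, noise)  # mask target in the Wiener setting
```

In this sketch the gain lies between 0 and 1 in every bin, so an estimated gain could be applied multiplicatively to the noisy power spectrum, whereas the LPS target is regressed directly.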