Abstract:
Speech enhancement is the task of improving perceptual aspects of noisy speech. Recently, generative adversarial networks (GANs) have become a popular deep learning method, and various GAN structures have been proposed [1], [2]. In this paper, we propose a new GAN-based framework for speech enhancement. We train two models: a generative model G and a discriminative model D, both defined as feedforward multilayer perceptrons (MLPs) [3]. The two differ in how they produce their output: the generator G employs a deep neural network (DNN) based on a masking technique, in which the magnitude spectra of the noise and of the clean speech are estimated simultaneously from noisy speech features, whereas the discriminator D uses the MLP structure to directly predict the clean speech magnitude spectrum. D discriminates between data that come from clean speech and data generated by G. In our work, the G network is the one used to perform speech enhancement. Objective evaluation and experimental results show that the proposed framework significantly improves on the performance of traditional deep neural network (DNN) and recent GAN-based speech enhancement methods. © 2018 IEEE.
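The masking step attributed to the generator can be illustrated with a minimal sketch. The abstract does not state which mask is used, so the code below assumes a soft ratio mask, a common choice when speech and noise magnitudes are estimated jointly. The function name `ratio_mask_enhance`, its parameters, and the constant `EPS` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

EPS = 1e-8  # assumed small constant; avoids division by zero in silent bins


def ratio_mask_enhance(
    noisy_mag: Sequence[float],
    speech_est: Sequence[float],
    noise_est: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Apply an (assumed) soft ratio mask to one frame of noisy magnitudes.

    speech_est and noise_est stand in for the two magnitude estimates that
    a generator network would output for the same frame.
    """
    if not (len(noisy_mag) == len(speech_est) == len(noise_est)):
        raise ValueError("all inputs must have the same number of frequency bins")

    enhanced_speech: List[float] = []
    enhanced_noise: List[float] = []
    for y, s, n in zip(noisy_mag, speech_est, noise_est):
        s, n = abs(s), abs(n)          # magnitudes are non-negative
        mask = s / (s + n + EPS)       # fraction of the bin attributed to speech
        enhanced_speech.append(mask * y)
        enhanced_noise.append((1.0 - mask) * y)
    return enhanced_speech, enhanced_noise


if __name__ == "__main__":
    speech, noise = ratio_mask_enhance([2.0, 4.0], [3.0, 0.0], [1.0, 5.0])
    print(speech, noise)
```

With this form of mask, the two outputs always sum to the noisy magnitude in each frequency bin, so the network only has to decide how to split the observed energy between speech and noise.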