Abstract:
A discriminative deep belief network (DDBN) based on the Fisher criterion is used to compute the super-vector feature space of speech signals. The network extracts a speaker feature codebook that is better suited to multi-speaker clustering and segmentation than the codebook produced by the traditional deep belief network (DBN) algorithm. Evaluations on a multi-speaker audio stream corpus generated from the TIMIT database show that the speaker segmentation algorithm based on the DDBN with the Fisher criterion outperforms both the traditional Bayesian information criterion (BIC) method and the DBN method.
Source:
Journal of Tsinghua University
ISSN: 1000-0054
Year: 2013
Issue: 6
Volume: 53
Pages: 804-807, 812