Abstract:
Sparse deep neural networks (SDNNs) are proposed for speaker segmentation. First, the SDNNs are trained using side information, namely the class label of each input. Next, the SDNNs extract speaker-specific features from the super-vector feature of the speech signal. Finally, K-means clustering assigns a label to each speech frame, and these labels are used to segment the different speakers in a continuous speech stream. A performance evaluation on a multi-speaker speech stream corpus generated from the TIMIT database shows that the proposed speaker segmentation algorithm outperforms both the Bayesian information criterion method and the deep auto-encoder network method.
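The final clustering step can be illustrated with a minimal sketch. It is not the authors' implementation: the SDNN feature extractor is replaced by synthetic 8-dimensional features for two well-separated "speakers", the helper names `kmeans` and `change_points` are hypothetical, and the farthest-first initialisation is an assumption chosen only to keep the example deterministic.

```python
import numpy as np

def kmeans(features, k, n_iter=50):
    """Plain Lloyd's k-means; returns one cluster label per frame."""
    # Farthest-first initialisation (an assumption, chosen for determinism):
    # start from frame 0, then add the frame farthest from the chosen centers.
    centers = [features[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign every frame to its nearest center.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def change_points(labels):
    """Frame indices where the cluster label differs from the previous frame."""
    return (np.flatnonzero(np.diff(labels)) + 1).tolist()

# Synthetic stand-in for speaker-specific features (not SDNN output):
# 40 frames from one speaker followed by 60 frames from another.
rng = np.random.default_rng(1)
speaker_a = rng.normal(loc=0.0, scale=0.1, size=(40, 8))
speaker_b = rng.normal(loc=5.0, scale=0.1, size=(60, 8))
features = np.vstack([speaker_a, speaker_b])

labels = kmeans(features, k=2)
boundaries = change_points(labels)
print(boundaries)  # frame index of the speaker change
```

With real speech the per-frame labels are noisier than in this toy case, so some smoothing of the label sequence would likely be needed before reading off boundaries.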
Source:
ELECTRONICS LETTERS
ISSN: 0013-5194
Year: 2015
Issue: 8
Volume: 51
Impact Factor: 1.100 (JCR@2022)
ESI Discipline: ENGINEERING
ESI HC Threshold: 174
JCR Journal Grade: 3
CAS Journal Grade: 4
Cited Count:
WoS CC Cited Count: 1
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0