Sparse DNN-based speaker segmentation using side information

Authors:

Ma, Yong | Bao, Chang-Chun (鲍长春)

Indexed by:

EI, Scopus, SCIE

Abstract:

Sparse deep neural networks (SDNNs) for speaker segmentation are proposed. First, the SDNNs are trained using the side information that is the class label of the input. Then, speaker-specific features are extracted from the super-vector feature of the speech signal by the SDNNs. Lastly, the label of each speech frame is obtained by K-means clustering, which is used to segment different speakers of a continuous speech stream. The performance evaluation using the multi-speaker speech stream corpus generated from the TIMIT database shows that the proposed speaker segmentation algorithm outperforms the Bayesian information criterion method and the deep auto-encoder networks method.
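
The last step of the abstract (per-frame labels from K-means, then segmentation of the stream) can be illustrated with a minimal sketch. This is not the authors' implementation: the SDNN training and feature extraction are not reproduced, and toy 2-D vectors stand in for the speaker-specific features. The function names `kmeans_labels` and `change_points`, the choice of two clusters, and the farthest-point initialisation are all assumptions made for illustration.

```python
from math import dist

def kmeans_labels(frames, k=2, iters=20):
    """Cluster frame feature vectors with plain K-means; return one label per frame."""
    # Assumed deterministic initialisation: start from the first frame, then
    # repeatedly add the frame farthest from the centroids chosen so far.
    centroids = [list(frames[0])]
    while len(centroids) < k:
        far = max(frames, key=lambda f: min(dist(f, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assignment step: each frame takes the label of its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(f, centroids[j])) for f in frames]
        # Update step: each centroid moves to the mean of its member frames.
        for j in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def change_points(labels):
    """Frame indices where the cluster label differs from the previous frame."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]

# Toy stand-in features (hypothetical): speaker A, then speaker B, then A again.
frames = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
          (0.1, 0.0), (0.0, 0.2)]
labels = kmeans_labels(frames, k=2)
boundaries = change_points(labels)
print(labels)      # [0, 0, 0, 1, 1, 1, 0, 0]
print(boundaries)  # [3, 6]
```

In this sketch a speaker change is declared wherever consecutive frame labels differ. A real system would likely smooth the label sequence first so that isolated misclustered frames do not create spurious boundaries.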

Author Affiliations:

[ 1 ] [Ma, Yong]Beijing Univ Technol, Speech & Audio Signal Proc Lab, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
[ 2 ] [Bao, Chang-Chun]Beijing Univ Technol, Speech & Audio Signal Proc Lab, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China
[ 3 ] [Ma, Yong]Jiangsu Normal Univ, Sch Phys & Elect Engn, Xuzhou, Peoples R China

Reprint Author's Address:

鲍长春
[Bao, Chang-Chun]Beijing Univ Technol, Speech & Audio Signal Proc Lab, Sch Elect Informat & Control Engn, Beijing 100124, Peoples R China

Email:

baochch@bjut.edu.cn

Source:

ELECTRONICS LETTERS

ISSN: 0013-5194

Year: 2015

Issue: 8

Volume: 51

Impact Factor: 1.100 (JCR@2022)

ESI Discipline: ENGINEERING

ESI HC Threshold: 174

JCR Journal Grade: 3

CAS Journal Grade: 4

Cited Count:

WoS CC Cited Count: 1

SCOPUS Cited Count: 1

ESI Highly Cited Papers on the List: 0

30-Day Page Views: 0

Affiliated Colleges:

Faculty of Information Technology
