Super resolution pitch detection based on band-partitioning spectral entropy and signal decomposition in DCT domain

Author：

Luo, Ya-Fei | Bao, Chang-Chun (Scholars：鲍长春)

Indexed by：

EI Scopus PKU CSCD

Abstract：

This paper studies pitch detection techniques for low-rate waveform interpolation (WI) speech coding. Because pitch doubling and halving errors often occur under varying noise types and signal-to-noise ratios (SNR), a voice activity detection (VAD) algorithm based on DCT band-partitioning spectral entropy is used in pre-processing to separate speech from non-speech segments. To supply the pitch detection algorithm with speech that has an accurate pitch cycle, an improved speech decomposition algorithm in the DCT domain, based on the Harmonic-Noise Model, is presented. Then, exploiting the shared characteristic of the maximum peaks of MCAMDF and NCCF together with the two pre-processing techniques above, a pitch detection algorithm that combines the two functions, named MCAMDF-NCCF, is proposed. To meet the pitch accuracy requirements of the WI coder and to synthesize the phase track correctly, a super resolution pitch detection algorithm named MCAMDF-NCCF-FRAC, built on MCAMDF-NCCF, is also given to obtain fractional pitch. The algorithms were applied to a WI coder. Subjective A/B listening tests indicated that both algorithms perform well and substantially reduce pitch doubling, pitch halving, and voiced/unvoiced errors at low SNR, and that the quality of the synthesized speech fully meets the pitch detection accuracy requirements of the WI coder.
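
The idea of a fractional ("super resolution") pitch estimate can be illustrated with a minimal Python sketch. It is not the MCAMDF-NCCF-FRAC algorithm from the paper, whose details are not given in this record. It assumes a plain normalized cross-correlation function (NCCF) searched over integer lags, followed by parabolic interpolation around the peak, which is a generic, swapped-in refinement technique. The function names `nccf` and `fractional_pitch` and the parameters `f_min` and `f_max` are hypothetical.

```python
import math


def nccf(frame, lag, window):
    """Normalized cross-correlation of frame[0:window] with frame[lag:lag+window]."""
    a = frame[:window]
    b = frame[lag:lag + window]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def fractional_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Return (pitch period in samples, NCCF value at the best integer lag).

    Assumption: the period is refined by fitting a parabola through the NCCF
    values at the best integer lag and its two neighbours. This is a generic
    technique, not the method described in the abstract.
    """
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    # Leave room for the largest lag plus one neighbour on the right.
    window = len(frame) - lag_max - 1
    if window <= 0 or lag_min < 2:
        raise ValueError("frame too short or pitch range unsuitable")

    # Evaluate one extra lag on each side so the peak always has neighbours.
    scores = {lag: nccf(frame, lag, window)
              for lag in range(lag_min - 1, lag_max + 2)}
    best = max(range(lag_min, lag_max + 1), key=lambda k: scores[k])

    y0, y1, y2 = scores[best - 1], scores[best], scores[best + 1]
    curvature = y0 - 2.0 * y1 + y2
    # Only interpolate when the three points form a proper local maximum.
    offset = 0.5 * (y0 - y2) / curvature if curvature < 0.0 else 0.0
    return best + offset, y1
```

A short usage example on a synthetic sinusoid whose period (50.5 samples) falls between two integer lags:

```python
fs = 8000
frame = [math.sin(2 * math.pi * n / 50.5) for n in range(400)]
period, peak = fractional_pitch(frame, fs, f_min=100.0, f_max=400.0)
print(period, fs / period)  # period in samples and the corresponding F0 in Hz
```

Note that the search range in this example is deliberately capped below two periods. With a wider range, a plain NCCF peak picker can lock onto the lag at twice the true period, which is the pitch halving error the abstract refers to.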

Keyword：

Speech processing; Speech; Signal to noise ratio; Discrete cosine transforms; Speech coding; Algorithms; Computational complexity

Author Community：

[ 1 ] [Luo, Ya-Fei] School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100022, China
[ 2 ] [Bao, Chang-Chun] School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100022, China

Reprint Author's Address：

Email：

luoyafei@emails.bjut.edu.cn

Related Keywords：

A method for voiced/unvoiced/silence classification of speech with noise using SVM
2006，Acta Electronica Sinica
8-64kbit/s super-wideband embedded speech and audio coding algorithm
2009，Journal on Communications
A 2kb/s enhanced waveform interpolation speech coder
2004，2004 7th International Conference on Signal Processing Proceedings, ICSP
Low bit rates waveform interpolation speech coding based on singular value decomposition
2006，Acta Electronica Sinica

Source ：

Acta Electronica Sinica

ISSN： 0372-2112

Year： 2007

Issue： 1

Volume： 35

Page： 13-22

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

Faculty of Information Technology
