Pronunciation quality evaluation approach based on bimodal fusion with noise adaptive weight

Author：

Jia, Xibin (贾熹滨) | Zhang, Kewei | Han, Yanfang | Powers, David

Indexed by：

EI Scopus

Abstract：

To meet the need for virtual pedagogy applications to evaluate English learners' pronunciation quality, the paper proposes an automatic assessment method based on a bimodal fusion decision algorithm. Pronunciation is scored by comparing the similarity between the learner's and the standard speaker's speech signals, separately for audio and video. The learner's final score is obtained by fusing these two scores with a linear weighted combination. Because visual speech is known to aid audio in human speech perception, especially in noisy environments, the paper proposes a noise-adaptive weighting strategy for the fusion process. To handle differences in utterance length caused by varying speaking rates, the paper adopts a dynamic warping algorithm to align the test utterances with the standard ones in time. Data selected from the Australian audio-visual speech corpus (AVOZES) are used to test the performance of the automatic evaluation system. The experimental results show that, compared with audio-only evaluation, the audio-visual fusion approach improves the rationality of the automatic pronunciation assessment system by making full use of the correlated and complementary information in acoustic and visual speech. © 2012 AICIT.
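
The pipeline the abstract describes (time alignment, per-modality similarity scoring, then a noise-dependent linear fusion) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the alignment step is standard dynamic time warping, and the distance-to-score mapping, the SNR-to-weight ramp, and all function and parameter names (`dtw_distance`, `similarity_score`, `audio_weight`, `fused_score`, `low`, `high`) are hypothetical choices made only for illustration.

```python
import math


def dtw_distance(a, b):
    """Length-normalised DTW distance between two sequences of feature vectors.

    Assumes textbook dynamic time warping with Euclidean frame distance;
    the record does not specify the variant used in the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def similarity_score(learner, standard):
    """Hypothetical mapping from DTW distance to a score in (0, 1]."""
    return 1.0 / (1.0 + dtw_distance(learner, standard))


def audio_weight(snr_db, low=0.0, high=30.0):
    """Hypothetical noise-adaptive weight: a clamped linear ramp over SNR (dB).

    The audio modality is trusted less as the estimated SNR drops.
    """
    return min(1.0, max(0.0, (snr_db - low) / (high - low)))


def fused_score(audio_score, video_score, snr_db):
    """Linear weighted combination of the audio and video scores."""
    w = audio_weight(snr_db)
    return w * audio_score + (1.0 - w) * video_score
```

With these assumed settings, a clean recording (30 dB or above) is scored on audio alone, a very noisy one (0 dB or below) on video alone, and intermediate noise levels blend the two scores.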

Keyword：

Quality control | Acoustic noise | Speech | Audio acoustics

Author Community：

[ 1 ] [Jia, Xibin] Multimedia and Intelligent Software Technology, Beijing Municipal Key Laboratory, Beijing University of Technology, Beijing, China
[ 2 ] [Zhang, Kewei] Multimedia and Intelligent Software Technology, Beijing Municipal Key Laboratory, Beijing University of Technology, Beijing, China
[ 3 ] [Han, Yanfang] Multimedia and Intelligent Software Technology, Beijing Municipal Key Laboratory, Beijing University of Technology, Beijing, China
[ 4 ] [Powers, David] School of Computer Science, Engineering and Maths, Beijing University of Technology, Beijing, China
[ 5 ] [Powers, David] School of Computer Science, Engineering and Maths, Flinders University of South Australia, Adelaide, Australia