Indexed in:
Abstract:
The paper aims to establish an effective feature representation of visual speech for Chinese viseme recognition. We propose and discuss a representation model of visual speech based on the local binary pattern (LBP) and the discrete cosine transform (DCT) of mouth images. The joint model combines the advantages of local and global texture information, and shows better performance than using the global feature alone. LBP and DCT features are computed for each mouth frame captured while the subject is speaking; a Hidden Markov Model (HMM) is then trained on the training dataset and employed to recognize new visual speech. Experiments show that this visual speech feature model performs well in classifying different speaking states.
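The abstract describes joining a local texture descriptor (an LBP code histogram) with a global one (low-frequency DCT coefficients) into one feature vector per mouth frame. The paper itself is not reproduced here, so the sketch below is only an illustrative minimal variant under assumed details: a basic 8-neighbour LBP, an orthonormal 2-D DCT-II, and a top-left DCT block in place of whatever coefficient-selection scheme the authors actually used. Function names (`lbp_8`, `dct2`, `mouth_features`) and the block size `n_dct` are hypothetical.

```python
import numpy as np

def lbp_8(img):
    """Basic 8-neighbour LBP code for each interior pixel (assumed variant)."""
    c = img[1:-1, 1:-1]
    # fixed clockwise neighbour order; each comparison contributes one bit
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (n >= c).astype(np.uint8) << bit
    return code

def dct2(img):
    """Orthonormal 2-D DCT-II, built from a 1-D DCT matrix."""
    def dct_mat(n):
        k = np.arange(n)[:, None]
        m = np.cos(np.pi * (2 * np.arange(n)[None, :] + 1) * k / (2 * n))
        m[0] /= np.sqrt(2)
        return m * np.sqrt(2.0 / n)
    return dct_mat(img.shape[0]) @ img.astype(float) @ dct_mat(img.shape[1]).T

def mouth_features(img, n_dct=15):
    """Joint feature: LBP histogram (local) + low-frequency DCT block (global)."""
    hist, _ = np.histogram(lbp_8(img), bins=256, range=(0, 256), density=True)
    coeffs = dct2(img)
    global_part = coeffs[:n_dct, :n_dct].ravel()  # simplification of coefficient selection
    return np.concatenate([hist, global_part])
```

A sequence of such per-frame vectors (one per mouth image in an utterance) would then form the observation sequence on which a per-viseme HMM is trained and scored, as the abstract outlines.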
Keywords:
Corresponding author:
Email address:
Source:
AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION
ISSN: 1867-5662
Year: 2012
Volume: 137
Pages: 101-107
Language: English
Affiliated department: