Indexed in:
Abstract:
Word segmentation is a fundamental task in natural language processing, and improving its accuracy is a key problem. With the popularity of microblogs, accurate word segmentation of microblog text has become an active research topic. However, microblog texts often contain information from multiple related domains, and words that are ambiguous across domains reduce segmentation accuracy. Building on word vector models and branching entropy, this paper proposes a multi-domain global correlation degree branching entropy method for microblog text word segmentation. The model is applied to microblog text on the topic of house prices in Beijing. Its precision, recall and F-measure are compared with those of the branching entropy model proposed by Zhang [6], and the experimental results show that the proposed method outperforms it. © 2020 ACM.
Keywords:
Corresponding author information:
Email address:
Source:
Year: 2020
Pages: 71-75
Language: English
Affiliated department: