Indexed in:
Abstract:
Named Entity Recognition (NER) is a fundamental task in natural language processing and an indispensable component of machine translation, knowledge graph construction, and other applications. This paper proposes a fusion model for Chinese named entity recognition that combines BERT, a Bidirectional LSTM (BiLSTM), and a Conditional Random Field (CRF). In this model, a Chinese BERT serves as the word embedding model and generates word vectors; the BiLSTM then learns the label distribution over these word vectors; finally, a CRF imposes sentence-level syntactic constraints to produce the annotation sequence. In addition, Whole Word Masking (wwm) can replace the original random masking in BERT's pre-training, which effectively addresses the problem that Chinese words in NER are only partially masked, thereby improving the performance of the NER model. Comparative experiments using BERT-wwm (BERT pre-trained with Whole Word Masking), BERT, ELMo, and Word2Vec demonstrate the effect of BERT-wwm in this fusion model. The results show that using Chinese BERT-wwm as the language representation model of the NER model yields better recognition ability. © 2020 ACM.
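The CRF layer described in the abstract selects, for each sentence, the globally best tag sequence given the per-token scores produced by the BiLSTM. A minimal sketch of that decoding step (Viterbi decoding) is shown below in plain Python; the tag set, transition scores, and emission scores are illustrative stand-ins, not values from the paper.

```python
# Viterbi decoding, as used by the CRF layer of a BERT-BiLSTM-CRF tagger.
# Emission scores stand in for BiLSTM outputs; tags/transitions are hypothetical.

TAGS = ["O", "B-PER", "I-PER"]

# transitions[i][j]: score of moving from tag i to tag j.
# A large negative score forbids illegal moves such as "O -> I-PER".
NEG = -10000.0
transitions = [
    [0.0, 0.0, NEG],   # from O
    [0.0, 0.0, 1.0],   # from B-PER
    [0.0, 0.0, 1.0],   # from I-PER
]

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence for one sentence.

    emissions: per-token score lists, shape [seq_len][num_tags],
    as would be produced by the BiLSTM layer.
    """
    num_tags = len(emissions[0])
    # scores[j]: best score of any tag path ending in tag j at the current token
    scores = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        bp, new_scores = [], []
        for j in range(num_tags):
            best_i = max(range(num_tags),
                         key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emit[j])
            bp.append(best_i)
        scores = new_scores
        backpointers.append(bp)
    # Trace the best path backwards from the best final tag.
    best_last = max(range(num_tags), key=lambda j: scores[j])
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    path.reverse()
    return [TAGS[t] for t in path]

# Example: emission scores for a 3-token sentence.
emissions = [
    [0.1, 2.0, 0.0],  # token 1: likely B-PER
    [0.2, 0.1, 1.5],  # token 2: likely I-PER
    [2.0, 0.1, 0.0],  # token 3: likely O
]
print(viterbi_decode(emissions, transitions))  # -> ['B-PER', 'I-PER', 'O']
```

Unlike greedy per-token prediction, this sentence-level search lets the transition scores rule out invalid label sequences, which is the role the CRF plays on top of the BiLSTM in the proposed model.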
Keywords:
Corresponding author:
Email address: