Abstract:
The wide adoption of electronic medical record (EMR) systems has caused rapid growth of medical and clinical data, making medical named entity recognition (NER) technology critical for finding useful patient information in medical datasets. However, medical terminology is inherently complex and ambiguous, and it is difficult to capture context-dependent representations with the supervision signal of a simple single-layer model. To address this problem, this paper proposes a hybrid model based on stacked Bidirectional Long Short-Term Memory (BiLSTM) for medical named entity recognition, which we call BSBC (BERT combined with stacked BiLSTM and CRF). First, we use Bidirectional Encoder Representations from Transformers (BERT), trained on an unlabeled dataset in an unsupervised manner, to obtain character-level embeddings. Then, a stacked BiLSTM captures context-dependent representations through its multiple hidden layers. Finally, a Conditional Random Field (CRF) predicts the sequence tags. Experimental results show that our method significantly outperforms the baseline methods and serves as a strong alternative to traditional approaches.
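The BERT → stacked BiLSTM → CRF pipeline described above can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: a plain embedding table stands in for BERT's character-level embeddings, the hyperparameters are placeholders, and the CRF layer is omitted (the linear projection here produces the per-character emission scores that a CRF would decode).

```python
import torch
import torch.nn as nn

class StackedBiLSTMTagger(nn.Module):
    """Sketch of a BSBC-style tagger (assumed hyperparameters).

    The embedding table is a stand-in for BERT character embeddings;
    in the paper, a CRF decodes the emission scores produced here.
    """
    def __init__(self, vocab_size=100, emb_dim=64, hidden=32, layers=2, n_tags=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)          # stand-in for BERT output
        self.bilstm = nn.LSTM(emb_dim, hidden, num_layers=layers,
                              bidirectional=True, batch_first=True)  # stacked BiLSTM
        self.proj = nn.Linear(2 * hidden, n_tags)             # emission scores for the CRF

    def forward(self, char_ids):
        h, _ = self.bilstm(self.emb(char_ids))                # context-dependent representations
        return self.proj(h)

model = StackedBiLSTMTagger()
char_ids = torch.randint(0, 100, (1, 10))   # one sentence of 10 characters
scores = model(char_ids)
print(scores.shape)                          # (batch, seq_len, n_tags)
```

Stacking the BiLSTM (`num_layers=2` here) is what gives the multi-layer hidden structure the abstract argues is needed to capture context dependencies that a single-layer model misses.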