Authors:

Du, Jinlian | Mi, Wei | Du, Xiaolin

Indexed in:

CPCI-S, EI, Scopus

Abstract:

Word segmentation of electronic medical record (EMR) text is the basis of natural language processing in medicine. Because EMR text is highly specialized, costly to annotate, written in a distinctive style, and subject to continually growing terminology, current Chinese word segmentation (CWS) methods cannot fully meet the requirements of EMR applications. To address this problem, this paper designs an EMR word segmentation model based on a Graph Neural Network (GNN), a bidirectional Long Short-Term Memory network (Bi-LSTM), and a conditional random field (CRF), with the aim of improving segmentation quality and reducing dependence on the data set. In the model, a GNN built on the domain lexicon learns local composition features, the Bi-LSTM captures long-term dependencies and contextual sequence information, and the CRF obtains the optimal annotation sequence from sentence-level label information. Through multi-feature interaction, the model effectively resolves ambiguity and recognizes new words in EMR word segmentation. Compared with CWS tools such as Jieba and Pkuseg, as well as baseline models and state-of-the-art methods, the proposed model significantly improves precision and recall.
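
The abstract does not give implementation details, so the following is only a minimal sketch of the final step it describes: choosing the best sentence-level label sequence and turning it into words. It assumes the common BMES character-tagging scheme (Begin, Middle, End, Single), which the abstract does not name. The emission scores are made-up numbers standing in for Bi-LSTM outputs, and hard legal/illegal tag constraints stand in for learned CRF transition scores. The names `viterbi`, `tags_to_words`, `CHARS`, and `EMISSIONS` are hypothetical.

```python
import math

TAGS = ["B", "M", "E", "S"]  # Begin / Middle / End of a word, Single-character word
NEG = -math.inf

# Assumption: hard constraints replace learned CRF transition scores.
# A legal tag bigram scores 0.0, an illegal one scores -inf.
ALLOWED = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
           ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}
TRANSITIONS = {(p, c): (0.0 if (p, c) in ALLOWED else NEG)
               for p in TAGS for c in TAGS}


def viterbi(emissions):
    """Return the highest-scoring legal tag sequence.

    emissions: one dict per character, mapping each tag to a score.
    """
    if not emissions:
        return []
    # A sentence can only start with B or S.
    score = {t: (emissions[0][t] if t in ("B", "S") else NEG) for t in TAGS}
    backptrs = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for cur in TAGS:
            best_prev = max(TAGS, key=lambda p: score[p] + TRANSITIONS[(p, cur)])
            new_score[cur] = score[best_prev] + TRANSITIONS[(best_prev, cur)] + em[cur]
            ptr[cur] = best_prev
        score = new_score
        backptrs.append(ptr)
    # A sentence can only end with E or S.
    last = max(("E", "S"), key=lambda t: score[t])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]


def tags_to_words(chars, tags):
    """Group characters into words, closing a word at every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # keep any unfinished trailing word
        words.append(current)
    return words


# Toy input: made-up scores, not output of any trained model.
CHARS = list("患者发热")
EMISSIONS = [
    {"B": 2.0, "M": 0.1, "E": 0.1, "S": 1.0},
    {"B": 0.2, "M": 0.3, "E": 1.5, "S": 0.4},
    {"B": 1.0, "M": 1.2, "E": 0.1, "S": 0.3},  # per-character argmax is M, illegal after E
    {"B": 0.1, "M": 0.2, "E": 2.0, "S": 0.5},
]

tags = viterbi(EMISSIONS)
print(tags)                        # ['B', 'E', 'B', 'E']
print(tags_to_words(CHARS, tags))  # ['患者', '发热']
```

In the toy input, picking the best tag for each character independently would yield the illegal sequence B, E, M, E. Decoding over the whole sentence avoids this, which illustrates why a sequence-level layer is placed on top of per-character scores.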

Keywords:

Deep Learning; CWS; GNN; EMR

Author affiliations:

  • [ 1 ] [Du, Jinlian] Beijing Univ Technol, Dept Informat, Beijing, Peoples R China
  • [ 2 ] [Mi, Wei] Beijing Univ Technol, Dept Informat, Beijing, Peoples R China
  • [ 3 ] [Du, Xiaolin] Beijing Univ Technol, Dept Informat, Beijing, Peoples R China

Corresponding author:

  • [Mi, Wei] Beijing Univ Technol, Dept Informat, Beijing, Peoples R China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE

ISSN: 2156-1125

Year: 2020

Pages: 985-989

Language: English

Citation counts:

Web of Science Core Collection citations: 6

Scopus citations: 8

ESI Highly Cited Papers listings: 0
