Authors:

Gao, Mingxia | Lu, Jianguo | Chen, Furong

Indexed in:

EI, Scopus

Abstract:

Medical Knowledge Graph Completion aims to automatically predict one of the three parts of an RDF triple (head entity, relationship, or tail entity) from medical data, mainly text. Since their introduction, pretrained language models such as Word2vec, BERT, and XLNet have become a popular tool for completing Medical Knowledge Graphs. Existing work focuses mainly on relationship completion and has rarely addressed the completion of entities and related triples. This paper proposes a framework for predicting RDF triples for Medical Knowledge Graphs based on word embeddings (named PTMKG-WE), aimed specifically at completing entities and triples. The framework first formalizes existing samples of a given relationship from the Medical Knowledge Graph as prior knowledge. Second, it uses Word2vec to train word embeddings on big medical data according to that prior knowledge. Third, it acquires candidate triples from the word embeddings by analogy with existing samples. Within this framework, the paper proposes two strategies to improve the relation features. The first refines the relational semantics by clustering existing triple samples. The second embeds the expression of the relationship more accurately by using the means of existing samples. The two strategies can be used separately (PTMKG-WE-C and PTMKG-WE-M, respectively) or combined (PTMKG-WE-C-M). Finally, PubMed data and the National Drug File-Reference Terminology (NDF-RT) were collected and a series of experiments was conducted. The results show that the proposed framework and the two improvement strategies can predict new triples for Medical Knowledge Graphs when medical data are sufficiently abundant and the Knowledge Graph has appropriate prior knowledge. The two strategies have a significant effect on improving precision, and the effect is more pronounced when they are combined. Another conclusion is that, under the same parameter settings, the semantic precision of the word embeddings can be improved by extending the breadth and depth of the data, which in most cases further improves the precision of the prediction framework. Thus, collecting and training on big medical data is a viable way to learn more useful knowledge. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
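
The analogy step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 3-dimensional vectors, the entity names, and the helper functions `mean_relation_offset` and `predict_tails` are all assumptions made for illustration. In practice the embeddings would come from a Word2vec model trained on medical text. The sketch averages the tail-minus-head offsets of known sample pairs, which is one plausible reading of the "means of existing samples" strategy, and ranks candidate tails by cosine similarity.

```python
import numpy as np

# Hypothetical toy embeddings; real ones would be trained with Word2vec.
emb = {
    "aspirin":   np.array([1.0, 0.0, 0.0]),
    "headache":  np.array([1.0, 1.0, 0.0]),
    "insulin":   np.array([0.0, 0.0, 1.0]),
    "diabetes":  np.array([0.0, 1.0, 1.0]),
    "metformin": np.array([0.1, 0.0, 0.9]),
    "fever":     np.array([0.9, 1.0, 0.1]),
    "ibuprofen": np.array([0.9, 0.0, 0.1]),
}

def mean_relation_offset(samples, emb):
    """Average the (tail - head) vectors of known (head, tail) pairs."""
    return np.mean([emb[t] - emb[h] for h, t in samples], axis=0)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_tails(head, offset, emb, top_k=3):
    """Rank every other word by similarity to head + relation offset."""
    target = emb[head] + offset
    scored = [(w, cosine(target, v)) for w, v in emb.items() if w != head]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Existing samples of one relation, used as prior knowledge (illustrative).
samples = [("aspirin", "headache"), ("insulin", "diabetes")]
offset = mean_relation_offset(samples, emb)
candidates = predict_tails("metformin", offset, emb)
print(candidates)
```

With these toy vectors the top-ranked candidate tail for "metformin" is "diabetes". Averaging the offsets over several samples is intended to smooth out noise in any single pair, at the cost of blurring relations that have several distinct senses, which is where a clustering step could help.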

Keywords:

Semantics; Resource Description Framework (RDF); Embeddings; Forecasting; Knowledge graph

Author affiliations:

  • [ 1 ] [Gao, Mingxia]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
  • [ 2 ] [Lu, Jianguo]School of Computer Science, University of Windsor, Windsor; ON; N9B 3P4, Canada
  • [ 3 ] [Chen, Furong]TravelSky Technology Limited, Beijing; 101300, China

Source:

Information (Switzerland)

Year: 2022

Issue: 4

Volume: 13

Citation counts:

Scopus citation count: 9

ESI highly cited paper listings: 0
