Indexed in:
Abstract:
In recent years, deep neural networks have significantly improved performance on extractive summarization tasks compared with traditional methods. However, in biomedical extractive summarization, existing methods fail to make good use of domain-aware external knowledge, and existing deep neural network models ignore document structural features. In this paper, we propose a novel model called BioBERTSum to better capture token-level and sentence-level contextual representations. It uses a domain-aware bidirectional language model pre-trained on large-scale biomedical corpora as the encoder, and further fine-tunes the language model for the extractive summarization task on single biomedical documents. In particular, we adopt a sentence position embedding mechanism, which enables the model to learn the position information of sentences and capture the structural features of the document. To the best of our knowledge, this is the first work to use a pre-trained language model and a fine-tuning strategy for the extractive summarization task in the biomedical domain. Experiments on the PubMed dataset show that our proposed model outperforms the recent state-of-the-art (SOTA) model on ROUGE-1/2/L. (C) 2020 Elsevier B.V. All rights reserved.
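To make the sentence position embedding mechanism concrete, the following is a minimal PyTorch sketch of an extractive-summarization head of the kind the abstract describes: per-sentence encoder vectors are combined with a learned sentence position embedding and then scored for extraction. The abstract gives no architectural details, so the class name, dimensions, and the plain linear classifier here are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a BioBERTSum-style extractive head; the module
# name, hidden size, and linear scorer are assumptions for illustration.
import torch
import torch.nn as nn

class SentenceExtractorHead(nn.Module):
    """Scores sentences for extraction from per-sentence encoder vectors,
    adding a learned sentence position embedding so the model can use the
    document's structural (position) information."""

    def __init__(self, hidden_size=768, max_sentences=128):
        super().__init__()
        # One learned embedding per sentence position in the document.
        self.position_embedding = nn.Embedding(max_sentences, hidden_size)
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, sentence_vectors):
        # sentence_vectors: (batch, num_sentences, hidden_size), e.g. the
        # encoder states at each sentence's [CLS]-like boundary token.
        num_sentences = sentence_vectors.size(1)
        positions = torch.arange(num_sentences, device=sentence_vectors.device)
        x = sentence_vectors + self.position_embedding(positions)
        # One extraction probability per sentence.
        return torch.sigmoid(self.classifier(x)).squeeze(-1)

if __name__ == "__main__":
    head = SentenceExtractorHead()
    dummy = torch.randn(2, 10, 768)  # 2 documents, 10 sentences each
    print(head(dummy).shape)         # torch.Size([2, 10])
```

In a full pipeline, `sentence_vectors` would come from the fine-tuned biomedical language model encoder, and the top-scoring sentences would be selected as the extractive summary.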
Keywords:
Corresponding author information: