Weighted-LDA-TVM: Using a Weighted Topic Vector Model for Measuring Short Text Similarity

Author：

He, Xiaobo | Zhong, Ning | Chen, Jianhui

Abstract：

Topic modeling is a core task in measuring the similarity of short texts and is widely used in information retrieval and sentiment analysis. Although latent Dirichlet allocation (LDA) provides a way to model texts by mining their underlying semantic themes, it often yields low accuracy in text similarity calculation because short texts have sparse features and poor topic focus. This paper proposes a similarity measurement method for short texts based on a new topic model, Weighted-LDA-TVM. Latent Dirichlet allocation is adopted to capture the latent topics of short texts, and the topic weights are learned using particle swarm optimization. Finally, a text vector is constructed from the word embeddings of the weighted topics and used to measure the similarity of short texts. A group of text similarity measurement experiments was performed on biomedical literature abstracts about antidepressant drugs. The experimental results show that the proposed model has better discriminative ability and semantic representation ability for the similarity measurement of short texts. © 2019, Springer Nature Switzerland AG.
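
The abstract does not give the exact formulas, so the following is only a minimal sketch of one plausible reading of the pipeline: each topic is represented by the average embedding of its top words, a text vector is the sum of topic vectors scaled by the document's topic probability and a per-topic weight, and similarity is the cosine between text vectors. The function names, the toy embeddings, the topic probabilities, and the topic weights are all hypothetical. In the paper the weights are learned with particle swarm optimization and the topics come from LDA; neither step is implemented here.

```python
import math

def topic_vector(top_words, embeddings):
    """Average the embeddings of a topic's top words (an assumed construction)."""
    dim = len(embeddings[top_words[0]])
    vec = [0.0] * dim
    for word in top_words:
        for i, value in enumerate(embeddings[word]):
            vec[i] += value
    return [value / len(top_words) for value in vec]

def text_vector(doc_topic_probs, topic_weights, topic_vectors):
    """Sum over topics of weight * p(topic | doc) * topic vector."""
    dim = len(topic_vectors[0])
    vec = [0.0] * dim
    for prob, weight, tvec in zip(doc_topic_probs, topic_weights, topic_vectors):
        for i in range(dim):
            vec[i] += weight * prob * tvec[i]
    return vec

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy 2-D word embeddings (hypothetical values, for illustration only).
embeddings = {
    "drug": [1.0, 0.0], "dose": [0.8, 0.2],
    "mood": [0.0, 1.0], "sleep": [0.2, 0.8],
}
# Top words of two topics, standing in for LDA output.
topics = [["drug", "dose"], ["mood", "sleep"]]
topic_vectors = [topic_vector(words, embeddings) for words in topics]

# Hypothetical topic weights, standing in for the PSO-learned weights.
topic_weights = [1.5, 0.5]

# Hypothetical document-topic distributions for three short texts.
doc_a = [0.9, 0.1]
doc_b = [0.8, 0.2]
doc_c = [0.1, 0.9]

vec_a = text_vector(doc_a, topic_weights, topic_vectors)
vec_b = text_vector(doc_b, topic_weights, topic_vectors)
vec_c = text_vector(doc_c, topic_weights, topic_vectors)

sim_ab = cosine(vec_a, vec_b)
sim_ac = cosine(vec_a, vec_c)
print("similarity(a, b):", sim_ab)
print("similarity(a, c):", sim_ac)
```

In this toy setup, texts a and b share a dominant topic and so score higher than a and c. Changing `topic_weights` shifts how much each topic contributes to the text vector, which is the quantity an optimizer such as PSO would tune against a similarity objective.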

Keyword：

Particle swarm optimization (PSO) | Text mining | Semantics | Embeddings | Statistics | Sentiment analysis

Author Affiliations：

[ 1 ] [He, Xiaobo]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 2 ] [Zhong, Ning]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 3 ] [Zhong, Ning]Beijing International Collaboration Base on Brain Informatics and Wisdom Services, Beijing; 100124, China
[ 4 ] [Zhong, Ning]Beijing Key Laboratory of MRI and Brain Informatics, Beijing; 100124, China
[ 5 ] [Zhong, Ning]Department of Life Science and Informatics, Maebashi Institute of Technology, Maebashi; Gunma; 371-0816, Japan
[ 6 ] [Chen, Jianhui]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[ 7 ] [Chen, Jianhui]Beijing Advanced Innovation Center for Future Internet Technology, Beijing University of Technology, Beijing; 100124, China

Reprint Author's Address：

Zhong, Ning (钟宁)
[Zhong, Ning]Faculty of Information Technology, Beijing University of Technology, Beijing; 100124, China
[Zhong, Ning]Beijing Key Laboratory of MRI and Brain Informatics, Beijing; 100124, China
[Zhong, Ning]Beijing International Collaboration Base on Brain Informatics and Wisdom Services, Beijing; 100124, China
[Zhong, Ning]Department of Life Science and Informatics, Maebashi Institute of Technology, Maebashi; Gunma; 371-0816, Japan

Email：

zhong@maebashi-it.ac.jp
