An Identification Method of Question Subjects Based on Word Embedding and LSTM

Author：

Gao, M.X. | Fu, Z.X.

Indexed by：

EI Scopus

Abstract：

Identifying the subject of a question helps locate the question's domain, narrow the scope of the query, and provide users with better answers. Question text is usually short. To address its sparse features and irregular structure, this paper proposes an identification method of question subjects based on word embedding and LSTM (IQS-WE-L) and evaluates it on a question set from the MadSci website that covers three subjects. We first train Word2vec on the Wikipedia database to generate a dictionary. Then, based on the word vectors, we propose four feature extraction methods: W2V, W2V-TFIDF, W2V-c-TFIDF and W2V-c, which formalize the text features into vectors through word embedding and other features. Finally, we build an LSTM network for classification training to identify the subject of the question and quantitatively evaluate the effect of the four proposed feature extraction methods. Experimental data show that the proposed method can effectively identify the subject of a question: the F1 value reaches a maximum of 0.9339. © Published under licence by IOP Publishing Ltd.
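
The abstract names W2V-TFIDF as one of the four feature extraction methods but does not define it. As a rough illustration only, the sketch below shows one plausible reading: each word vector in a question is scaled by that word's TF-IDF weight, and the resulting sequence is what a sequence classifier such as an LSTM could consume. The function names, the smoothed IDF formula, and the toy 2-dimensional vectors are all assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed inverse document frequency for every token in the corpus (assumed formula)."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def tfidf_weighted_sequence(doc, vectors, idf):
    """Scale each in-vocabulary word vector by its TF-IDF weight, keeping word order."""
    counts = Counter(doc)
    seq = []
    for tok in doc:
        if tok not in vectors or tok not in idf:
            continue  # skip out-of-vocabulary tokens
        weight = (counts[tok] / len(doc)) * idf[tok]
        seq.append([weight * x for x in vectors[tok]])
    return seq


# Toy data: hypothetical 2-d vectors standing in for trained Word2vec output.
toy_vectors = {
    "why": [0.1, 0.3], "is": [0.0, 0.1], "sky": [0.9, 0.2],
    "blue": [0.8, 0.4], "cells": [0.2, 0.9], "divide": [0.3, 0.8],
}
toy_questions = [["why", "is", "sky", "blue"], ["why", "cells", "divide"]]

idf = idf_table(toy_questions)
sequence = tfidf_weighted_sequence(toy_questions[0], toy_vectors, idf)
```

Here `sequence` holds one weighted vector per token of the first toy question. In a real pipeline the vectors would come from a trained Word2vec model, and the sequences would be padded to a common length before being passed to the network.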

Keyword：

Feature extraction; Query processing; Long short-term memory; Embeddings; Extraction

Author Community：

[ 1 ] [Gao, M.X.] Department of Information Science, Beijing University of Technology, Pingleyuan 100, Chaoyang District, Beijing, China
[ 2 ] [Fu, Z.X.] Department of Information Science, Beijing University of Technology, Pingleyuan 100, Chaoyang District, Beijing, China

Reprint Author's Address：

[Gao, M.X.] Department of Information Science, Beijing University of Technology, Pingleyuan 100, Chaoyang District, Beijing, China

Email：

gaomx@bjut.edu.cn

Source ：

ISSN： 1742-6588

Year： 2020

Issue： 1

Volume： 1631

Language： English
