Research on Text Classification Method Based on Word2vec and Improved TF-IDF

Authors:

Zhang, Tao (张涛) | Wang, LuYao

Indexed by:

EI; Scopus

Abstract:

TF-IDF is widely used as the most common feature weight calculation method. The traditional TF-IDF feature extraction method lacks the representation of the distribution difference between classes in the text classification task and the feature matrix generated by the TF-IDF is huge and sparse. Based on this situation, this paper proposes a method of using the feature extraction algorithm of chi-square statistics to compensate for the distribution difference between classes and generating a fixed-dimensional real matrix through word2vec. The experimental results show that the new method is significantly better than the traditional feature extraction methods in the evaluation results such as precision, recall, F1 and ROC_AUC. © 2020, Springer Nature Switzerland AG.
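
The abstract names three ingredients (TF-IDF weights, chi-square statistics per class, and word2vec vectors) but does not give the formula that combines them. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes each term's weight is its TF-IDF value multiplied by its largest chi-square score over the classes, and that a document vector is the weighted average of its word vectors. The function names, the smoothed IDF variant, the toy corpus, and the hand-written 2-dimensional `embeddings` (standing in for a trained word2vec model) are all illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (smoothed IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return weights


def chi_square(docs, labels):
    """Return {term: max chi-square score over classes} from document counts."""
    n = len(docs)
    scores = {}
    for term in {t for doc in docs for t in doc}:
        best = 0.0
        for cls in set(labels):
            pairs = list(zip(docs, labels))
            a = sum(1 for d, y in pairs if y == cls and term in d)      # in class, has term
            b = sum(1 for d, y in pairs if y != cls and term in d)      # other class, has term
            c = sum(1 for d, y in pairs if y == cls and term not in d)  # in class, no term
            d_ = sum(1 for d, y in pairs if y != cls and term not in d)  # other class, no term
            denom = (a + c) * (b + d_) * (a + b) * (c + d_)
            if denom:
                best = max(best, n * (a * d_ - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def doc_vector(weights, chi2_scores, embeddings, dim):
    """Weighted average of word vectors; weight = tf-idf * chi-square (assumed)."""
    vec = [0.0] * dim
    total = 0.0
    for term, w in weights.items():
        if term not in embeddings:
            continue  # skip out-of-vocabulary terms
        s = w * chi2_scores.get(term, 0.0)
        total += s
        for i, x in enumerate(embeddings[term]):
            vec[i] += s * x
    return [v / total for v in vec] if total else vec


# Toy corpus and hand-written 2-d vectors standing in for word2vec output.
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "price"],
]
labels = ["sport", "sport", "finance", "finance"]
embeddings = {
    "goal": [1.0, 0.0], "match": [0.9, 0.1], "team": [0.8, 0.2], "score": [0.9, 0.0],
    "stock": [0.0, 1.0], "market": [0.1, 0.9], "price": [0.2, 0.8], "trade": [0.0, 0.9],
}

chi2 = chi_square(docs, labels)
vectors = [doc_vector(w, chi2, embeddings, dim=2) for w in tf_idf(docs)]
```

Each document ends up as a dense vector whose length equals the embedding dimension, regardless of vocabulary size, which is the "fixed-dimensional real matrix" property the abstract contrasts with the large sparse TF-IDF matrix.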

Keywords:

Intelligent systems; Feature extraction; Text processing; Extraction; Classification (of information)

Author Affiliations:

[ 1 ] [Zhang, Tao] School of Software, Beijing University of Technology, Beijing, China
[ 2 ] [Wang, LuYao] School of Software, Beijing University of Technology, Beijing, China

Reprint Author's Address:

[Wang, LuYao] School of Software, Beijing University of Technology, Beijing, China

Email:

358226756@qq.com

Related Articles:

A feature extraction method based on combined wavelets filter in speech recognition
2008, 2008 IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2008
SYGNET: A SVD-YOLO BASED GHOSTNET FOR REAL-TIME DRIVING SCENE PARSING
2022, 29th IEEE International Conference on Image Processing, ICIP 2022
Temporal Action Detection Based on Temporal Convolutional Network with Coarse-Grained Clips Construction
2019, 14th IEEE International Conference on Intelligent Systems and Knowledge Engineering, ISKE 2019
Forward Collision Warning system based on vehicle detection and tracking
2016, 2016 International Conference on Optoelectronics and Image Processing, ICOIP 2016

Source:

ISSN: 2194-5357

Year: 2020

Volume: 1084 AISC

Page: 199-205

Language: English

Cited Count:

WoS CC Cited Count: 0

SCOPUS Cited Count: 5

ESI Highly Cited Papers on the List: 0

WanFang Cited Count:

Chinese Cited Count:

Affiliated Colleges:

Faculty of Information Technology
