
Authors:

Wu, Xiaojun | Fang, Liying | Wang, Pu (scholar profile: 王普) | Yu, Nan

Indexed in:

CPCI-S

Abstract:

Chinese text classification is challenging, especially when the data are high-dimensional and sparse. This paper examines text representation and dimension reduction for Chinese text classification. First, we introduce a topic model, Latent Dirichlet Allocation (LDA), and use it as a dimension reduction method. Second, we choose the Support Vector Machine (SVM) as the classification algorithm. Next, we describe a text classification method based on LDA and SVM. Finally, we run experiments on a large number of Chinese text documents. The experimental results show that, compared with the traditional TF*IDF method, the LDA method achieves better classification accuracy and a shorter running time.
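
The dimension reduction idea in the abstract can be illustrated with a small sketch: a TF*IDF representation has one feature per vocabulary term, while an LDA representation has one feature per topic. The code below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say how LDA is inferred, so collapsed Gibbs sampling is used here as one common choice; the function names, the toy pre-segmented documents, and the hyperparameters (`num_topics`, `alpha`, `beta`, `iters`) are all hypothetical. The SVM step is omitted; the topic proportions returned here would serve as the input features to such a classifier.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) with one TF*IDF weight per vocabulary term."""
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))  # document frequency
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] / len(d) * math.log(n / df[w]) for w in vocab])
    return vocab, vectors


def lda_topic_features(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (assumed inference method).

    Returns one topic-proportion vector of length num_topics per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][index[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)
    # Resample each token's topic from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z, wi = assignments[d][i], index[w]
                doc_topic[d][z] -= 1
                topic_word[z][wi] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wi] + beta) / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wi] += 1
                topic_total[z] += 1
    # Smoothed per-document topic proportions; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


# Toy corpus, assumed to be already word-segmented and non-empty.
docs = [
    ["足球", "比赛", "球队", "进球"],
    ["球队", "比赛", "教练", "足球"],
    ["股票", "市场", "银行", "利率"],
    ["银行", "利率", "股票", "投资"],
]
vocab, tfidf = tfidf_vectors(docs)
theta = lda_topic_features(docs, num_topics=2)
print("TF*IDF features per document:", len(tfidf[0]))
print("LDA features per document:", len(theta[0]))
```

On a real corpus the vocabulary would contain many thousands of terms while the number of topics would stay small, which is where the reduction in dimensionality comes from.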

Keywords:

dimension reduction; LDA; text classification

Author affiliations:

  • [ 1 ] [Wu, Xiaojun]Beijing Univ Technol, Dept Elect Informat & Control Engn, Beijing, Peoples R China
  • [ 2 ] [Fang, Liying]Beijing Univ Technol, Dept Elect Informat & Control Engn, Beijing, Peoples R China
  • [ 3 ] [Wang, Pu]Beijing Univ Technol, Dept Elect Informat & Control Engn, Beijing, Peoples R China
  • [ 4 ] [Yu, Nan]Beijing Univ Technol, Dept Elect Informat & Control Engn, Beijing, Peoples R China

Corresponding author:

  • [Fang, Liying]Beijing Univ Technol, Dept Elect Informat & Control Engn, Beijing, Peoples R China

Source:

2015 IEEE 28TH CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE)

ISSN: 0840-7789

Year: 2015

Pages: 1260-1264

Language: English

Citation counts:

Web of Science Core Collection citations: 12

ESI Highly Cited Papers listing: 0
