Abstract:
Feature extraction is essential for text classification. This paper discusses the basic ideas behind word-clustering-based feature extraction and presents a text classification method that extracts features by means of word clustering. The method employs an improved tree-structured growing self-organizing map (TGSOM) to cluster words. A new weighting formula is developed that takes into account the distinction between clustered word features and plain word features. Finally, the SPRINT decision tree is applied to perform the classification. Experiments show that the proposed method improves the precision of text classification by 4.32%.
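The basic idea of word-clustering-based feature extraction can be illustrated with a minimal sketch. It is not the paper's implementation: the word-to-cluster table below is a hypothetical, hand-written stand-in for the output of TGSOM, the names `WORD_TO_CLUSTER` and `cluster_features` are invented for illustration, and plain counts are used in place of the paper's weighting formula, which the abstract does not give.

```python
from collections import Counter

# Hypothetical cluster assignment; in the paper this mapping would come
# from TGSOM word clustering, not from a hand-written table.
WORD_TO_CLUSTER = {
    "football": "c_sport", "soccer": "c_sport", "goal": "c_sport",
    "stock": "c_finance", "market": "c_finance", "bond": "c_finance",
}

def cluster_features(tokens, word_to_cluster):
    """Count features, replacing each clustered word with its cluster ID.

    Words without a cluster are kept as plain word features.
    """
    return Counter(word_to_cluster.get(token, token) for token in tokens)

doc = "soccer goal football stock referee".split()
features = cluster_features(doc, WORD_TO_CLUSTER)
```

Here the three sport-related words collapse into a single clustered feature, while "referee" remains a plain word feature. This is the distinction that a weighting scheme would need to account for.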