Abstract:
Feature selection is the process of choosing the subset of features most relevant to a given application, and it is an important step in text classification. Building on the mutual information method, we designed a variance-mean based feature selection method (VM). For each word, we compute the variance of its class discrimination value vector, rank the words by that variance, and select the most discriminative ones as features. The method is advantageous when only a small number of features is selected, especially for classes with few training documents. It retains the best features and thus improves the final performance of the classification system. Experimental results indicate the effectiveness of the proposed feature selection method in text classification.
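The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the following are assumptions rather than the authors' definitions: the class discrimination value of a word for a class is taken to be a smoothed pointwise mutual information, log P(w|c) / P(w), estimated from document counts, and words are ranked by the population variance of that vector across classes. The name "variance-mean" suggests a mean term may also be involved, which this sketch does not model. The function name `vm_rank` and the `smoothing` parameter are hypothetical.

```python
import math
from statistics import pvariance


def vm_rank(docs, labels, top_k, smoothing=1.0):
    """Return the top_k words ranked by the variance of their per-class scores.

    docs   : list of token lists, one per document
    labels : list of class labels, aligned with docs
    Assumption: the per-class score is a smoothed PMI, log P(w|c) / P(w).
    """
    classes = sorted(set(labels))
    n_docs = len(docs)
    class_count = {c: 0 for c in classes}
    word_count = {}   # word -> number of documents containing it
    joint_count = {}  # (word, class) -> number of class documents containing it

    for tokens, c in zip(docs, labels):
        class_count[c] += 1
        for w in set(tokens):  # document frequency, not term frequency
            word_count[w] = word_count.get(w, 0) + 1
            joint_count[(w, c)] = joint_count.get((w, c), 0) + 1

    scores = {}
    for w, n_w in word_count.items():
        p_w = (n_w + smoothing) / (n_docs + 2 * smoothing)
        vec = []
        for c in classes:
            n_wc = joint_count.get((w, c), 0)
            p_w_given_c = (n_wc + smoothing) / (class_count[c] + 2 * smoothing)
            vec.append(math.log(p_w_given_c / p_w))
        # A word that separates classes well has widely spread scores.
        scores[w] = pvariance(vec)

    # Highest variance first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    docs = [
        ["goal", "match", "team"],
        ["goal", "team", "the"],
        ["vote", "election", "the"],
        ["vote", "party", "the"],
    ]
    labels = ["sport", "sport", "politics", "politics"]
    print(vm_rank(docs, labels, top_k=3))
```

In this toy corpus, words confined to one class (such as "goal" or "vote") receive a higher variance than a word spread across both classes (such as "the"), which is the behavior the abstract attributes to the ranking.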
Source:
PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON EDUCATION TECHNOLOGY AND COMPUTER SCIENCE, VOL III
Year: 2009
Page: 519-522
Language: English
WoS CC Cited Count: 3
SCOPUS Cited Count: 4
ESI Highly Cited Papers on the List: 0