收录:
摘要:
In clinical medicine, multidimensional time series data can be used to find the rules of disease progress by data mining technology, such as classification and prediction. However, in multidimensional time series data mining problems, the excessive data dimension causes the inaccuracy of probability density distribution to increase the computational complexity. Besides, information redundancy and irrelevant features may lead to high computational complexity and over-fitting problems. The combination of these two factors can reduce the classification performance. To reduce computational complexity and to eliminate information redundancies and irrelevant features, we improved upon a multidimensional time series feature selection method to achieve dimension reduction. The improved method selects features through the combination of the Kozacbenko-Leonenko (K-L) information entropy estimation method for feature extraction based on mutual information and the feature selection algorithm based on class separability. We performed experiments on the Electroencephalogram (EEG) dataset for verification and the non-small cell lung cancer (NSCLC) clinical dataset for application. The results show that with the comparison of CLeVer, Corona and AGV, respectively, the improved method can effectively reduce the dimensions of multidimensional time series for clinical data. (C) 2015 The Authors. Published by Elsevier Ltd.
关键词:
通讯作者信息:
电子邮件地址: