A hybrid feature selection method for DNA microarray data |
| |
Authors: | Chuang Li-Yeh; Yang Cheng-Huei; Wu Kuo-Chuan; Yang Cheng-Hong |
| |
Institution: | a Department of Chemical Engineering, I-Shou University, Kaohsiung 80041, Taiwan; b Department of Electronic Communication Engineering, National Kaohsiung Marine University, Kaohsiung 81157, Taiwan; c Department of Computer Science and Information Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80708, Taiwan; d Department of Network Systems, Toko University, Chiayi 61363, Taiwan; e Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung 80708, Taiwan |
| |
Abstract: | Gene expression profiles, which represent the state of a cell at the molecular level, have great potential as a medical diagnosis tool. In cancer classification, the available training data sets generally contain far fewer samples than genes, which poses a challenge for certain classification methods. Feature (gene) selection can be used to extract the genes that directly influence classification accuracy and to eliminate genes that have no influence on it. This significantly improves computational performance and classification accuracy. In this paper, correlation-based feature selection (CFS) and the Taguchi-genetic algorithm (TGA) were combined into a hybrid method, and the K-nearest neighbor (KNN) method with leave-one-out cross-validation (LOOCV) served as the classifier for computing classification accuracy on eleven classification profiles. Experimental results show that the proposed method reduced redundant features effectively and achieved superior classification accuracy. Compared with other classification methods from the literature, the proposed method obtained higher classification accuracy on ten of the eleven gene expression data set test problems. |
| |
Keywords: | Feature selection; Taguchi-genetic algorithm; K-nearest neighbor; Leave-one-out cross-validation |
This article is indexed in ScienceDirect, PubMed, and other databases. |
|
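The evaluation step named in the abstract, scoring a candidate gene subset by KNN accuracy under LOOCV, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the distance metric or the value of K, so Euclidean distance and `k=1` are assumptions here, and the function name `loocv_knn_accuracy`, its `features` argument, and the toy data are hypothetical. The CFS and TGA search stages are not implemented; `features` simply stands in for whatever subset they would propose.

```python
import math
from collections import Counter


def loocv_knn_accuracy(samples, labels, k=1, features=None):
    """Leave-one-out accuracy of a KNN classifier restricted to `features`.

    Assumptions (not taken from the paper): Euclidean distance, majority vote.
    """
    n = len(samples)
    if features is None:
        features = range(len(samples[0]))
    features = list(features)

    correct = 0
    for i in range(n):
        # Hold out sample i and measure its distance to every other sample,
        # using only the selected feature (gene) indices.
        dists = []
        for j in range(n):
            if j == i:
                continue
            d = math.sqrt(sum((samples[i][f] - samples[j][f]) ** 2 for f in features))
            dists.append((d, j))
        dists.sort()

        # Majority vote among the k nearest neighbours.
        votes = Counter(labels[j] for _, j in dists[:k])
        predicted = votes.most_common(1)[0][0]
        if predicted == labels[i]:
            correct += 1
    return correct / n


# Toy data (hypothetical): column 0 separates the classes, column 1 is noise.
samples = [
    [0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
    [1.0, 5.1], [1.1, 1.1], [1.2, 9.1],
]
labels = ["A", "A", "A", "B", "B", "B"]

acc_selected = loocv_knn_accuracy(samples, labels, k=1, features=[0])
acc_all = loocv_knn_accuracy(samples, labels, k=1)
print("accuracy with feature 0 only:", acc_selected)
print("accuracy with all features:", acc_all)
```

In a wrapper-style search, a score like this would serve as the fitness of each candidate subset, which is why dropping an uninformative column can raise the measured accuracy on the toy data.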