
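Classification of tumor gene expression profile data based on PCA and LDA
Cite this article: LI Zhiwen, CAI Xianfa, WEI Jia, ZHOU Yi. Classification of tumor gene expression profile data based on PCA and LDA[J]. Beijing Biomedical Engineering (北京生物医学工程), 2014, 33(1): 47-51.
Authors: LI Zhiwen  CAI Xianfa  WEI Jia  ZHOU Yi
Affiliations: School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006; School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006; School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006; School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006
Funding: Fundamental Research Funds for the Central Universities, South China University of Technology (Grant No. 2009ZM0189)
Abstract: Objective: Gene chip technology has had a revolutionary impact on clinical diagnosis, treatment, and drug development and screening. High-dimensional medical data are difficult to reduce, and gene expression profile data have few samples, high dimensionality, and heavy noise, so dimensionality reduction is essential. Principal component analysis (PCA) and linear discriminant analysis (LDA) are used to address the classification of gene expression profile data and to improve the recognition rate. Methods: PCA and LDA were each applied to reduce the dimensionality of the gene expression profile data, a K-nearest neighbor (KNN) classifier was then used to classify the reduced data, and the approach was evaluated on breast cancer and ovarian cancer mass spectrometry data. Results: On both cancer mass spectrometry datasets, PCA and LDA effectively extracted discriminative feature information and greatly reduced the dimensionality of the medical data while maintaining high classification accuracy. Conclusions: Classifying cancer gene expression profile data with dimensionality reduction methods can help clinicians discover new disease characteristics and improve diagnostic accuracy.

Keywords: principal component analysis  linear discriminant analysis  gene expression data classification  dimensionality reduction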

Classification of cancer gene expression profile based on PCA and LDA
LI Zhiwen, CAI Xianfa, WEI Jia, ZHOU Yi. Classification of cancer gene expression profile based on PCA and LDA[J]. Beijing Biomedical Engineering, 2014, 33(1): 47-51.
Authors:LI Zhiwen  CAI Xianfa  WEI Jia  ZHOU Yi
Institution: 1. Medical Information Engineering School, Guangdong Pharmaceutical University, Guangzhou 510006; 2. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006
Abstract: Objective: Gene chip technology has had a revolutionary influence on clinical diagnosis, treatment, and drug development and screening. High-dimensional medical data are difficult to reduce, and gene expression profiles have small sample sizes, high dimensionality, and heavy noise, so feature reduction is essential. The experimental results demonstrate that methods based on principal component analysis (PCA) and linear discriminant analysis (LDA) can effectively address the classification of gene expression profiles while maintaining high classification accuracy. Methods: PCA and LDA were used to extract features and reduce the dimensionality, and K-nearest neighbor (KNN) was then used as the classifier. Results: Experiments on breast cancer and ovarian cancer datasets demonstrated that the PCA and LDA methods could effectively extract feature information and greatly reduce the dimensionality of medical data while maintaining high classification accuracy. Conclusions: Applying feature reduction methods to the classification of cancer gene expression data can help clinicians discover new disease characteristics and improve diagnostic accuracy.
Keywords: principal component analysis  linear discriminant analysis  gene expression data classification  feature reduction
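
The pipeline described in the Methods (reduce the dimensionality with PCA or LDA, then classify with KNN) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic stand-ins for the cancer datasets, the helper names (`pca_fit`, `lda_fit`, `knn_predict`) are hypothetical, and the choices of 5 principal components, K = 3, and a small ridge term in the LDA scatter matrix (needed because there are fewer samples than features) are assumptions not stated in the abstract.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (as columns)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T

def lda_fit(X, y, ridge=1e-3):
    """Two-class Fisher direction: w proportional to (Sw + ridge*I)^-1 (m1 - m0)."""
    c0, c1 = np.unique(y)
    m0, m1 = X[y == c0].mean(axis=0), X[y == c1].mean(axis=0)
    sw = np.zeros((X.shape[1], X.shape[1]))
    for c, m in ((c0, m0), (c1, m1)):
        centered = X[y == c] - m
        sw += centered.T @ centered          # within-class scatter
    sw += ridge * np.eye(X.shape[1])         # assumed regularizer: Sw is singular here
    w = np.linalg.solve(sw, m1 - m0)
    return (w / np.linalg.norm(w)).reshape(-1, 1)

def knn_predict(z_train, y_train, z_test, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(z_test[:, None, :] - z_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(y_train[row]).argmax() for row in nearest])

# Synthetic "few samples, many features" data: 40 samples per class, 200 features,
# with only the first 10 features carrying class information.
rng = np.random.default_rng(0)
n_features = 200
x0 = rng.normal(size=(40, n_features))
x1 = rng.normal(size=(40, n_features))
x1[:, :10] += 3.0

X_train = np.vstack([x0[:30], x1[:30]])
y_train = np.array([0] * 30 + [1] * 30)
X_test = np.vstack([x0[30:], x1[30:]])
y_test = np.array([0] * 10 + [1] * 10)

# PCA followed by KNN: 200 features reduced to 5.
mean, components = pca_fit(X_train, n_components=5)
pred_pca = knn_predict((X_train - mean) @ components, y_train,
                       (X_test - mean) @ components)
acc_pca = float((pred_pca == y_test).mean())

# LDA followed by KNN: 200 features reduced to 1 (two classes give one direction).
w = lda_fit(X_train, y_train)
pred_lda = knn_predict(X_train @ w, y_train, X_test @ w)
acc_lda = float((pred_lda == y_test).mean())

print(f"PCA + KNN accuracy: {acc_pca:.2f}")
print(f"LDA + KNN accuracy: {acc_lda:.2f}")
```

Both projections are fitted on the training split only and then applied to the test split, so the reported accuracy is not inflated by information from the test samples.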
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).