
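Random Forest Method for Discriminant Analysis of Gene Expression Data
Cite this article: Wu Xiaoyan, Li Kang. Random Forest Method for Discriminant Analysis of Gene Expression Data [J]. Chinese Journal of Health Statistics (中国卫生统计), 2006, 23(6): 491-494
Authors: Wu Xiaoyan (武晓岩), Li Kang (李康)
Affiliation: Department of Health Statistics, Harbin Medical University, 150001
Funding: National Natural Science Foundation of China; Heilongjiang Province research project
Abstract: Objective: To explore the application of the random forest algorithm to the classification of gene expression data. Methods: The algorithm's performance was assessed on real gene expression data, and simulation experiments were used to further verify and study how the presence of a large number of non-differentially expressed genes affects classification. Results: The random forest algorithm classified gene expression data with high accuracy, but its discriminant performance tended to decline as the number of genes increased. When the differentially expressed genes were correlated, the decline slowed markedly and a fairly good classification result could still be obtained. Conclusion: The random forest algorithm has good discriminant performance in classification studies of gene expression data.

Keywords: classification tree; random forest; gene expression data; simulation experiment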

The Application of Random Forests for the Classification of Gene Expression Data
Wu Xiaoyan,Li Kang. The Application of Random Forests for the Classification of Gene Expression Data[J]. Chinese Journal of Health Statistics, 2006, 23(6): 491-494
Authors:Wu Xiaoyan  Li Kang
Affiliation: Department of Biostatistics, Harbin Medical University (150001)
Abstract: Objective: To investigate the use of random forests for the classification of gene expression data. Methods: The method was applied to real gene expression datasets, and a simulation experiment was used to examine how a large number of non-differentially expressed genes affects classification. Results: Random forests classified gene expression data with high accuracy, but performance declined as the number of genes increased. The decline was markedly slower when the differentially expressed genes were correlated, and good predictive accuracy could still be obtained. Conclusion: Random forests perform well in the classification of gene expression data.
Keywords: Classification tree; Random forests; Gene expression data; Simulated experiment
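
The kind of simulation the abstract describes could be sketched roughly as follows. This is not the authors' design: the sample size, the number of differentially expressed genes, the mean shift, the noise-gene counts, and the helper names `simulate_expression` and `accuracy_by_noise` are all assumptions made for illustration. The sketch also uses independent genes, so it does not reproduce the correlated case mentioned in the results. It relies on scikit-learn's `RandomForestClassifier` and `cross_val_score`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def simulate_expression(n_samples, n_informative, n_noise, shift, rng):
    """Simulate a two-class expression matrix (samples x genes).

    All genes are independent standard normal; only the first
    n_informative genes are shifted in class 1 (an assumed design).
    """
    y = np.repeat([0, 1], n_samples // 2)
    X = rng.standard_normal((y.size, n_informative + n_noise))
    # Differentially expressed genes: add a mean shift for class 1 only.
    X[y == 1, :n_informative] += shift
    return X, y


def accuracy_by_noise(noise_counts, n_samples=100, n_informative=10,
                      shift=1.5, seed=0):
    """Return {n_noise: mean 5-fold CV accuracy} for a random forest."""
    rng = np.random.default_rng(seed)
    results = {}
    for n_noise in noise_counts:
        X, y = simulate_expression(n_samples, n_informative, n_noise,
                                   shift, rng)
        clf = RandomForestClassifier(n_estimators=100, random_state=seed)
        results[n_noise] = cross_val_score(clf, X, y, cv=5).mean()
    return results


if __name__ == "__main__":
    # Hypothetical noise-gene counts; the paper's values are not given here.
    for n_noise, acc in accuracy_by_noise([0, 200, 2000]).items():
        print(f"{n_noise:5d} noise genes: CV accuracy = {acc:.3f}")
```

Comparing the cross-validated accuracy across the noise-gene counts gives a rough picture of how non-differentially expressed genes affect classification under these assumed settings.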
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.