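Application of the XGBoost Algorithm to the Analysis of Binary Class-imbalanced High-dimensional Data
Cite this article: Lu Yaxin, Huang Yue, Li Kang. Application of the XGBoost Algorithm to the Analysis of Binary Class-imbalanced High-dimensional Data[J]. Chinese Journal of Health Statistics, 2021, 0(1): 21-24
Authors: Lu Yaxin  Huang Yue  Li Kang
Affiliation: Department of Medical Statistics, Harbin Medical University
Funding: National Natural Science Foundation of China (81973149, 81773551).
Abstract: Objective To investigate the classification performance of the XGBoost algorithm on binary, high-dimensional, class-imbalanced data. Methods Five methods (XGBoost, random forest, support vector machine, random under-sampling, and stochastic gradient boosting tree) were compared through simulation experiments and an analysis of real metabolomics data. Results The simulation experiments showed that, when the class imbalance was pronounced, XGBoost was superior or non-inferior to the other four algorithms under all experimental conditions; when the classes approached balance...

Keywords: Extreme gradient boosting algorithm  High-dimensional omics data  Classification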

Application of XGBoost to the Analysis of Class-imbalanced High-dimensional Omics Data
Lu Yaxin, Huang Yue, Li Kang. Application of XGBoost to the Analysis of Class-imbalanced High-dimensional Omics Data[J]. Chinese Journal of Health Statistics, 2021, 0(1): 21-24
Authors: Lu Yaxin  Huang Yue  Li Kang
Affiliation: Department of Medical Statistics, Harbin Medical University, Harbin 150081
Abstract: Objective To explore the classification performance of the XGBoost model on class-imbalanced, high-dimensional omics data. Methods XGBoost was compared with random forest (RF), support vector machine (SVM), random under-sampling, and stochastic gradient boosting tree (SGBT) through simulation experiments and an analysis of real metabolomics data. Results The simulation experiments showed that XGBoost was superior to the other four algorithms under various experimental conditions when the data were markedly class-imbalanced, that it also classified well when the classes were nearly balanced, and that it was robust to noise variables. On the real data, XGBoost achieved the best classification performance of the five algorithms and, while maintaining that performance, ran faster than the other four. Conclusion XGBoost is suitable for discriminant analysis of class-imbalanced, high-dimensional omics data and merits further research.
Keywords: XGBoost  High-dimensional omics data  Classification
This article is indexed in CNKI, VIP, and other databases.