
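Ensemble learning based on the super learner algorithm and its application in prediction modeling of longitudinal censored data
Cite this article: Yang Yuhui, Wang Jingxian, Zhao Peng, Li Yemian, Chen Fangyao. Ensemble learning based on the super learner algorithm and its application in prediction modeling of longitudinal censored data[J]. Chinese Journal of Hospital Statistics, 2021(1).
Authors: Yang Yuhui, Wang Jingxian, Zhao Peng, Li Yemian, Chen Fangyao
Affiliation: Teaching and Research Section of Health Statistics, Department of Epidemiology and Health Statistics, School of Public Health, Xi'an Jiaotong University Health Science Center
Funding: National Natural Science Foundation of China (81703325).
Abstract: Objective: Ensemble learning is a class of algorithms that has been widely used in machine learning in recent years to improve learning accuracy. This paper introduces the application of an ensemble learning method based on the super learner algorithm to prediction modeling of longitudinal censored data, together with its implementation in R. Methods: The paper first describes the basic principle of the super learner algorithm, its use in modeling longitudinal censored data, and how to build such a model in R. It then analyzes tumor survival data from the TCGA database as a worked example to show how the method performs on real data. Results: When modeling with the super learner based ensemble method, both the choice of parameter estimation method and the definition of the algorithm parameters are fairly flexible. In the real data analysis, the super learner algorithm made full use of the available data to build the model; the prediction accuracy of the model was 0.8737 (95% CI: 0.7897 to 0.9330) and the C-index was 0.883, indicating high predictive accuracy. Conclusion: The ensemble learning method based on the super learner algorithm offers a new option for prediction modeling of longitudinal censored data.

Keywords: ensemble learning; super learner; prediction model; longitudinal censored data; R language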

Ensemble learning based on super learner algorithm and its application in prediction modeling for longitudinal censored data
Yang Yuhui, Wang Jingxian, Zhao Peng, Li Yemian, Chen Fangyao. Ensemble learning based on super learner algorithm and its application in prediction modeling for longitudinal censored data[J]. Chinese Journal of Hospital Statistics, 2021(1).
Authors: Yang Yuhui, Wang Jingxian, Zhao Peng, Li Yemian, Chen Fangyao
Institution: Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an 710061, China
Abstract: Objective: Ensemble learning is an approach that has been widely used in machine learning in recent years to improve learning accuracy. This paper introduces the application of an ensemble learning method based on the super learner algorithm to prediction modeling of longitudinal censored data and its implementation in R. Methods: The paper describes the principle of modeling longitudinal censored data with the super learner algorithm and how to implement it in the R programming language. In addition, tumor survival data from the TCGA database were analyzed to illustrate the method's performance in practice. Results: With the super learner based ensemble method, both the choice of estimation method for the model parameters and the definition of the algorithm parameters are flexible. In the real data analysis, the super learner algorithm made full use of the available data to build the prediction model. The prediction accuracy of the model was 0.8737 (95% CI: 0.7897 to 0.9330) and the C-index was 0.883, indicating good predictive performance. Conclusion: The ensemble learning approach based on the super learner algorithm provides a new option for prediction modeling of longitudinal censored data.
Keywords: ensemble learning; super learner; prediction model; longitudinal censored data; R programming language
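
The abstract reports a C-index of 0.883 but does not show how the index is computed. As a rough illustration only, the following Python sketch computes Harrell's concordance index for right-censored data. It is not the authors' R implementation, which this page does not reproduce. The function name `concordance_index` and the toy data are hypothetical, and pairs with tied observed times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if the subject was censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied observed times are skipped
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed first: concordant pair
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from TCGA or from the paper.
toy_times = [5, 8, 12, 3, 9]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.2, 0.8, 0.5]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of predicted risk against observed survival, so a C-index near 0.88 means that in most comparable pairs the subject who failed earlier was assigned the higher predicted risk.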
This article is indexed in VIP (维普) and other databases.