
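Prediction of 3-month blood glucose in patients with type 2 diabetes based on machine learning algorithms
Cite this article: 覃伟, 高敏, 沈莹, 史宇晖, 吴涛, 赵艾, 孙昕霙. Prediction of 3-month blood glucose in patients with type 2 diabetes based on machine learning algorithms[J]. 中华疾病控制杂志 (Chinese Journal of Disease Control & Prevention), 2019, 23(11): 1313-1317.
Authors: 覃伟, 高敏, 沈莹, 史宇晖, 吴涛, 赵艾, 孙昕霙
Affiliations: Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing 100191; Department of Epidemiology and Health Statistics, School of Public Health, Peking University, Beijing 100191
Funding: National Natural Science Foundation of China (71673009)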
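Abstract:  Objective  To evaluate how well the Logistic regression algorithm and the random forest algorithm predict blood glucose control in patients with type 2 diabetes after 3 months, and to explore the factors that influence blood glucose control.  Methods  Baseline survey and follow-up data were collected from patients with type 2 diabetes in Shunyi and Tongzhou Districts. Whether a patient's glycated hemoglobin exceeded 6.5% after 3 months was used as the binary outcome variable. Prediction models were built with the random forest algorithm and the Logistic algorithm, and their predictive performance was compared using the area under the receiver operating characteristic curve (area under the curve, AUC), sensitivity, and other metrics.  Results  Seven factors influenced blood glucose control: baseline fasting plasma glucose (P < 0.001), duration of disease (P < 0.001), smoking (P=0.026), sedentary time (P=0.006), body mass index (overweight P=0.002, obesity P=0.011), wristband use (P=0.028), and diabetic diet (P=0.002). The Logistic regression prediction model had an AUC of 0.738, a sensitivity of 72.9%, a specificity of 68.1%, and an accuracy of 71.2%; the random forest model had an AUC of 0.756, a sensitivity of 74.5%, a specificity of 69.5%, and an accuracy of 72.8%.  Conclusion  The random forest algorithm predicted better than the Logistic regression prediction model and can be applied to predicting blood glucose control to assist in the management of patients with diabetes.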

Keywords: type 2 diabetes mellitus; classification prediction; random forest algorithm; Logistic regression algorithm
Received: 2019-07-17

Prediction of 3-month glycemic control in type 2 diabetes mellitus based on machine learning algorithms
Affiliation: 1. Department of Social Medicine and Health Education, School of Public Health, Peking University, Beijing 100191, China; 2. Department of Epidemiology and Health Statistics, School of Public Health, Peking University, Beijing 100191, China
Abstract:  Objective  To evaluate the performance of the Logistic regression algorithm and the random forest algorithm in predicting blood glucose control in patients with type 2 diabetes mellitus (T2DM) after 3 months, and to explore the factors influencing blood glucose control.  Methods  Data were extracted from the baseline survey and follow-up records of patients with T2DM in Shunyi and Tongzhou Districts. Whether a patient's 3-month glycosylated hemoglobin exceeded 6.5% was used as the binary outcome variable. The random forest algorithm and the Logistic algorithm were used to build prediction models. Predictive performance was evaluated with the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy.  Results  Factors affecting glycemic control included baseline fasting plasma glucose (P < 0.001), duration of disease (P < 0.001), smoking (P=0.026), sedentary time (P=0.006), body mass index (overweight P=0.002, obesity P=0.011), wristband use (P=0.028), and diabetic diet (P=0.002). The Logistic regression prediction model had an AUC of 0.738, a sensitivity of 72.9%, a specificity of 68.1%, and an accuracy of 71.2%. The random forest model had an AUC of 0.756, a sensitivity of 74.5%, a specificity of 69.5%, and an accuracy of 72.8%.  Conclusions  The random forest model performed better than the Logistic regression model and can be applied to predicting blood glucose control to assist in the management of diabetic patients.
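Keywords: type 2 diabetes mellitus; classification prediction; random forest algorithm; Logistic regression algorithm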
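
The abstract compares the two models on AUC, sensitivity, specificity, and accuracy. As a rough illustration of how those four metrics relate to a set of predictions, the sketch below computes them in plain Python. It is not the authors' code: the function name `evaluate`, the 0.5 decision threshold, and the eight labels and scores are all hypothetical, and label 1 is assumed to mean "glycated hemoglobin above 6.5% at 3 months".

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy, and AUC for a binary outcome.

    y_true holds 0/1 labels; scores holds predicted probabilities of class 1.
    The threshold is an assumption for illustration; the abstract does not
    state which cut-off was used.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")

    # AUC as the probability that a randomly chosen positive case scores
    # higher than a randomly chosen negative case (ties count as 0.5).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)

    return {
        "sensitivity": tp / (tp + fn),        # true positive rate
        "specificity": tn / (tn + fp),        # true negative rate
        "accuracy": (tp + tn) / len(y_true),
        "auc": wins / (len(pos) * len(neg)),
    }


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    labels = [1, 1, 1, 1, 1, 0, 0, 0]
    probabilities = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
    for name, value in evaluate(labels, probabilities).items():
        print(f"{name}: {value:.3f}")
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports it alongside the other three.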
This article is indexed in 万方数据 (Wanfang Data) and other databases.