
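Influential factors of hypertension based on random forest and logistic regression
Cite this article: WANG Fu-cheng, QI Ping, JIANG Jian-jun, HUANG Yong, YANG Xiao-ling. Influential factors of hypertension based on random forest and logistic regression[J]. Modern Preventive Medicine, 2020, 0(13): 2310-2313.
Authors: WANG Fu-cheng  QI Ping  JIANG Jian-jun  HUANG Yong  YANG Xiao-ling
Institutions: 1. School of Management, Hefei University of Technology, Hefei, Anhui 230009; 2. Institute of Service Computing, Tongling University, Tongling, Anhui 244000; 3. Tongling Center for Disease Control and Prevention, Tongling, Anhui 244000; 4. Tianqiao Community Health Service Station, Tongling, Anhui 244000
Abstract: Objective To identify and analyze the factors influencing hypertension, and the interaction effects among those factors, in physical examination data from residents of the Tianqiao Community in Tongling, where many factors are recorded but the number of valid samples is limited, so as to provide a reference for hypertension intervention. Methods A total of 801 physical examination records collected in the community in 2017 were selected as the study sample. A random forest was used to screen features with high importance scores, which were then entered into a logistic complete quadratic regression model; stepwise regression was used to analyze the influencing factors and the interaction effects among them. Results The random forest model achieved an accuracy of 83.67%. The 10 most important features were age, diabetes, exercise frequency, body mass index, total cholesterol, smoking status, drinking status, central obesity, triglycerides, and blood urea ammonia. The logistic complete quadratic regression model achieved an accuracy of 84.17% and yielded 2 main effects and 8 quadratic interaction effects. The statistically significant (P<0.05) main effects were age and exercise frequency; the features involved in statistically significant (P<0.05) quadratic interaction effects were age, diabetes, body mass index, total cholesterol, smoking status, drinking status, triglycerides, and blood urea ammonia. Conclusion Combining random forest with a logistic complete quadratic regression model addresses the difficulty that classical methods have in extracting interaction effects from data with many factors and limited samples. It identifies the factors influencing hypertension and the interaction effects among them, and offers useful guidance for hypertension intervention.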

Keywords: Hypertension  Random forest  Logistic regression  Influencing factors  Interaction effects
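
For readers unfamiliar with the term, a complete quadratic logistic model is usually written with linear, pairwise-interaction, and squared terms. The form below is a generic textbook sketch, not the fitted model from the paper: the symbols p, x_i, beta, and k are notation assumed here, with k = 10 corresponding to the ten screened features, and the abstract does not state whether squared terms were retained.

```latex
\[
\operatorname{logit}(p)
  = \ln\frac{p}{1-p}
  = \beta_0
  + \sum_{i=1}^{k} \beta_i x_i
  + \sum_{1 \le i < j \le k} \beta_{ij}\, x_i x_j
  + \sum_{i=1}^{k} \beta_{ii}\, x_i^{2}
\]
```

Under this form, k = 10 gives 10 linear, 45 interaction, and 10 squared candidate terms; stepwise regression then keeps only a subset of them.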

Influential factors of hypertension based on random forest and logistic regression
WANG Fu-cheng, QI Ping, JIANG Jian-jun, HUANG Yong, YANG Xiao-ling. Influential factors of hypertension based on random forest and logistic regression[J]. Modern Preventive Medicine, 2020, 0(13): 2310-2313.
Authors:WANG Fu-cheng  QI Ping  JIANG Jian-jun  HUANG Yong  YANG Xiao-ling
Institution: School of Management, Hefei University of Technology, Hefei, Anhui 230009, China
Abstract: Objective To analyze the factors influencing hypertension and the interaction effects among them, using physical examination data with many factors and a limited number of valid samples from residents of the Tianqiao Community in Tongling, so as to provide a reference for hypertension intervention. Methods A total of 801 physical examination records collected in the community in 2017 were selected as the study sample. The random forest method was used to select features with high importance scores, and a logistic complete quadratic regression model was used to analyze the influencing factors and the interaction effects among them. Results The random forest model had an accuracy of 83.67%. The top 10 features by importance were age, diabetes, exercise frequency, body mass index, total cholesterol, smoking status, drinking status, central obesity, triglycerides, and blood urea ammonia. The logistic complete quadratic regression model had an accuracy of 84.17% and yielded 2 main effects and 8 quadratic interaction effects. The statistically significant (P < 0.05) main effects were age and exercise frequency, and the features involved in statistically significant (P < 0.05) quadratic interaction effects were age, diabetes, body mass index, total cholesterol, smoking status, drinking status, triglycerides, and blood urea ammonia. Conclusion Combining random forest with a logistic complete quadratic regression model makes it possible to extract interaction effects from data with many factors and limited samples, which is difficult for classical methods. It identifies the factors influencing hypertension and their interaction effects, and provides useful guidance for hypertension intervention.
Keywords:Hypertension  Random forest  Logistic regression  Influencing factors  Interaction effects
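
The second stage described in the abstract, passing the screened features into a complete quadratic model, can be illustrated by enumerating the candidate terms. The following is a minimal sketch under stated assumptions, not the authors' code: the identifiers `FEATURES`, `quadratic_terms`, and `expand_row` are hypothetical, the feature names are simply the ten listed in the abstract, and including squared terms is an assumption controlled by a flag.

```python
from itertools import combinations

# The ten features the abstract lists as most important (names are illustrative).
FEATURES = [
    "age", "diabetes", "exercise_frequency", "body_mass_index",
    "total_cholesterol", "smoking_status", "drinking_status",
    "central_obesity", "triglycerides", "blood_urea_ammonia",
]


def quadratic_terms(names, include_squares=True):
    """List candidate term labels: main effects, pairwise products, squares."""
    main = list(names)
    interactions = [f"{a}*{b}" for a, b in combinations(names, 2)]
    squares = [f"{a}^2" for a in names] if include_squares else []
    return main + interactions + squares


def expand_row(row, names, include_squares=True):
    """Map one record (feature -> number) to the value of every candidate term."""
    values = {n: float(row[n]) for n in names}
    for a, b in combinations(names, 2):
        values[f"{a}*{b}"] = values[a] * values[b]
    if include_squares:
        for a in names:
            values[f"{a}^2"] = values[a] ** 2
    return values


if __name__ == "__main__":
    terms = quadratic_terms(FEATURES)
    # 10 main effects + 45 pairwise interactions + 10 squares
    print(len(terms))  # prints 65
```

With 801 records and 65 candidate terms, fitting every term at once would leave few observations per parameter, which is consistent with the abstract's use of stepwise regression to retain only a small set of effects (2 main effects and 8 interactions).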
This article is indexed in CNKI and other databases.