Prediction of health literacy of the elderly based on SMOTE algorithm and machine learning


Cite this article:	WANG Ke, ZHAO Hua-shuo, ZHANG Hong, HUANG Shui-ping, JIN Ying-liang, ZENG Ping. Prediction of health literacy of the elderly based on SMOTE algorithm and machine learning[J]. Chinese Journal of School Doctor, 2019, 33(9): 641.

Authors:	WANG Ke, ZHAO Hua-shuo, ZHANG Hong, HUANG Shui-ping, JIN Ying-liang, ZENG Ping

Institutions:	1. Department of Epidemiology and Health Statistics, School of Public Health, Xuzhou Medical University, Xuzhou 221004, Jiangsu, China; 2. Department of Critical Care Medicine, Xuzhou Children's Hospital

Funding:	Philosophy and Social Science Research Project of Jiangsu Universities (2015SJD455)

Received:	2019-07-18

Abstract:	Objective To explore the application of the synthetic minority oversampling technique (SMOTE) combined with machine learning models in predicting whether elderly people have health literacy. Methods Univariate screening was used to select the variables associated with having health literacy. With the selected variables as inputs and health literacy status (yes or no) as the outcome, logistic regression, random forest, and support vector machine (SVM) models were built on the data sets before and after SMOTE processing, and model performance was evaluated with receiver operating characteristic (ROC) curves. Results Before SMOTE processing, the test-set accuracies of logistic regression, random forest, and SVM were 0.833, 0.600, and 0.636, and the areas under the ROC curve (AUC) were 0.723, 0.815, and 0.728, respectively. After SMOTE processing, the test-set accuracies were 0.936, 0.908, and 0.890, and the AUCs were 0.896, 0.944, and 0.897, respectively. Conclusion The random forest model has high application value in the prognostic evaluation of whether elderly people have health literacy.

Keywords:	elderly; health literacy; model; statistics
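
The abstract names two building blocks, SMOTE oversampling and AUC-based evaluation, without giving implementation details. The following is a minimal standard-library sketch of both ideas, not the authors' code: the function names `smote` and `auc`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for illustration. SMOTE is shown in its basic form (interpolating between a minority sample and one of its nearest minority neighbours), and AUC is computed as the probability that a positive case scores higher than a negative one.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # relative position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): 3 minority and 9 majority points.
minority = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.3]]
majority = [[4.0 + 0.1 * i, 4.0 - 0.1 * i] for i in range(9)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority), len(majority))  # 9 9

# Three of the four positive/negative pairs are ranked correctly.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be preferred over a hand-written one. The abstract does not state which software the authors used.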
