Comparison of Prediction Models for In-hospital Stroke Recurrence in Patients with Ischemic Stroke Based on Logistic Regression and XGBoost Methods

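Cite this article:	GU Hong-Qiu, WANG Chun-Juan, LI Zi-Xiao, WANG Yi-Long, WANG Yong-Jun, JIANG Yong. Comparison of Prediction Models for In-hospital Stroke Recurrence in Patients with Ischemic Stroke Based on Logistic Regression and XGBoost Methods[J]. Chinese Journal of Stroke (中国卒中杂志), 2020, 15(6): 587-594.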

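Authors:	GU Hong-Qiu (谷鸿秋), WANG Chun-Juan (王春娟), LI Zi-Xiao (李子孝), WANG Yi-Long (王伊龙), WANG Yong-Jun (王拥军), JIANG Yong (姜勇)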

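Affiliations:	1. Beijing Tian Tan Hospital, Capital Medical University; China National Clinical Research Center for Neurological Diseases, Beijing 100070, China. 2. National Center for Healthcare Quality Management in Neurological Diseases. 3. Beijing Advanced Innovation Center for Big Data-Based Precision Medicine (Beihang University & Capital Medical University)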

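Funding:	National Key R&D Program of China during the 13th Five-Year Plan (2016YFC0901001, 2017YFC1310901, 2016YFC0901002, 2017YFC1307905, 2015BAI12B00); Chinese Academy of Medical Sciences Research Unit of Artificial Intelligence in Cerebrovascular Disease (2019RU018); Beijing Municipal Science and Technology Commission, Research on Artificial Intelligence-Based Clinical Diagnosis and Treatment Decision-Making for Cerebrovascular Disease (Z201100005620010); Beijing Hundred-Thousand-Ten Thousand Talents Project (2018A13); Beijing Young Top-Notch Talent Project (2018000021223ZK03)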

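Abstract:	Objective To develop prediction models for in-hospital recurrence of ischemic stroke using logistic regression and XGBoost, and to make a preliminary comparison between them. Methods Using data on patients with ischemic stroke who were discharged according to medical advice in the China National Stroke Registry Ⅱ (CNSR Ⅱ) database, prediction models for in-hospital recurrence were built separately with logistic regression and with XGBoost. Candidate predictors included demographic characteristics, stroke severity, medical history, medication history, and clinical measurements. Model performance was assessed with the area under the ROC curve (AUC), calibration intercept, calibration slope, and Brier score. All statistical analyses were performed in R (version 3.6.2). Results A total of 17 227 eligible patients were included; the mean age was 64.72±11.84 years, 6317 (36.7%) were female, 14 482 (84.1%) had a pre-onset mRS score of 0 or 1, the NIHSS score at admission was 4 (2-6), and 444 (2.6%) had an in-hospital stroke recurrence. The three strongest predictors were pre-onset mRS score, atrial fibrillation, and history of stroke in the logistic regression model, and pre-onset mRS score, atrial fibrillation, and total cholesterol in the XGBoost model. The AUCs of the two models did not differ significantly (0.63, 95%CI 0.58-0.68 vs 0.64, 95%CI 0.59-0.68, P=0.9229). The calibration intercept, calibration slope, and Brier score were -0.81, 0.76, and 0.03 for the logistic regression model and -1.37, 1.20, and 0.38 for the XGBoost model; the logistic regression model was better calibrated. Conclusions Among the in-hospital recurrence prediction models for ischemic stroke built from CNSR Ⅱ data, the XGBoost-based model did not differ significantly in discrimination from the logistic regression-based model, but its calibration was slightly lower.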
Keywords:	Ischemic stroke; In-hospital recurrence; Prediction model
Received:	2020-03-01
Comparison of Prediction Models for In-hospital Stroke Recurrence in Patients with Ischemic Stroke Based on Logistic Regression and XGBoost Methods

GU Hong-Qiu, WANG Chun-Juan, LI Zi-Xiao, WANG Yi-Long, WANG Yong-Jun, JIANG Yong. Comparison of Prediction Models for In-hospital Stroke Recurrence in Patients with Ischemic Stroke Based on Logistic Regression and XGBoost Methods[J]. Chinese Journal of Stroke, 2020, 15(6): 587-594.

Authors:	GU Hong-Qiu, WANG Chun-Juan, LI Zi-Xiao, WANG Yi-Long, WANG Yong-Jun, JIANG Yong

Institution:	Beijing Tian Tan Hospital, Capital Medical University, China National Clinical Research Center for Neurological Diseases, Beijing 100070, China; National Center for Healthcare Quality Management in Neurological Diseases, Beijing 100070, China; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine (Beihang University & Capital Medical University), Beijing 100091, China

Abstract:	Objective To compare prediction models for in-hospital stroke recurrence in patients with ischemic stroke based on logistic regression and XGBoost methods. Methods Data on patients with ischemic stroke who were discharged according to medical advice were drawn from the China National Stroke Registry Ⅱ (CNSR Ⅱ) database and analyzed retrospectively. Logistic regression and XGBoost were each used to develop a model for predicting in-hospital stroke recurrence. Candidate predictors included demographic characteristics, stroke severity, medical history, medication history, and clinical measurements. The performance measures of the predictive models included the area under the receiver operating characteristic curve (AUC), calibration intercept, calibration slope, and Brier score. All statistical analyses were performed using R (version 3.6.2). Results A total of 17 227 eligible patients were included in this analysis. The mean age was 64.72±11.84 years, and 6317 (36.7%) patients were female. A total of 14 482 (84.1%) patients had an mRS score of 0 or 1 before symptom onset, and the NIHSS score at admission was 4 (2-6). A total of 444 (2.6%) patients had a recurrent stroke during hospitalization. The three strongest predictors were mRS score, atrial fibrillation, and stroke history in the logistic regression model, and mRS score, atrial fibrillation, and total cholesterol in the XGBoost model. No significant difference in AUC was found between the logistic regression model and the XGBoost model (0.63, 95%CI 0.58-0.68 vs 0.64, 95%CI 0.59-0.68, P=0.9229). The calibration intercept, calibration slope, and Brier score were -0.81, 0.76, and 0.03, respectively, in the logistic regression model, and -1.37, 1.20, and 0.38 in the XGBoost model. The logistic regression model had better calibration than the XGBoost model.
Conclusions No significant difference in discrimination was found between the logistic regression-based and XGBoost-based prediction models for in-hospital stroke recurrence built on CNSR Ⅱ data, while the logistic regression-based model had better calibration.
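
The four performance measures named in the abstract (AUC, calibration intercept, calibration slope, Brier score) can be illustrated with a minimal Python sketch. This is not the authors' code (the study used R 3.6.2). The function names, the toy data, and the choice to estimate the calibration intercept and slope jointly from a single logistic recalibration fit are all assumptions made for illustration; the abstract does not say how the intercept was estimated.

```python
import math

def auc(y, p):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier(y, p):
    """Brier score: mean squared difference between predicted risk and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def calibration(y, p, iters=50, tol=1e-10):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson.

    Returns (a, b): intercept and slope of the recalibration model.
    Assumes every p is strictly between 0 and 1 and that outcomes are
    not perfectly separated by p (otherwise the fit does not converge).
    """
    x = [math.log(pi / (1 - pi)) for pi in p]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Hypothetical toy data: observed outcomes and predicted risks.
y = [0, 0, 1, 0, 1, 0, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]

intercept, slope = calibration(y, p)
print("AUC:", auc(y, p))
print("Brier score:", brier(y, p))
print("Calibration intercept:", intercept)
print("Calibration slope:", slope)
```

Under this formulation, an intercept near 0 and a slope near 1 indicate good calibration, and a lower Brier score indicates better overall accuracy of the predicted risks.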

Keywords:	Ischemic stroke; In-hospital stroke recurrence; Prediction model
This article is indexed in CNKI, VIP, and other databases.