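Development of a prediction model for mortality in liver cirrhosis patients based on H2O automated machine learning
Cite this article: WANG Yu, XU Zhonghua, YU Weixin, ZHANG Hui, YU Qianqian, DUAN Wenbin. Development of a prediction model for mortality in liver cirrhosis patients based on H2O automated machine learning[J]. Chinese Journal of General Surgery, 2023, 32(7): 1071-1078
Authors: WANG Yu  XU Zhonghua  YU Weixin  ZHANG Hui  YU Qianqian  DUAN Wenbin
Affiliations: 1. Department of Hepatobiliary Surgery, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu 213200; 2. Department of Orthopaedics, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu 213200; 3. Department of Oncology, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu 213200; 4. Department of Pediatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008; 5. Department of Hepatobiliary Surgery, Hunan Provincial People's Hospital/the First Affiliated Hospital of Hunan Normal University, Changsha, Hunan 410005
Funding: Supported by the 13th Batch of the Science and Technology Program (Applied Basic Research) of the Changzhou Science and Technology Bureau, Jiangsu Province (CJ20210005, CJ20210006), and the Medical Education Collaborative Innovation Fund of Jiangsu University (JDY2022018).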
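Abstract: Background and Aims: Patients with advanced liver cirrhosis often develop a series of complications that increase the risk of death, so identifying those at high risk of death as early as possible is clinically important. This study used the automated machine learning (AutoML) framework of the H2O platform to build a model predicting death within 30 d of admission in patients with liver cirrhosis, with the aim of offering a new approach to improving prognosis and clinical management. Methods: General information and laboratory test data at admission were collected from inpatients with liver cirrhosis at Jintan Affiliated Hospital of Jiangsu University and Hunan Provincial People's Hospital. The H2O AutoML framework was used to build models of the death outcome with multiple machine learning algorithms. Receiver operating characteristic (ROC) curves were plotted and confusion matrices constructed to evaluate model performance, and the important variables were visualized. Results: The best model was a gradient boosting machine (GBM), with a Gini value of 0.994, R² of 0.775, and LogLoss of 0.120. Important variables in the model included prothrombin time, creatinine, white blood cells, and age. SHAP feature plots and partial dependence plots showed how the important variables related to the model's overall predictions, and local interpretable model-agnostic explanations (LIME) visualized the role of each variable in individual predictions. In the validation set, the best GBM model had a specificity of 0.950, a sensitivity of 0.676, and an area under the ROC curve (AUC) of 0.793, outperforming models based on four other algorithms (extreme gradient boosting [XGBoost], logistic regression, random forest, and deep learning) as well as the model for end-stage liver disease (MELD) and albumin-bilirubin (ALBI) scores. Conclusions: The machine learning model established here provides an effective tool for screening patients with liver cirrhosis for short-term risk of death, but its reliability still needs further assessment through multicenter external validation.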
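The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not use H2O; the labels and probabilities below are made-up toy values, the 0.5 threshold is an assumption, and the Gini coefficient is taken in its common form, Gini = 2 × AUC − 1.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn); label 1 = died, 0 = survived."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn

def auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative.
    Ties count as half a win. O(n^2), fine for small illustrations."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini(auc_value):
    """Gini coefficient under the common definition 2 * AUC - 1."""
    return 2.0 * auc_value - 1.0

def log_loss(y_true, probs, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

# Toy data (hypothetical, NOT from the study): 3 deaths, 5 survivors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 cutoff

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)  # share of deaths correctly flagged
specificity = tn / (tn + fp)  # share of survivors correctly cleared
toy_auc = auc(y_true, probs)

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
print(f"AUC={toy_auc:.3f} Gini={gini(toy_auc):.3f} LogLoss={log_loss(y_true, probs):.3f}")
```

Under this definition a Gini of 0.994 corresponds to an AUC of 0.997, so the reported Gini and the validation-set AUC of 0.793 presumably describe different data splits; the abstract does not say which.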

Keywords: Liver Cirrhosis  Machine Learning  Models, Statistical  Confusion Matrix  Data Visualization
Received: 2022-03-03
Revised: 2023-01-10

Development of a prediction model for mortality in liver cirrhosis patients based on H2O automated machine learning
WANG Yu, XU Zhonghua, YU Weixin, ZHANG Hui, YU Qianqian, DUAN Wenbin. Development of a prediction model for mortality in liver cirrhosis patients based on H2O automated machine learning[J]. Chinese Journal of General Surgery, 2023, 32(7): 1071-1078
Authors: WANG Yu  XU Zhonghua  YU Weixin  ZHANG Hui  YU Qianqian  DUAN Wenbin
Affiliation: 1. Department of Hepatobiliary Surgery, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu 213200, China; 2. Department of Orthopaedics, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu 213200, China; 3. Department of Oncology, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu 213200, China; 4. Department of Pediatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; 5. Department of Hepatobiliary Surgery, Hunan Provincial People's Hospital/the First Affiliated Hospital of Hunan Normal University, Changsha, Hunan 410005, China
Abstract: Background and Aims: Patients with advanced liver cirrhosis often experience a series of complications, leading to an increased risk of death. Early identification of cirrhosis patients at high risk of death is therefore of clinical importance. In this study, we used the automated machine learning (AutoML) framework of the H2O platform to develop a predictive model for 30-d mortality after admission in liver cirrhosis patients, aiming to provide a new method for improving patient prognosis and the clinical management of liver cirrhosis. Methods: General information and laboratory examination data at admission were collected from hospitalized liver cirrhosis patients at Jintan Affiliated Hospital of Jiangsu University and Hunan Provincial People's Hospital. Multiple machine learning models for the mortality outcome were established using the H2O AutoML framework. Receiver operating characteristic (ROC) curves were plotted and confusion matrices were used to evaluate model performance, and important variables were visualized. Results: The best model, a gradient boosting machine (GBM), had a Gini value of 0.994, R² of 0.775, and LogLoss of 0.120. Important variables in the model included prothrombin time, creatinine, white blood cells, and age. The SHAP feature plot and partial dependence plot showed the relationship between important variables and the overall predictions of the model. Local interpretable model-agnostic explanations (LIME) visualization showed the role of each variable in individual predictions. In the validation set, the best GBM model had a specificity of 0.950, sensitivity of 0.676, and area under the ROC curve (AUC) of 0.793, outperforming models based on four other algorithms (XGBoost, logistic regression, random forest, and deep learning) as well as the MELD and ALBI scores. Conclusions: The established machine learning model provides an effective tool for screening the risk of short-term death in patients with liver cirrhosis. However, its reliability still needs further evaluation through multicenter external validation.
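For orientation, the two comparator scores named in the abstract can be sketched in Python. The abstract does not state which MELD variant or which units the authors used, so the sketch below assumes the classic MELD formula (values below 1 raised to 1, creatinine capped at 4.0 mg/dL) and the standard ALBI formula with bilirubin in µmol/L and albumin in g/L. The function names and the example patient values are hypothetical.

```python
import math

def meld(bilirubin_mg_dl, inr, creatinine_mg_dl):
    """Classic MELD score (assumed variant, unrounded).
    Inputs below 1.0 are raised to 1.0; creatinine is capped at 4.0 mg/dL."""
    bili = max(bilirubin_mg_dl, 1.0)
    inr_adj = max(inr, 1.0)
    crea = min(max(creatinine_mg_dl, 1.0), 4.0)
    return (3.78 * math.log(bili)
            + 11.2 * math.log(inr_adj)
            + 9.57 * math.log(crea)
            + 6.43)

def albi(bilirubin_umol_l, albumin_g_l):
    """ALBI score: 0.66 * log10(bilirubin, umol/L) - 0.085 * albumin (g/L)."""
    return 0.66 * math.log10(bilirubin_umol_l) - 0.085 * albumin_g_l

def albi_grade(score):
    """ALBI grade: 1 if score <= -2.60, 2 if <= -1.39, otherwise 3."""
    if score <= -2.60:
        return 1
    if score <= -1.39:
        return 2
    return 3

# Hypothetical patient values for illustration only (not study data).
example_meld = meld(bilirubin_mg_dl=2.0, inr=1.5, creatinine_mg_dl=1.0)
example_albi = albi(bilirubin_umol_l=34.2, albumin_g_l=30.0)

print(f"MELD = {example_meld:.1f}")
print(f"ALBI = {example_albi:.2f} (grade {albi_grade(example_albi)})")
```

Both scores rest on two or three laboratory values, which is one reason a model drawing on more admission variables might discriminate better; the abstract itself reports only the comparison, not the comparators' AUC values.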
Keywords: Liver Cirrhosis  Machine Learning  Models, Statistical  Confusion Matrix  Data Visualization