
Application of Boosting Algorithms Combined with SMOTE in Predicting HIV Infection among Young Men Who Have Sex with Men
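Cite this article: Wang Xiaomeng, Song Desheng, Zhang Tiantian, Chang Qinxue, Wang Chun, Wang Keyun, Liu Yuanyuan, Li Changping, Cui Zhuang, Ma Jun. Application of Boosting Algorithms Combined with SMOTE in Predicting HIV Infection among Young Men Who Have Sex with Men [J]. Chinese Journal of Health Statistics (中国卫生统计), 2022(1).
Authors: Wang Xiaomeng (王肖萌), Song Desheng (宋德胜), Zhang Tiantian (张甜甜), Chang Qinxue (常琴雪), Wang Chun (王淳), Wang Keyun (王柯云), Liu Yuanyuan (刘媛媛), Li Changping (李长平), Cui Zhuang (崔壮), Ma Jun (马骏)
Affiliation: Department of Epidemiology and Health Statistics, School of Public Health, Tianjin Medical University
Funding: Humanities and Social Sciences Research Planning Fund of the Ministry of Education (20YJAZH021); Humanities and Social Sciences Research Youth Fund of the Ministry of Education (11YJCZH022).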
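Abstract: Objective: To evaluate the performance of Boosting algorithms combined with SMOTE in predicting HIV infection status among young men who have sex with men (YMSM). Methods: A total of 1179 YMSM in Tianjin were recruited online and on-site in 2018-2019. Prediction models were built with XGBoost, LightGBM, CatBoost and logistic regression, each combined with SMOTE, and their classification performance was evaluated by AUC, F1, Accuracy, Brier score and other metrics. Results: After SMOTE was used to synthesize data, the AUC of logistic regression, CatBoost, LightGBM and XGBoost increased by 23.4%, 24.0%, 25.4% and 26.8%, respectively, and the Boosting algorithms classified better than the logistic model. Conclusion: Boosting algorithms combined with SMOTE offer a new approach to classification and prediction on class-imbalanced data.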

Keywords: Boosting algorithms; SMOTE; young men who have sex with men; HIV

Application of Boosting Algorithms with SMOTE in Predicting HIV Infection among Young Men Who Have Sex with Men
Institution: Department of Health Statistics, Public Health College, Tianjin Medical University, Tianjin 300070
Abstract: Objective: To evaluate the performance of Boosting algorithms combined with SMOTE in predicting HIV infection among young men who have sex with men (YMSM). Methods: A total of 1179 YMSM in Tianjin were recruited online or on-site from 2018 to 2019. XGBoost, LightGBM, CatBoost and logistic regression, each combined with SMOTE, were used to build prediction models. The classification performance of the models was evaluated by AUC, F1, Accuracy, Brier score and other metrics. Results: The AUC of logistic regression, CatBoost, LightGBM and XGBoost increased by 23.4%, 24.0%, 25.4% and 26.8%, respectively, after SMOTE-based synthetic data were used, and the Boosting algorithms outperformed the logistic model. Conclusion: Boosting algorithms combined with SMOTE provide a new approach to prediction and classification on imbalanced data.
Keywords:Boosting algorithms  SMOTE  YMSM  HIV
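
The abstract does not describe the implementation, so the following is only a minimal pure-Python sketch of two ingredients it names: SMOTE's interpolation step for oversampling the minority class, and the Brier score. The function names `smote` and `brier_score`, the parameters `n_synthetic`, `k` and `seed`, and the toy data are all assumptions made for illustration; they are not taken from the study, which presumably relied on library implementations.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and one of their k nearest minority-class neighbours (Euclidean).
    Hypothetical helper for illustration, not the study's code."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy minority class (made-up numbers, two features per sample)
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=3)
print(len(new_points))
print(brier_score([1, 0, 1], [0.9, 0.2, 0.6]))
```

Each synthetic point lies on a segment between two real minority samples, so oversampling adds no values outside the observed minority range. In a workflow like the one the abstract outlines, oversampling would normally be applied to the training split only, so that AUC, F1 and Brier score are computed on untouched data; whether the study did this is not stated.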
This article is indexed in VIP (维普) and other databases.