Electronic phenotyping of health outcomes of interest using a linked claims-electronic health record database: Findings from a machine learning pilot project
Authors: Teresa B Gibson, Michael D Nguyen, Timothy Burrell, Frank Yoon, Jenna Wong, Sai Dharmarajan, Rita Ouellet-Hellstrom, Wei Hua, Yong Ma, Elande Baro, Sarah Bloemers, Cory Pack, Adee Kennedy, Sengwee Toh, Robert Ball
Abstract:

Objective: Claims-based algorithms are used in the Food and Drug Administration Sentinel Active Risk Identification and Analysis System to identify occurrences of health outcomes of interest (HOIs) for medical product safety assessment. This project aimed to apply machine learning classification techniques to demonstrate the feasibility of developing a claims-based algorithm to predict an HOI in structured electronic health record (EHR) data.

Materials and Methods: We used the 2015-2019 IBM MarketScan Explorys Claims-EMR Data Set, linking administrative claims and EHR data at the patient level. We focused on a single HOI, rhabdomyolysis, defined by EHR laboratory test results. Using claims-based predictors, we applied machine learning techniques to predict the HOI: logistic regression, LASSO (least absolute shrinkage and selection operator), random forests, support vector machines, artificial neural nets, and an ensemble method (Super Learner).

Results: The study cohort included 32 956 patients and 39 499 encounters. Model performance (positive predictive value [PPV], sensitivity, specificity, area under the receiver-operating characteristic curve) varied considerably across techniques. The area under the receiver-operating characteristic curve exceeded 0.80 in most model variations.

Discussion: For the main Food and Drug Administration use case of assessing risk of rhabdomyolysis after drug use, a model with a high PPV is typically preferred. The Super Learner ensemble model without adjustment for class imbalance achieved a PPV of 75.6%, substantially better than a previously used human expert-developed model (PPV = 44.0%).

Conclusions: It is feasible to use machine learning methods to predict an EHR-derived HOI with claims-based predictors. Modeling strategies can be adapted for intended uses, including surveillance, identification of cases for chart review, and outcomes research.
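The performance measures named in the Results (PPV, sensitivity, specificity) all derive from a confusion matrix of predicted versus EHR-defined outcomes. The following is a minimal sketch of those definitions only, not the authors' pipeline: the function names and the toy labels are hypothetical, and the numbers are not study data.

```python
# Illustrative only: standard definitions of PPV, sensitivity, and specificity.
# Labels are assumed to be binary: 1 = HOI present, 0 = HOI absent.

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return PPV, sensitivity, and specificity; NaN when a denominator is 0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    nan = float("nan")
    return {
        "ppv": tp / (tp + fp) if (tp + fp) else nan,          # TP / predicted positives
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # TP / actual positives
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # TN / actual negatives
    }


# Hypothetical toy data (not from the study): 3 true cases among 10 encounters.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

A high PPV means that most flagged encounters are true cases, which is why the Discussion treats it as the preferred criterion for the drug safety use case.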
Keywords: supervised machine learning; administrative claims; healthcare; electronic health records; rhabdomyolysis; electronic phenotyping