Improving condition severity classification with an efficient active learning based framework
Institution:1. Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel;2. Malware Lab, Cyber Security Research Center, Ben-Gurion University of the Negev, Beer-Sheva, Israel;3. Department of Biomedical Informatics, Columbia University, New York, NY, USA;4. Department of Systems Biology, Columbia University, New York, NY, USA;5. Department of Medicine, Columbia University, New York, NY, USA;6. Observational Health Data Sciences and Informatics, Columbia University, New York, NY, USA
Abstract:Classification of condition severity can be useful for discriminating among sets of conditions or phenotypes, for example when prioritizing patient care or for other healthcare purposes. Electronic Health Records (EHRs) represent a rich source of labeled information that can be harnessed for severity classification. The labeling of EHRs is expensive and in many cases requires employing professionals with a high level of expertise. In this study, we demonstrate the use of Active Learning (AL) techniques to decrease expert labeling efforts. We employ three AL methods and demonstrate their ability to reduce labeling efforts while effectively discriminating condition severity. We incorporate the three AL methods into a new framework based on the original CAESAR (Classification Approach for Extracting Severity Automatically from Electronic Health Records) framework to create the Active Learning Enhancement framework (CAESAR-ALE). We applied CAESAR-ALE to a dataset containing 516 conditions of varying severity levels that were manually labeled by seven experts. Our dataset, called the "CAESAR dataset," was created from the medical records of 1.9 million patients treated at Columbia University Medical Center (CUMC). All three AL methods decreased labelers' efforts compared to the learning methods applied by the original CAESAR framework, in which the classifier was trained on the entire set of conditions. Depending on the AL strategy used in the current study, the reduction ranged from 48% to 64%, which can result in significant savings in both time and money. In terms of PPV (precision), CAESAR-ALE achieved an absolute improvement of more than 13% in the predictive capabilities of the framework when classifying conditions as severe. These results demonstrate the potential of AL methods to decrease the labeling efforts of medical experts while increasing accuracy given the same (or even a smaller) number of acquired conditions.
We also demonstrated that the methods included in the CAESAR-ALE framework (Exploitation and Combination_XA) are more robust to the use of human labelers with different levels of professional expertise.
Keywords:Active learning; Electronic Health Records; Phenotyping; Condition; Severity
Abbreviations:
CAESAR: Classification Approach for Extracting Severity Automatically from Electronic Health Records
CAESAR-ALE: Classification Approach for Extracting Severity Automatically from Electronic Health Records – Active Learning Enhancement
EHR: Electronic Health Record
AL: Active Learning
SVM: Support Vector Machines
VS: Version Space
SNOMED-CT: Systematized Nomenclature of Medicine-Clinical Terms
ICD-9: International Classification of Diseases – Version 9
SVM-Margin: Support Vector Machines-Margin Method, an existing AL method oriented towards acquiring informative conditions that lie closest to the separating hyperplane (inside the margin)
Exploitation: an AL method included in the CAESAR-ALE framework that is oriented towards acquisition of severe conditions
Combination_XA: an AL method included in the CAESAR-ALE framework that combines elements of the Exploitation method and the SVM-Margin method, so that it applies a hybrid acquisition strategy for enhanced improvement of the CAESAR method
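The SVM-Margin and Exploitation definitions above can be illustrated with a minimal sketch. It assumes a linear classifier whose signed decision score is positive for "severe"; the weights, bias, feature vectors, and condition names (`cond_a` and so on) are hypothetical, and ranking by the largest positive score is only one plausible reading of "oriented towards acquisition of severe conditions", not necessarily the paper's exact rule. Combination_XA is not sketched because the text does not say how the two strategies are combined.

```python
from typing import Dict, List, Sequence


def decision_score(weights: Sequence[float], bias: float,
                   features: Sequence[float]) -> float:
    """Signed score of a linear classifier: w . x + b (positive = severe)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def select_svm_margin(scores: Dict[str, float], k: int) -> List[str]:
    """Pick the k unlabeled conditions closest to the separating hyperplane."""
    return sorted(scores, key=lambda name: abs(scores[name]))[:k]


def select_exploitation(scores: Dict[str, float], k: int) -> List[str]:
    """Assumed rule: pick the k conditions most confidently predicted severe."""
    severe = [name for name in scores if scores[name] > 0]
    return sorted(severe, key=lambda name: scores[name], reverse=True)[:k]


# Hypothetical classifier parameters and unlabeled pool (illustration only).
weights, bias = [1.0, -0.5], -0.2
pool = {
    "cond_a": [0.3, 0.1],
    "cond_b": [2.0, 0.2],
    "cond_c": [0.1, 1.5],
    "cond_d": [1.0, 0.4],
}
scores = {name: decision_score(weights, bias, x) for name, x in pool.items()}

margin_picks = select_svm_margin(scores, k=2)
exploit_picks = select_exploitation(scores, k=2)
print(margin_picks, exploit_picks)
```

In an active learning loop, the selected conditions would be sent to an expert for labeling, added to the training set, and the classifier retrained before the next selection round.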
This article is indexed in ScienceDirect and other databases.