

Natural language processing to identify social determinants of health in Alzheimer's disease and related dementia from electronic health records
Authors: Wenbo Wu PhD, Kaes J. Holkeboer, Temidun O. Kolawole, Lorrie Carbone LMSW, Elham Mahmoudi PhD
Affiliation: 1. Departments of Population Health and Medicine, Grossman School of Medicine, New York University, New York City, New York, USA; 2. Department of Family Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA; College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, Michigan, USA; 3. Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, Maryland, USA; 4. Department of Family Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA

Abstract:

Objective

To develop a natural language processing (NLP) algorithm that identifies seven domains of social determinants of health (SDoH) for patients with Alzheimer's disease and related dementias (ADRD) from unstructured electronic health records (EHRs): housing, transportation, food, and medication insecurities; social isolation; abuse, neglect, or exploitation; and financial difficulties.

Data Sources and Study Setting

We used 1000 medical notes randomly selected from 7401 emergency department and inpatient social worker notes generated between 2015 and 2019 for 231 unique patients diagnosed with ADRD at Michigan Medicine.

Study Design

We developed a rule-based NLP algorithm to identify the seven SDoH domains noted above. We also compared the rule-based algorithm with deep learning and regularized logistic regression approaches, using accuracy, sensitivity, specificity, F1 score, and the area under the receiver operating characteristic curve (AUC). The 1000 notes were split into 700 notes for training the NLP algorithms and 300 notes for validation.
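
As a rough illustration of the kind of pipeline described in the Study Design, the sketch below tags a note with keyword rules and scores the predictions for one domain against reference labels. It is a minimal sketch only: the abstract does not give the study's actual rules, so the patterns in `RULES`, the helper names `tag_note` and `binary_metrics`, and the example labels are all hypothetical.

```python
import re

# Hypothetical keyword patterns for three of the seven domains.
# These are illustrative assumptions, not the rules used in the study.
RULES = {
    "housing": re.compile(r"\b(homeless|eviction|evicted|unstable housing)\b", re.I),
    "transportation": re.compile(r"\b(no transportation|lacks? transportation|cannot drive)\b", re.I),
    "food": re.compile(r"\b(food insecurity|food pantry|skipp(?:ed|ing) meals)\b", re.I),
}


def tag_note(text):
    """Return, for each domain, whether any of its patterns occurs in the note."""
    return {domain: bool(pattern.search(text)) for domain, pattern in RULES.items()}


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and F1 for one domain."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    note = "Patient is homeless and has been skipping meals."
    print(tag_note(note))
    # Made-up reference labels and predictions for a single domain.
    print(binary_metrics([True, False, False, True], [True, False, False, False]))
```

In a setup like the one described, rules of this kind would be refined on the training notes and the metrics reported per domain on the held-out validation notes.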

Data Collection/Extraction Methods

Social worker notes used in this study were extracted from the Michigan Medicine EHR database.

Principal Findings

On the 700 training notes, the rule-based algorithm achieved an F1 score of at least 0.94 and an AUC of at least 0.95 for all SDoH categories. On the 300 validation notes, F1 and AUC were at least 0.80 and 0.97, respectively, for all SDoH categories except housing and medication insecurities. The deep learning and regularized logistic regression algorithms had unsatisfactory performance.

Conclusions

The rule-based algorithm can accurately extract SDoH information in five of the seven domains; the exceptions are housing and medication insecurities. Findings from the algorithm can be used by clinicians and social workers to proactively address social needs of patients with ADRD and other vulnerable patient populations.
Keywords: Alzheimer's disease and related dementia; electronic health records; machine learning; natural language processing; social determinants of health