Finding falls in ambulatory care clinical documents using statistical text mining
Authors: James A McCart, Donald J Berndt, Jay Jarman, Dezon K Finch, Stephen L Luther
Institution: 1. Consortium for Healthcare Informatics Research (CHIR) and the HSR&D/RR&D Center of Excellence: Maximizing Rehabilitation Outcomes, James A Haley Veterans’ Hospital, Tampa, Florida, USA; 2. Consortium for Healthcare Informatics Research (CHIR) and the University of South Florida College of Business, Tampa, Florida, USA; 3. Consortium for Healthcare Informatics Research (CHIR) and East Tennessee State University College of Business and Technology, Johnson City, Tennessee, USA
Abstract:

Objective

To determine how well statistical text mining (STM) models can identify falls within clinical text associated with an ambulatory encounter.

Materials and Methods

A total of 2241 patients were selected who had a fall-related ICD-9-CM E-code or matched injury diagnosis code recorded while being treated as outpatients at one of four sites within the Veterans Health Administration. For each patient, all clinical documents within a 48-hour window of the recorded E-code or injury diagnosis code were obtained (n=26 010; 611 distinct document titles) and annotated for falls. Logistic regression, support vector machine, and cost-sensitive support vector machine (SVM-cost) models were trained on a stratified sample of 70% of the documents from one location (dataset Atrain) and then applied to the remaining unseen documents (datasets Atest–D).
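As an illustration of the stratified 70% sampling step, the following is a minimal sketch in plain Python. It is not the study's code: the function name `stratified_split`, the fixed seed, and the toy labels are all assumptions made for the example. It shows only how a split can preserve the proportion of fall and non-fall documents in both partitions.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), keeping label proportions in both parts.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)

    # Group document indices by their annotation label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_label.values():
        # Shuffle within each class, then cut at the requested fraction.
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Toy annotations (assumed): 1 = document describes a fall, 0 = it does not.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
```

With these toy labels, 70 documents go to training and 30 to testing, and the share of fall documents is 20% in each part.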

Results

All three STM models obtained area under the receiver operating characteristic curve (AUC) scores above 0.950 on the four test datasets (Atest–D). The SVM-cost model obtained the highest AUC scores, ranging from 0.953 to 0.978. The SVM-cost model also achieved F-measure values ranging from 0.745 to 0.853, sensitivity from 0.890 to 0.931, and specificity from 0.877 to 0.944.
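For readers less familiar with these measures, the sketch below computes them on a small made-up example. Everything in it is an assumption for illustration: the function names, the toy labels and scores, the 0.5 decision threshold, and the reading of "F-measure" as the balanced F1 score. AUC is computed with the pairwise definition: the fraction of (fall, non-fall) document pairs in which the fall document receives the higher score, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = fall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Share of true fall documents that the model flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of non-fall documents that the model leaves unflagged."""
    return tn / (tn + fp)


def f_measure(tp, fp, fn):
    """Balanced F1 score: harmonic mean of precision and sensitivity."""
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true, scores):
    """Pairwise AUC: P(score of a fall doc > score of a non-fall doc)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): three fall documents, five non-fall documents.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)      # 2/3
spec = specificity(tn, fp)      # 4/5
f1 = f_measure(tp, fp, fn)      # 2/3
area = auc(y_true, scores)      # 14/15
```

Note that AUC depends only on how the scores rank the documents, whereas sensitivity, specificity, and F-measure all depend on the chosen threshold.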

Discussion

The STM models performed well across a large, heterogeneous collection of document titles. They also generalized to the other sites, including a traditionally bilingual site with distinctly different grammatical patterns.

Conclusions

The results of this study suggest that STM-based models have the potential to improve surveillance of falls. The evidence that STM is a robust technique for mining clinical documents also bodes well for its use in other surveillance-related topics.
Keywords: Text Mining, Accidental Falls, Electronic Health Records, Ambulatory Care