Analysis of Stroke Detection during the COVID-19 Pandemic Using Natural Language Processing of Radiology Reports |
| |
Authors: | M.D. Li, M. Lang, F. Deng, K. Chang, K. Buch, S. Rincon, W.A. Mehan, T.M. Leslie-Mazwi, J. Kalpathy-Cramer |
| |
Affiliation: | From the Departments of Radiology (M.D.L., M.L., F.D., K.C., K.B., S.R., W.A.M., J.K.-C.) and Neurology and Neurosurgery (T.M.L.-M.), Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts |
| |
Abstract: | BACKGROUND AND PURPOSE: The coronavirus disease 2019 (COVID-19) pandemic has led to decreases in neuroimaging volume. Our aim was to quantify the change in acute or subacute ischemic strokes detected on CT or MR imaging during the pandemic using natural language processing of radiology reports. MATERIALS AND METHODS: We retrospectively analyzed 32,555 radiology reports from brain CTs and MRIs from a comprehensive stroke center, performed from March 1 to April 30 each year from 2017 to 2020, involving 20,414 unique patients. To detect acute or subacute ischemic stroke in free-text reports, we trained a random forest natural language processing classifier using 1987 randomly sampled radiology reports with manual annotation. Natural language processing classifier generalizability was evaluated using 1974 imaging reports from an external dataset. RESULTS: The natural language processing classifier achieved a 5-fold cross-validation classification accuracy of 0.97 and an F1 score of 0.74, with a slight underestimation (−5%) of actual numbers of acute or subacute ischemic strokes in cross-validation. Importantly, cross-validation performance stratified by year was similar. Applying the classifier to the complete study cohort, we found an estimated 24% decrease in patients with acute or subacute ischemic strokes reported on CT or MR imaging from March to April 2020 compared with the average from those months in 2017–2019. Among patients with stroke-related order indications, the estimated proportion who underwent neuroimaging with acute or subacute ischemic stroke detection significantly increased from 16% during 2017–2019 to 21% in 2020 (P = .01). The natural language processing classifier performed worse on external data. CONCLUSIONS: Acute or subacute ischemic stroke cases detected by neuroimaging decreased during the COVID-19 pandemic, though a higher proportion of studies ordered for stroke were positive for acute or subacute ischemic strokes.
Natural language processing approaches can help automatically track acute or subacute ischemic stroke numbers for epidemiologic studies, though local classifier training is important due to radiologist reporting style differences. There is much concern regarding the impact of the coronavirus disease 2019 (COVID-19) pandemic on the quality of stroke care, including issues with hospital capacity, clinical resource re-allocation, and the safety of patients and clinicians.1,2 Previous reports have shown that there have been substantial decreases in stroke neuroimaging volume during the pandemic.3,4 In addition, acute ischemic infarcts have been found on neuroimaging studies in many hospitalized patients with COVID-19, though the causal relationship is unclear.5,6 Studies like these and other epidemiologic analyses usually rely on manually curated databases, in which identifying cases can be time-consuming and which are difficult to update in real time. One way to facilitate such research is to use natural language processing (NLP), which has shown utility for automated analysis of radiology report data.7 NLP algorithms have been developed previously for the classification of neuroradiology reports for the presence of ischemic stroke findings and acute ischemic stroke subtypes.8,9 Thus, NLP has the potential to facilitate COVID-19 research. In this study, we developed an NLP machine learning model that classifies radiology reports for the presence or absence of acute or subacute ischemic stroke (ASIS), as opposed to chronic stroke. We used this model to quantify the change in ASIS detected on all CT or MR imaging studies performed at a large comprehensive stroke center during the COVID-19 pandemic in the United States. We also evaluated NLP model generalizability and different training strategies using a sample of radiology reports from a second stroke center. |
| |
Keywords: | |
|
|
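The abstract reports an accuracy of 0.97 alongside a much lower F1 score of 0.74 and a −5% underestimation of stroke counts. The sketch below illustrates how those three figures can coexist when positive reports are rare. The confusion-matrix counts are hypothetical, chosen only so that they sum to the 1987 annotated reports and roughly reproduce the reported metrics; they are not the study's actual results, and the `Confusion` class is an illustrative helper, not part of the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion matrix for a report classifier (illustrative helper)."""
    tp: int  # reports correctly flagged as acute/subacute ischemic stroke
    fp: int  # reports flagged as stroke that were not
    fn: int  # stroke reports the classifier missed
    tn: int  # reports correctly flagged as negative

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total

    def f1(self) -> float:
        # Harmonic mean of precision and recall, written in count form.
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)

    def count_bias(self) -> float:
        # Relative error of the predicted number of positives
        # versus the true number of positives.
        predicted = self.tp + self.fp
        actual = self.tp + self.fn
        return (predicted - actual) / actual


# HYPOTHETICAL counts (assumption): they sum to 1987 reports and are
# tuned to approximate the metrics quoted in the abstract.
cm = Confusion(tp=99, fp=31, fn=38, tn=1819)

print(f"accuracy   = {cm.accuracy():.2f}")
print(f"F1 score   = {cm.f1():.2f}")
print(f"count bias = {cm.count_bias():+.0%}")
```

Because true negatives dominate, accuracy stays high even when a substantial share of positives is missed or over-called; F1 ignores true negatives and so is more sensitive to errors on the rare class. The count bias is the quantity that matters most when the classifier is used to estimate case numbers rather than to label individual reports.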