Evaluation of an Automated Information Extraction Tool for Imaging Data Elements to Populate a Breast Cancer Screening Registry |
| |
Authors: | Ronilda Lacson, Kimberly Harris, Phyllis Brawarsky, Tor D Tosteson, Tracy Onega, Anna N A Tosteson, Abby Kaye, Irina Gonzalez, Robyn Birdwell, Jennifer S Haas |
| |
Institution: | 1. Department of Radiology, Brigham and Women’s Hospital, 75 Francis Street, Boston, MA, 02115, USA 2. Department of General Internal Medicine and Primary Care, Brigham and Women’s Hospital, Boston, MA, USA 3. Harvard Medical School, Boston, MA, USA 4. Department of Community and Family Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA 5. Department of Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA |
| |
Abstract: | Breast cancer screening is central to early breast cancer detection. Identifying and monitoring process measures for screening is a focus of the National Cancer Institute’s Population-based Research Optimizing Screening through Personalized Regimens (PROSPR) initiative, which requires participating centers to report structured data across the cancer screening continuum. We evaluate the accuracy of automated information extraction of imaging findings from radiology reports, which are available as unstructured text. We present prevalence estimates of imaging findings for breast imaging received by women who obtained care in a primary care network participating in PROSPR (n = 139,953 radiology reports) and compare automatically extracted data elements to a “gold standard” based on manual review of a validation sample of 941 randomly selected radiology reports, including mammograms, digital breast tomosynthesis, ultrasound, and magnetic resonance imaging (MRI). The prevalence of imaging findings varies by data element and modality (e.g., suspicious calcifications were noted in 2.6% of screening mammograms, 12.1% of diagnostic mammograms, and 9.4% of tomosynthesis exams). In the validation sample, the accuracy of identifying imaging findings, including suspicious calcifications, masses, and architectural distortion (on mammogram and tomosynthesis); masses, cysts, non-mass enhancement, and enhancing foci (on MRI); and masses and cysts (on ultrasound), ranged from 0.8 to 1.0 for recall, precision, and F-measure. Information extraction tools can be used for accurate documentation of imaging findings as structured data elements from text reports for a variety of breast imaging modalities. These data can be used to populate screening registries to help elucidate more effective breast cancer screening processes. |
| |
Keywords: | BI-RADS; Breast; Data extraction; Information storage and retrieval; Natural language processing |
This article is indexed in SpringerLink and other databases. |
|
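The abstract reports recall, precision, and F-measure for each imaging finding against a manually reviewed gold standard. As a minimal sketch of how such per-finding metrics are commonly computed, the Python below compares one finding's extracted labels with the manual labels, report by report. The function name `score_extraction`, the `Metrics` container, and the sample labels are hypothetical and not taken from the article; the F-measure is assumed here to be the balanced F1 (harmonic mean of precision and recall), which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    precision: float
    recall: float
    f_measure: float


def score_extraction(gold: list[bool], extracted: list[bool]) -> Metrics:
    """Score one finding (e.g. "mass present") across a set of reports.

    gold[i] is the manual-review label for report i; extracted[i] is the
    label produced by the automated tool for the same report.
    """
    if len(gold) != len(extracted):
        raise ValueError("gold and extracted must have the same length")

    pairs = list(zip(gold, extracted))
    tp = sum(1 for g, e in pairs if g and e)        # finding present, tool found it
    fp = sum(1 for g, e in pairs if not g and e)    # finding absent, tool reported it
    fn = sum(1 for g, e in pairs if g and not e)    # finding present, tool missed it

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: balanced F1; the abstract only says "F-measure".
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(precision, recall, f_measure)


if __name__ == "__main__":
    # Illustrative labels only, not data from the study.
    gold = [True, True, True, False, False]
    extracted = [True, True, False, True, False]
    print(score_extraction(gold, extracted))
```

In this illustrative sample there are two true positives, one false positive, and one false negative, so precision, recall, and F-measure all equal 2/3. In practice the same scoring would be repeated for each finding and each modality.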