Automatic classification of mammography reports by BI-RADS breast tissue composition class
Authors: Percha Bethany; Nassif Houssam; Lipson Jafi; Burnside Elizabeth; Rubin Daniel
Affiliation:Biomedical Informatics Program, Stanford University, Stanford, California 94305-5488, USA.
Abstract:
Because breast tissue composition partially predicts breast cancer risk, classifying mammography reports by breast tissue composition is important from both scientific and clinical perspectives. We present a method that classifies mammography reports into BI-RADS breast tissue composition categories using only their unstructured text. The algorithm uses regular expressions to assign each report to a single BI-RADS composition class: 'fatty', 'fibroglandular', 'heterogeneously dense', 'dense', or 'unspecified'. We evaluated its performance on mammography reports from two institutions. The method achieves >99% classification accuracy on a test set of reports from the Marshfield Clinic (Wisconsin) and Stanford University. Because large-scale studies of breast cancer rely heavily on breast tissue composition information, this method could facilitate such research by helping to mine large datasets and correlate breast composition with other covariates.
Keywords: Mammography; natural language processing; data mining; radiology; breast; text mining; machine learning; pharmacogenomics; breast imaging; quality; outcomes research
This article is indexed in PubMed and other databases.
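
The abstract describes a regular-expression classifier but does not list the expressions themselves. The following is a minimal sketch of how such a classifier could be structured, not the authors' implementation. The function name `classify_report`, the `PATTERNS` table, and every pattern in it are hypothetical, chosen to resemble common BI-RADS composition phrasing; only the five class labels come from the abstract.

```python
import re

# Hypothetical patterns for illustration only; the published expressions
# are not given in the abstract. Order matters: the first match wins, so
# the more specific "heterogeneously dense" is tested before "dense".
PATTERNS = [
    ("heterogeneously dense", re.compile(r"\bheterogeneously\s+dense\b", re.I)),
    ("dense", re.compile(r"\bextremely\s+dense\b", re.I)),
    ("fibroglandular", re.compile(r"\bscattered\s+fibroglandular\b", re.I)),
    ("fatty", re.compile(r"\b(?:almost\s+entirely|predominantly)\s+fat(?:ty)?\b", re.I)),
]


def classify_report(text):
    """Return one composition class label for a free-text report."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    # No composition statement was recognized in the report.
    return "unspecified"


if __name__ == "__main__":
    reports = [
        "The breasts are heterogeneously dense, which may obscure small masses.",
        "There are scattered fibroglandular densities.",
        "No suspicious mass or calcification.",
    ]
    for report in reports:
        print(classify_report(report), "|", report)
```

A real system would likely need many more phrasing variants, plus handling for negation and for reports that mention more than one composition term; the sketch only shows the ordered match-and-fall-through structure that yields a single class per report.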