首页 | 本学科首页   官方微博 | 高级检索  
检索        


Machine learning-based coreference resolution of concepts in clinical documents
Authors:Ware Henry  Mullett Charles J  Jagannathan Vasudevan  El-Rawas Oussama
Institution:M*Modal, Inc., Morgantown, West Virginia, USA.
Abstract:

Objective

Coreference resolution of concepts, although a very active area in the natural language processing community, has not yet been widely applied to clinical documents. Accordingly, the 2011 i2b2 competition focusing on this area is a timely and useful challenge. The objective of this research was to collate coreferent chains of concepts from a corpus of clinical documents. These concepts are in the categories of person, problems, treatments, and tests.

Design

A machine learning approach based on graphical models was employed to cluster coreferent concepts. Features selected were divided into domain independent and domain specific sets. Training was done with the i2b2 provided training set of 489 documents with 6949 chains. Testing was done on 322 documents.

Results

The learning engine, using the un-weighted average of three different measurement schemes, resulted in an F measure of 0.8423 where no domain specific features were included and 0.8483 where the feature set included both domain independent and domain specific features.

Conclusion

Our machine learning approach is a promising solution for recognizing coreferent concepts, which in turn is useful for practical applications such as the assembly of problem and medication lists from clinical documents.
Keywords:Coreference resolution  machine learning  NLP  natural language processing  text mining  quality measures  CDI
本文献已被 PubMed 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号