Similar Articles
20 similar articles found (search time: 31 ms)
1.
The authors organized a Natural Language Processing (NLP) challenge on automatically determining the smoking status of patients from information found in their discharge records. This challenge was issued as a part of the i2b2 (Informatics for Integrating Biology to the Bedside) project, to survey, facilitate, and examine studies in medical language understanding for clinical narratives. This article describes the smoking challenge, details the data and the annotation process, explains the evaluation metrics, discusses the characteristics of the systems developed for the challenge, presents an analysis of the results of received system runs, draws conclusions about the state of the art, and identifies directions for future research. A total of 11 teams participated in the smoking challenge. Each team submitted up to three system runs, providing a total of 23 submissions. The submitted system runs were evaluated with microaveraged and macroaveraged precision, recall, and F-measure. The systems submitted to the smoking challenge represented a variety of machine learning and rule-based algorithms. Despite the differences in their approaches to smoking status identification, many of these systems provided good results. There were 12 system runs with microaveraged F-measures above 0.84. Analysis of the results highlighted the fact that discharge summaries express smoking status using a limited number of textual features (e.g., “smok”, “tobac”, “cigar”, Social History, etc.). Many of the effective smoking status identifiers benefit from these features.
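The micro- and macro-averaged metrics used to score the challenge are easy to restate in code. The sketch below is a minimal, illustrative implementation (not the organizers' evaluation script): micro-averaging pools true/false positives across all smoking-status classes, while macro-averaging computes a per-class F-measure and takes its unweighted mean. The example labels are placeholders.

```python
from collections import Counter

def f1(p, r):
    return 2 * p * r / (p + r) if (p + r) else 0.0

def micro_macro_f1(gold, pred, classes):
    """Micro- and macro-averaged F-measure for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    # Micro: pool counts over all classes before computing precision/recall.
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = f1(TP / (TP + FP) if TP + FP else 0.0,
               TP / (TP + FN) if TP + FN else 0.0)
    # Macro: average per-class F-measures with equal weight per class.
    per_class = []
    for c in classes:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        per_class.append(f1(p, r))
    return micro, sum(per_class) / len(per_class)

classes = ["CURRENT SMOKER", "PAST SMOKER", "SMOKER", "NON-SMOKER", "UNKNOWN"]
gold = ["NON-SMOKER", "UNKNOWN", "PAST SMOKER", "NON-SMOKER"]
pred = ["NON-SMOKER", "UNKNOWN", "CURRENT SMOKER", "NON-SMOKER"]
print(micro_macro_f1(gold, pred, classes))
```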

2.
We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2b2 task organizers, using micro- and macro-averaged F1 as the primary performance metric. Our best performing system achieved a micro-F1 of 0.9000 on the test collection, equivalent to the best performing system submitted to the i2b2 challenge. Hot-spot identification, zero-vector filtering, classifier weighting, and error correcting output coding contributed additively to increased performance, with hot-spot identification having by far the largest positive effect. High performance on automatic identification of patient smoking status from discharge summaries is achievable with the efficient and straightforward machine learning techniques studied here.
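Hot-spot identification, the technique the authors found most beneficial, amounts to restricting the classifier's input to small text windows around smoking-related keywords, with notes that contain no such window defaulting to UNKNOWN (the zero-vector filtering idea). The sketch below is a simplified illustration, not the authors' code; the cue list, window size, and stand-in classifier are assumptions.

```python
import re

# Assumed cue list; a handful of stems such as "smok" and "tobac" carry most
# of the smoking-status signal in discharge summaries.
HOT_SPOT_CUES = re.compile(r"\b(smok\w*|tobac\w*|cigar\w*|pack[- ]?year\w*)", re.I)

def extract_hot_spots(text, window=60):
    """Return character windows centred on smoking-related cue words."""
    spots = []
    for m in HOT_SPOT_CUES.finditer(text):
        start, end = max(0, m.start() - window), min(len(text), m.end() + window)
        spots.append(text[start:end])
    return spots

def classify_note(text, spot_classifier):
    """Notes with no hot spots get UNKNOWN; otherwise the downstream
    classifier sees only the concatenated hot-spot windows."""
    spots = extract_hot_spots(text)
    if not spots:
        return "UNKNOWN"
    return spot_classifier(" ... ".join(spots))

def demo_classifier(spot_text):
    """Toy stand-in for the trained classifier described in the paper."""
    return "PAST SMOKER" if re.search(r"\bquit\b|\bformer\b", spot_text, re.I) else "CURRENT SMOKER"

print(classify_note("Social history: former smoker, quit 10 years ago.", demo_classifier))
print(classify_note("No acute distress. Lungs clear.", demo_classifier))
```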

3.

Objective

To evaluate the validity of, characterize the usage of, and propose potential research applications for International Classification of Diseases, Ninth Revision (ICD-9) tobacco codes in clinical populations.

Materials and methods

Using data on cancer cases and cancer-free controls from Vanderbilt’s biorepository, BioVU, we evaluated the utility of ICD-9 tobacco use codes to identify ever-smokers in general and high smoking prevalence (lung cancer) clinic populations. We assessed potential biases in documentation, and performed temporal analysis relating transitions between smoking codes to smoking cessation attempts. We also examined the suitability of these codes for use in genetic association analyses.

Results

ICD-9 tobacco use codes can identify smokers in a general clinic population (specificity of 1, sensitivity of 0.32), and there is little evidence of documentation bias. Frequency of code transitions between ‘current’ and ‘former’ tobacco use was significantly correlated with initial success at smoking cessation (p<0.0001). Finally, code-based smoking status assignment is a comparable covariate to text-based smoking status for genetic association studies.

Discussion

Our results support the use of ICD-9 tobacco use codes for identifying smokers in a clinical population. Furthermore, with some limitations, these codes are suitable for adjustment of smoking status in genetic studies utilizing electronic health records.

Conclusions

Researchers should not be deterred by the unavailability of full-text records to determine smoking status if they have ICD-9 code histories.
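A hedged sketch of what code-based ever-smoker identification and its validation against chart review might look like. The ICD-9 codes shown (305.1x, tobacco use disorder; V15.82, history of tobacco use) and the record layout are illustrative assumptions, not the paper's implementation.

```python
# Illustrative ICD-9 tobacco use codes; treat the list as an assumption.
TOBACCO_CODES = ("305.1", "V15.82")

def ever_smoker(code_history):
    """Flag a patient as an ever-smoker if any tobacco ICD-9 code appears."""
    return any(code.startswith(TOBACCO_CODES) for code in code_history)

def sensitivity_specificity(code_flags, chart_review):
    """Compare code-based flags against a chart-review gold standard."""
    tp = sum(1 for c, g in zip(code_flags, chart_review) if c and g)
    tn = sum(1 for c, g in zip(code_flags, chart_review) if not c and not g)
    fp = sum(1 for c, g in zip(code_flags, chart_review) if c and not g)
    fn = sum(1 for c, g in zip(code_flags, chart_review) if not c and g)
    return tp / (tp + fn), tn / (tn + fp)

patients = {"A": ["250.00", "305.1"], "B": ["V15.82"], "C": ["401.9"], "D": ["486"]}
gold = {"A": True, "B": True, "C": True, "D": False}   # C is a smoker the codes miss
flags = {pid: ever_smoker(codes) for pid, codes in patients.items()}
# Mirrors the pattern reported in the paper: high specificity, lower sensitivity.
print(sensitivity_specificity([flags[p] for p in patients], [gold[p] for p in patients]))
```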

4.

Objectives

Natural language processing (NLP) applications typically use regular expressions that have been developed manually by human experts. Our goal is to automate both the creation and utilization of regular expressions in text classification.

Methods

We designed a novel regular expression discovery (RED) algorithm and implemented two text classifiers based on RED. The RED+ALIGN classifier combines RED with an alignment algorithm, and RED+SVM combines RED with a support vector machine (SVM) classifier. Two clinical datasets were used for testing and evaluation: the SMOKE dataset, containing 1091 text snippets describing smoking status; and the PAIN dataset, containing 702 snippets describing pain status. We performed 10-fold cross-validation to calculate accuracy, precision, recall, and F-measure metrics. In the evaluation, an SVM classifier was trained as the control.

Results

The two RED classifiers achieved 80.9–83.0% in overall accuracy on the two datasets, which is 1.3–3% higher than SVM’s accuracy (p<0.001). Similarly, small but consistent improvements have been observed in precision, recall, and F-measure when RED classifiers are compared with SVM alone. More significantly, RED+ALIGN correctly classified many instances that were misclassified by the SVM classifier (8.1–10.3% of the total instances and 43.8–53.0% of SVM’s misclassifications).

Conclusions

Machine-generated regular expressions can be effectively used in clinical text classification. The regular expression-based classifier can be combined with other classifiers, like SVM, to improve classification performance.
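The RED algorithm itself is not reproduced here, but the general idea of combining a regular-expression classifier with an SVM can be sketched as follows, using scikit-learn for the SVM. The patterns are hand-written stand-ins for machine-discovered expressions, and the snippets and fallback policy (regex first, SVM when no pattern fires) are assumptions for illustration.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hand-picked stand-ins for machine-discovered regular expressions.
PATTERNS = [(re.compile(r"\bnever\s+smok", re.I), "NON-SMOKER"),
            (re.compile(r"\bquit\s+smok|\bformer\s+smoker", re.I), "PAST SMOKER"),
            (re.compile(r"\bsmokes?\b|\bcurrent\s+smoker", re.I), "CURRENT SMOKER")]

snippets = ["patient is a current smoker, 1 ppd",
            "never smoked, no alcohol use",
            "quit smoking in 2005",
            "denies tobacco use"]
labels = ["CURRENT SMOKER", "NON-SMOKER", "PAST SMOKER", "NON-SMOKER"]

svm = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
svm.fit(snippets, labels)   # in practice, trained on the full labeled corpus

def classify(snippet):
    """Regex decision first; SVM fallback when no expression matches."""
    for pattern, label in PATTERNS:
        if pattern.search(snippet):
            return label
    return svm.predict([snippet])[0]

print(classify("former smoker, quit 10 years ago"))
print(classify("tobacco: denies"))
```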

5.

Objective

To describe a system for determining the assertion status of medical problems mentioned in clinical reports, which was entered in the 2010 i2b2/VA community evaluation ‘Challenges in natural language processing for clinical data’ for the task of classifying assertions associated with problem concepts extracted from patient records.

Materials and methods

A combination of machine learning (conditional random field and maximum entropy) and rule-based (pattern matching) techniques was used to detect negation, speculation, and hypothetical and conditional information, as well as information associated with persons other than the patient.

Results

The best submission obtained an overall micro-averaged F-score of 0.9343.

Conclusions

Using semantic attributes of concepts and information about document structure as features for statistical classification of assertions is a good way to leverage rule-based and statistical techniques. In this task, the choice of features may be more important than the choice of classifier algorithm.
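The rule-based side of such a system can be approximated with pattern matching over a window of tokens preceding the concept. The cue lexicons and window size below are illustrative assumptions rather than the authors' rules, and a real system would combine this with the statistical classifiers described above.

```python
import re

# Illustrative cue lexicons for a pattern-matching assertion classifier.
CUES = [
    ("absent",       r"\b(no|denies|without|negative for|no evidence of)\b"),
    ("possible",     r"\b(possible|probable|suspicious for|cannot rule out)\b"),
    ("conditional",  r"\b(if|in case of|should she|should he)\b"),
    ("someone_else", r"\b(family history of|mother|father|brother|sister)\b"),
]

def assert_status(sentence, concept, window=6):
    """Assign an assertion class from cues in the few tokens before the concept."""
    tokens = sentence.lower().split()
    try:
        idx = tokens.index(concept.lower())
    except ValueError:
        return "present"
    context = " ".join(tokens[max(0, idx - window):idx])
    for label, pattern in CUES:
        if re.search(pattern, context):
            return label
    return "present"

print(assert_status("Patient denies chest pain on exertion.", "chest"))   # absent
print(assert_status("Family history of diabetes in mother.", "diabetes")) # someone_else
print(assert_status("Patient reports chest pain.", "chest"))              # present
```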

6.
The American health care system is one of the world’s largest and most complex industries. The Health Care Financing Administration reports that 1997 expenditures for health care exceeded one trillion dollars, or 13.5 percent of the gross domestic product. Despite these expenditures, over 16 percent of the U.S. population remains uninsured, and a large percentage of patients express dissatisfaction with the health care system. Managed care, effective in its ability to attenuate the rate of cost increase, is associated with a concomitant degree of administrative overhead that is often perceived by providers and patients alike as a major source of cost and inconvenience. Both providers and patients sense a great degree of inconvenience and an excessive amount of paperwork associated with both the process of seeking medical care and the subsequent process of paying for medical services.

Traditionally, health practitioners have sought a return to traditional fee-for-service payment to mitigate the inconvenience associated with managed care. More populist proposals include universal health insurance or mandatory enrollment in health maintenance organizations. Advocates of managed care argue that the business methods required for effective trials of this approach are only beginning to be realized. By all accounts, information technology is a necessary part of these initiatives, but there is universal consensus that our current systems are inadequate to the task. (Oxford Health System’s difficulties in 1998, for example, have been attributed in part to inadequate deployment of information technology.) To this author, the model for the current generation of health care information systems is strikingly similar to that for the information systems employed by the Internal Revenue Service. In each case, the system allows for low-cost changes to administrative code brought about by legislation, but in both cases the “ripple effects” of additional complexity and administrative burden far exceed the cost of immediate change. To paraphrase a quotation attributed to Mayor Richard Daley, made about his police force during the 1968 Democratic National Convention in Chicago, our information systems “are not here to create disorder; they are here to preserve disorder.”

This case explores one alternative source for models in health care delivery. Through an examination of a typical patient experience, we explore Porter’s notion of the value chain and “just-in-time” logistics common to successful organizations like Wal-Mart and Amazon.com (see Suggested Readings). We close with a brief discussion of how these logistics and inventory systems apply to health care. Clearly, logistics are important in patient care, accounts receivable are a cause of severe working capital problems in health care, and the logistics of caring for patients are becoming more complex. But the concepts we discuss have an even greater importance: Effective management of these issues through information technology may restore our most precious commodity—time.

7.
Electronic medical records are increasingly used to store patient information in hospitals and other clinical settings. There has been a corresponding proliferation of clinical natural language processing (cNLP) systems aimed at using text data in these records to improve clinical decision-making, in comparison to manual clinician search and clinical judgment alone. However, these systems have delivered marginal practical utility and are rarely deployed into healthcare settings, leading to proposals for technical and structural improvements. In this paper, we argue that this reflects a violation of Friedman’s “Fundamental Theorem of Biomedical Informatics,” and that a deeper epistemological change must occur in the cNLP field, as a parallel step alongside any technical or structural improvements. We propose that researchers shift away from designing cNLP systems independent of clinical needs, in which cNLP tasks are ends in themselves—“tasks as decisions”—and toward systems that are directly guided by the needs of clinicians in realistic decision-making contexts—“tasks as needs.” A case study example illustrates the potential benefits of developing cNLP systems that are designed to more directly support clinical needs.

8.
We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. The cTAKES builds on existing open-source technologies—the Unstructured Information Management Architecture framework and OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans, and accuracy for concept mapping, negation, and status attributes for exact and overlapping spans of 0.957, 0.943, 0.859, and 0.580, 0.939, and 0.839, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text.

9.

Objective

Pathology reports are rich in narrative statements that encode a complex web of relations among medical concepts. These relations are routinely used by doctors to reason on diagnoses, but often require hand-crafted rules or supervised learning to extract into prespecified forms for computational disease modeling. We aim to automatically capture relations from narrative text without supervision.

Methods

We design a novel framework that translates sentences into graph representations, automatically mines sentence subgraphs, reduces redundancy in mined subgraphs, and automatically generates subgraph features for subsequent classification tasks. To ensure meaningful interpretations over the sentence graphs, we use the Unified Medical Language System Metathesaurus to map token subsequences to concepts, and in turn sentence graph nodes. We test our system with multiple lymphoma classification tasks that together mimic the differential diagnosis by a pathologist. To this end, we prevent our classifiers from looking at explicit mentions or synonyms of lymphomas in the text.

Results and Conclusions

We compare our system with three baseline classifiers using standard n-grams, full MetaMap concepts, and filtered MetaMap concepts. Our system achieves high F-measures on multiple binary classifications of lymphoma (Burkitt lymphoma, 0.8; diffuse large B-cell lymphoma, 0.909; follicular lymphoma, 0.84; Hodgkin lymphoma, 0.912). Significance tests show that our system outperforms all three baselines. Moreover, feature analysis identifies subgraph features that contribute to improved performance; these features agree with the state-of-the-art knowledge about lymphoma classification. We also highlight how these unsupervised relation features may provide meaningful insights into lymphoma classification.
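A much-simplified sketch of the graph-feature idea: map tokens to concepts (here via a toy dictionary standing in for the UMLS Metathesaurus), treat co-occurring concepts in a sentence as graph edges, and emit those edges as features for a downstream classifier. The concept map and feature encoding are assumptions for illustration; the paper mines richer sentence subgraphs.

```python
from itertools import combinations

# Toy stand-in for UMLS Metathesaurus concept mapping.
CONCEPT_MAP = {"cd20": "CD20_antigen", "positive": "positive_finding",
               "large": "large_cell", "b-cells": "B_lymphocyte",
               "diffuse": "diffuse_pattern"}

def sentence_concepts(sentence):
    """Map tokens (token subsequences, in the real system) to concept nodes."""
    return sorted({CONCEPT_MAP[t] for t in sentence.lower().split() if t in CONCEPT_MAP})

def edge_features(sentence):
    """Emit co-occurring concept pairs as relation ('subgraph') features."""
    nodes = sentence_concepts(sentence)
    return {f"{a}--{b}" for a, b in combinations(nodes, 2)}

report = "Immunostains show CD20 positive large B-cells in a diffuse pattern"
print(edge_features(report))
# These pair features could then feed any standard classifier in place of n-grams.
```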

10.
Introduction: We evaluated the mental health status of children residing in Kawauchi village (Kawauchi), Fukushima Prefecture, after the 2011 accident at the Fukushima Daiichi Nuclear Power Station, based on the children’s experience of the nuclear disaster. Methods: We conducted this cross-sectional study within the framework of the Fukushima Health Management Survey (FHMS); FHMS data on age, sex, exercise habits, sleeping times, experience of the nuclear disaster, and the “Strengths and Difficulties Questionnaire (SDQ)” scores for 156 children from Kawauchi in 2012 were collected. Groups with and without experience of the nuclear disaster — “nuclear disaster (+)” and “nuclear disaster (−)” — were also compared. Results: Valid responses were obtained for 93 children (59.6%); the mean SDQ score was 11.4±6.8 among elementary school-aged participants and 12.4±6.8 among junior high school-aged ones. We statistically compared the Total Difficulties Scores (TDS) and sub-item scores of the SDQ between elementary and junior high school participants, and between the nuclear disaster (+) and (−) groups; no significant differences were found. Conclusions: We found indications of poor mental health among elementary and junior high school-aged children in the disaster area immediately following the accident, but no differences based on their experience of the nuclear disaster. These results suggest that stress may have been triggered by factors other than experiences related to the nuclear disaster in children who lived in affected rural areas and were evacuated just after the accident.

11.

Background

Word sense disambiguation (WSD) methods automatically assign an unambiguous concept to an ambiguous term based on context, and are important to many text-processing tasks. In this study we developed and evaluated a knowledge-based WSD method that uses semantic similarity measures derived from the Unified Medical Language System (UMLS) and evaluated the contribution of WSD to clinical text classification.

Methods

We evaluated our system on biomedical WSD datasets and determined the contribution of our WSD system to clinical document classification on the 2007 Computational Medicine Challenge corpus.

Results

Our system compared favorably with other knowledge-based methods. Machine learning classifiers trained on disambiguated concepts significantly outperformed those trained using all concepts.

Conclusions

We developed a WSD system that achieves high disambiguation accuracy on standard biomedical WSD datasets and showed that our WSD system improves clinical document classification.

Data sharing

We integrated our WSD system with MetaMap and the clinical Text Analysis and Knowledge Extraction System, two popular biomedical natural language processing systems. All code required to reproduce our results and all tools developed as part of this study are released as open source, available at http://code.google.com/p/ytex.
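The core of a knowledge-based WSD method of this kind can be written in a few lines: score each candidate concept of an ambiguous term by its average semantic similarity to the unambiguous concepts in the surrounding context, and keep the best-scoring sense. The similarity table below is a toy assumption; in the paper the measures are derived from the UMLS.

```python
def disambiguate(candidates, context_concepts, similarity):
    """Pick the candidate sense most similar, on average, to the context concepts."""
    def avg_sim(cand):
        sims = [similarity.get((cand, ctx), similarity.get((ctx, cand), 0.0))
                for ctx in context_concepts]
        return sum(sims) / len(sims) if sims else 0.0
    return max(candidates, key=avg_sim)

# Toy similarity scores standing in for UMLS-derived semantic similarity measures.
SIM = {("Cold_Temperature", "Weather"): 0.9,
       ("Common_Cold", "Cough"): 0.8,
       ("Common_Cold", "Weather"): 0.1,
       ("Cold_Temperature", "Cough"): 0.2}

# "cold" is ambiguous; the clinical context mentions cough.
print(disambiguate(["Common_Cold", "Cold_Temperature"], ["Cough"], SIM))  # Common_Cold
```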

12.

Objective

Information extraction and classification of clinical data are current challenges in natural language processing. This paper presents a cascaded method to deal with three different extractions and classifications in clinical data: concept annotation, assertion classification and relation classification.

Materials and Methods

A pipeline system was developed for clinical natural language processing that includes a proofreading process, with gold-standard reflexive validation and correction. The information extraction system is a combination of a machine learning approach and a rule-based approach. The outputs of this system are used for evaluation in all three tiers of the fourth i2b2/VA shared-task and workshop challenge.

Results

Overall concept classification attained an F-score of 83.3% against a baseline of 77.0%, the optimal F-score for assertions about the concepts was 92.4%, and the relation classifier attained 72.6% for relationships between clinical concepts against a baseline of 71.0%. Micro-averaged results for the challenge test set were 81.79%, 91.90%, and 70.18%, respectively.

Discussion

The multi-task nature of the challenge required distributing time and workload across the individual tasks, so the overall performance evaluation on all three tasks is more informative than assessing each task independently. The simplicity of the model developed in this work should be contrasted with the very large feature spaces of other participants in the challenge, who achieved only slightly better performance. When comparing results, there is a need to charge a penalty against the complexity of a model, as defined in message minimalisation theory.

Conclusion

A complete pipeline system for constructing language processing models that can be used to process multiple practical detection tasks of language structures of clinical records is presented.

13.
Objective

Adherence to a treatment plan by HIV-positive patients is necessary to decrease their mortality and improve their quality of life; however, some patients display poor appointment adherence and become lost to follow-up (LTFU). We applied natural language processing (NLP) to analyze indications towards or against LTFU in HIV-positive patients’ notes.

Materials and Methods

Unstructured lemmatized notes were labeled with an LTFU or Retained status using a 183-day threshold. An NLP and supervised machine learning system with a linear model and elastic net regularization was trained to predict this status. The prevalence of characteristic domains among the learned model weights was evaluated.

Results

We analyzed 838 LTFU vs 2964 Retained notes and obtained a weighted F1 mean of 0.912 via nested cross-validation; another experiment with notes from the same patients in both classes showed substantially lower metrics. “Comorbidities” were associated with LTFU through, for instance, “HCV” (hepatitis C virus), and likewise “Good adherence” with Retained, represented by “Well on ART” (antiretroviral therapy).

Discussion

Mentions of mental health disorders and substance use were associated with disparate retention outcomes; however, history vs active use was not investigated. There remains further need to model transitions between LTFU and being retained in care over time.

Conclusion

We provided an important step for the future development of a model that could eventually help to identify patients who are at risk for falling out of care and to analyze which characteristics could be factors for this. Further research is needed to enhance this method with structured electronic medical record fields.
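A hedged sketch of the modeling setup described (text features of lemmatized notes, a linear model with elastic net regularization, and cross-validated weighted F1), using scikit-learn. The note texts, labels, and hyperparameters are placeholders, and plain cross-validation is shown where the paper nests the hyperparameter search.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Placeholder lemmatized notes and LTFU/Retained labels.
notes = ["missed appointment hcv positive substance use",
         "well on art undetectable viral load good adherence",
         "no show repeated missed visit unstable housing",
         "stable on art attends clinic regularly"] * 10
labels = ["LTFU", "Retained", "LTFU", "Retained"] * 10

model = make_pipeline(
    TfidfVectorizer(),
    # Linear model with elastic net regularization (the saga solver supports it).
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                       C=1.0, max_iter=5000),
)

# Weighted F1 estimated by cross-validation.
scores = cross_val_score(model, notes, labels, cv=5, scoring="f1_weighted")
print(scores.mean())
```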

14.
We show that Bayesian methods can be efficiently applied to the classification of otoneurological diseases and to assess attribute dependencies. A set of 38 otoneurological attributes was employed in order to use a naïve Bayesian probabilistic model and Bayesian networks with different scoring functions for the classification of cases from six otoneurological diseases. Tests were executed on the basis of tenfold cross-validation. We obtained average sensitivities of 90%, positive predictive values of 92% and accuracies as high as 97%, which is better than our earlier tests with neural networks. Our assessments indicated that Bayesian methods have good power and potential to classify otoneurological patient cases correctly even if this is often a complicated task for the best specialists. Bayesian methods classified the current medical data and knowledge well.

15.
Obesity is a chronic disease with an increasing impact on the world’s population. In this work, we present a method of identifying obesity automatically using text mining techniques and information related to body weight measures and obesity comorbidities. We used a dataset of 3015 de-identified medical records that contain labels for two classification problems. The first classification problem distinguishes between obesity, overweight, normal weight, and underweight. The second classification problem differentiates between obesity types: super obesity, morbid obesity, severe obesity and moderate obesity. We used a Bag of Words approach to represent the records together with unigram and bigram representations of the features. We implemented two approaches: a hierarchical method and a nonhierarchical one. We used Support Vector Machine and Naïve Bayes together with ten-fold cross validation to evaluate and compare performances. Our results indicate that the hierarchical approach does not work as well as the nonhierarchical one. In general, our results show that Support Vector Machine obtains better performances than Naïve Bayes for both classification problems. We also observed that bigram representation improves performance compared with unigram representation.
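The non-hierarchical setup can be sketched directly with scikit-learn: a bag-of-words representation with unigrams and bigrams, and Support Vector Machine versus Naïve Bayes compared under ten-fold cross-validation. The records and labels below are placeholders, not the 3015-record dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder records and weight-status labels.
records = ["bmi 42 morbidly obese hypertension", "bmi 27 overweight",
           "bmi 22 normal weight", "bmi 17 underweight poor appetite"] * 10
labels = ["obese", "overweight", "normal", "underweight"] * 10

for name, clf in [("SVM", LinearSVC()), ("Naive Bayes", MultinomialNB())]:
    # Bag of words with unigrams and bigrams, as described in the abstract.
    pipeline = make_pipeline(CountVectorizer(ngram_range=(1, 2)), clf)
    scores = cross_val_score(pipeline, records, labels, cv=10)
    print(name, round(scores.mean(), 3))
```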

16.
Objective

To develop an algorithm for building longitudinal medication dose datasets using information extracted from clinical notes in electronic health records (EHRs).

Materials and Methods

We developed an algorithm that converts medication information extracted using natural language processing (NLP) into a usable format and builds longitudinal medication dose datasets. We evaluated the algorithm on 2 medications extracted from clinical notes of Vanderbilt’s EHR and externally validated the algorithm using clinical notes from the MIMIC-III clinical care database.

Results

For the evaluation using Vanderbilt’s EHR data, the performance of our algorithm was excellent; F1-measures were ≥0.98 for both dose intake and daily dose. For the external validation using MIMIC-III, the algorithm achieved F1-measures ≥0.85 for dose intake and ≥0.82 for daily dose.

Discussion

Our algorithm addresses the challenge of building longitudinal medication dose data using information extracted from clinical notes. Overall performance was excellent, but the algorithm can perform poorly when incorrect information is extracted by NLP systems. Although it performed reasonably well when applied to the external data source, its performance was worse due to differences in the way the drug information was written. The algorithm is implemented in the R package, “EHR,” and the extracted data from Vanderbilt’s EHRs along with the gold standards are provided so that users can reproduce the results and help improve the algorithm.

Conclusion

Our algorithm for building longitudinal dose data provides a straightforward way to use EHR data for medication-based studies. The external validation results suggest its potential for applicability to other systems.
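The central conversion step, turning an extracted strength and frequency into a dose per intake and a daily dose, can be illustrated as below. The frequency vocabulary and the extracted-entity format are assumptions; the actual algorithm in the R package "EHR" handles many more cases (ranges, taper schedules, conflicting mentions).

```python
# Assumed mapping from extracted frequency expressions to intakes per day.
FREQ_PER_DAY = {"daily": 1, "qd": 1, "bid": 2, "twice daily": 2,
                "tid": 3, "qid": 4, "every 12 hours": 2}

def daily_dose(strength_mg, dose_amount, frequency):
    """Dose intake = strength x amount per intake; daily dose = intake x frequency."""
    intakes = FREQ_PER_DAY[frequency.lower()]
    dose_intake = strength_mg * dose_amount
    return dose_intake, dose_intake * intakes

# e.g. an NLP system extracted: drug strength 1 mg, 2 capsules, twice daily
print(daily_dose(strength_mg=1, dose_amount=2, frequency="twice daily"))  # (2, 4)
```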

17.
Objective: To build an annotation platform based on standard medical terminology data for anesthesiology and perioperative medicine, so that unstructured medical text data can be centrally displayed and used in a specialty data platform. Methods: Annotation tasks were created from the medical indicators of primary interest to the department of anesthesiology and perioperative medicine, used as standard data elements. The basic platform architecture was built with front-end and back-end languages such as Java and Vue, combined with technologies including Axios, Vue-router, and a medical terminology knowledge base. Unstructured data from the electronic medical record system and the surgical anesthesia system were imported into the platform and randomly assigned, via web pages, to professional medical annotators, who labeled terms by text node, yielding standardized sample data suitable for machine learning. Results: The main menu of the annotation platform built on unstructured text data consists of a task center and a management center. The indicator list, annotation text, and result list make up the task center for manual annotation; personnel management, task management, data maintenance, statistics on annotated data-element coverage, and result export make up the management center for platform operation and maintenance. Conclusion: The data annotation platform for anesthesiology and perioperative medicine supports both manual annotation of unstructured text data and machine learning, providing a transformation middle layer for full-data display in the data platform.

18.

Objective

To examine the impact of billing and clinical data extracted from an electronic medical record system on the calculation of an adverse drug event (ADE) quality measure approved for use in The Joint Commission’s ORYX program, a mandatory national hospital quality reporting system.

Design

The Child Health Corporation of America’s “Use of Rescue Agents—ADE Trigger” quality measure uses medication billing data contained in the Pediatric Health Information Systems (PHIS) data warehouse to create The Joint Commission-approved quality measure. Using a similar query, we calculated the quality measure using PHIS plus four data sources extracted from our electronic medical record (EMR) system: medications charged, medication orders placed, medication orders with associated charges (orders charged), and medications administered.

Measurements

Inclusion and exclusion criteria were identical for all queries. Denominators and numerators were calculated using the five data sets. The reported quality measure is the ADE rate (numerator/denominator).

Results

Significant differences in denominators, numerators, and rates were calculated from different data sources within a single institution’s EMR. Differences were due to both common clinical practices that may be similar across institutions and unique workflow practices not likely to be present at any other institution. The magnitude of the differences would significantly alter the national comparative ranking of our institution compared to other PHIS institutions.

Conclusions

More detailed clinical information may result in quality measures that are not comparable across institutions, owing to institution-specific workflow differences that are exposed when using EMR-derived data.
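The measure itself is a simple rate, so the source dependence the authors report comes entirely from how the numerator and denominator are counted in each data set. A toy illustration with made-up counts:

```python
def ade_rate(numerator, denominator):
    """ADE trigger rate = qualifying rescue-agent events / eligible patients."""
    return numerator / denominator

# Made-up counts showing how the same measure diverges across data sources.
sources = {"billing (PHIS)":           (42, 9800),
           "medications charged":      (55, 10100),
           "orders placed":            (61, 10400),
           "orders charged":           (40, 9700),
           "medications administered": (48, 9900)}
for name, (num, den) in sources.items():
    print(f"{name}: {ade_rate(num, den):.4f}")
```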

19.

Background and objective

As people increasingly engage in online health-seeking behavior and contribute to health-oriented websites, the volume of medical text authored by patients and other medical novices grows rapidly. However, we lack an effective method for automatically identifying medical terms in patient-authored text (PAT). We demonstrate that crowdsourcing PAT medical term identification tasks to non-experts is a viable method for creating large, accurately-labeled PAT datasets; moreover, such datasets can be used to train classifiers that outperform existing medical term identification tools.

Materials and methods

To evaluate the viability of using non-expert crowds to label PAT, we compare expert (registered nurses) and non-expert (Amazon Mechanical Turk workers; Turkers) responses to a PAT medical term identification task. Next, we build a crowd-labeled dataset comprising 10 000 sentences from MedHelp. We train two models on this dataset and evaluate their performance, as well as that of MetaMap, Open Biomedical Annotator (OBA), and NaCTeM’s TerMINE, against two gold standard datasets: one from MedHelp and the other from CureTogether.

Results

When aggregated according to a corroborative voting policy, Turker responses predict expert responses with an F1 score of 84%. A conditional random field (CRF) trained on 10 000 crowd-labeled MedHelp sentences achieves an F1 score of 78% against the CureTogether gold standard, widely outperforming OBA (47%), TerMINE (43%), and MetaMap (39%). A failure analysis of the CRF suggests that misclassified terms are likely to be either generic or rare.

Conclusions

Our results show that combining statistical models sensitive to sentence-level context with crowd-labeled data is a scalable and effective technique for automatically identifying medical terms in PAT.
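The aggregation step, accepting a crowd-labeled span only when enough independent workers corroborate it, is straightforward to sketch. The agreement threshold and label format below are assumptions rather than the paper's exact voting policy.

```python
from collections import Counter

def corroborative_vote(worker_labels, min_agreement=3):
    """Keep a span as a medical term only if >= min_agreement workers marked it."""
    counts = Counter(span for labels in worker_labels for span in labels)
    return {span for span, n in counts.items() if n >= min_agreement}

# Each set holds the spans one Turker marked as medical terms in a sentence.
workers = [{"chest pain", "ibuprofen"},
           {"chest pain", "ibuprofen", "feeling"},
           {"chest pain"},
           {"ibuprofen", "chest pain"}]
print(corroborative_vote(workers))  # {'chest pain', 'ibuprofen'}
```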

20.
Lin Lin. China Digital Medicine (中国数字医学), 2014, (4): 62-63, 68
Electronic medical records have changed how records are stored and have created new models of medical record management; correctly completing the disease diagnosis and surgical procedure information on the electronic front sheet has become a focus of attention. Surgical classification with ICD-9-CM-3 is highly specialized, technical work that requires familiarity with the coding rules. This paper analyzes the surgical information that clinicians enter on the front sheet at the physician workstation: owing to cross-specialty limitations and complex, varied clinical situations, entry errors are common and reduce the accuracy of electronic medical record data analysis and mining. It argues that only by relying on information technology to build a knowledge-base-driven surgical procedure classification (ICD-9-CM-3) system, designing the structure and workflow of the surgical coding knowledge base, and using it to check, prompt, and guide the entry of surgical information can surgical information be recorded correctly, errors steadily reduced, and the use of medical record information resources improved.
