Similar Documents
20 similar documents found (search time: 46 ms)
1.
2.

Objective

We aim to identify duplicate pairs of Medline citations, particularly when the documents are not identical but contain similar information.

Materials and methods

Duplicate pairs of citations are identified by comparing word n-grams in pairs of documents. N-grams are modified using two approaches that account for the possibility that a document has been altered: (1) deletion, in which an item in the n-gram is removed; and (2) substitution, in which an item in the n-gram is replaced with a similar term obtained from the Unified Medical Language System Metathesaurus. N-grams are also weighted using a score derived from a language model. Evaluation is carried out on a set of 520 Medline citation pairs, including 260 manually verified duplicate pairs obtained from the Deja Vu database.
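A minimal sketch of the core idea (exact n-gram overlap plus credit for single-word deletions; the substitution step and language-model weighting described above are omitted, and all names are illustrative):

```python
def ngrams(tokens, n):
    """All contiguous word n-grams of a token list, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def deletion_variants(gram):
    """Each n-gram with one item removed, modeling a deleted word."""
    return {gram[:i] + gram[i + 1:] for i in range(len(gram))}

def overlap_score(doc_a, doc_b, n=3):
    """Fraction of doc_a's n-grams found in doc_b exactly or after
    allowing one deletion (matched against doc_b's (n-1)-grams)."""
    a, b = doc_a.lower().split(), doc_b.lower().split()
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    short_b = ngrams(b, n - 1)
    exact = grams_a & grams_b
    near = {g for g in grams_a - exact if deletion_variants(g) & short_b}
    return (len(exact) + len(near)) / max(len(grams_a), 1)
```

Deleting a single word from a citation still yields a perfect score here, which is the behavior the deletion modification is meant to capture.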

Results

The approach accurately detects duplicate Medline document pairs with an F1 measure score of 0.99. Allowing for word deletions and substitutions improves performance. The best results are obtained by combining scores for n-grams of length 1–5 words.

Discussion

Results show that the detection of duplicate Medline citations can be improved by modifying n-grams and that high performance can also be obtained using only unigrams (F1=0.959), particularly when allowing for substitutions of alternative phrases.

3.

Objective

To specify the problem of patient-level temporal aggregation from clinical text and introduce several probabilistic methods for addressing that problem. The patient-level perspective differs from the prevailing natural language processing (NLP) practice of evaluating at the term, event, sentence, document, or visit level.

Methods

We utilized an existing pediatric asthma cohort with manual annotations. After generating a basic feature set via standard clinical NLP methods, we introduced six methods of aggregating time-distributed features from the document level to the patient level. These aggregation methods were used to classify patients according to their asthma status in two hypothetical settings: retrospective epidemiology and clinical decision support.
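Sum aggregation, the simplest of the patient-level methods evaluated, can be sketched as follows (a hedged illustration, not the authors' implementation; the feature names are invented):

```python
from collections import Counter

def sum_aggregate(document_features):
    """Sum aggregation: total each feature's count across all of a
    patient's documents, collapsing the time dimension entirely."""
    patient_vector = Counter()
    for doc in document_features:
        patient_vector.update(doc)
    return dict(patient_vector)

# Hypothetical per-document feature counts for one patient.
docs = [
    {"asthma_mention": 2, "wheezing": 1},   # visit 1 note
    {"asthma_mention": 1, "albuterol": 1},  # visit 2 note
]
patient_level = sum_aggregate(docs)
```

Other aggregation choices (mean, max, or the probability-density-based method mentioned above) would replace only the combining step while keeping the same document-to-patient mapping.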

Results

In both settings, solid patient classification performance was obtained with machine learning algorithms applied to a number of evidence aggregation methods: Sum aggregation achieved the highest F1 score of 85.71% in the retrospective epidemiology setting, and a probability density function-based method achieved the highest F1 score of 74.63% in the clinical decision support setting. Multiple techniques also estimated the diagnosis date (index date) of asthma with promising accuracy.

Discussion

The clinical decision support setting is a more difficult problem. We rule out some aggregation methods rather than determining the best overall aggregation method, since our preliminary data set represented a practical setting in which manually annotated data were limited.

Conclusion

Results contrasted the strengths of several aggregation algorithms in different settings. Multiple approaches exhibited good patient classification performance and also estimated diagnosis timing (index dates) with reasonable accuracy.

4.
5.

Background

Although electronic health records (EHRs) have the potential to provide a foundation for quality and safety algorithms, few studies have measured their impact on automated adverse event (AE) and medical error (ME) detection within the neonatal intensive care unit (NICU) environment.

Objective

This paper presents two phenotyping AE and ME detection algorithms (ie, IV infiltrations, narcotic medication oversedation and dosing errors) and describes manual annotation of airway management and medication/fluid AEs from NICU EHRs.

Methods

From 753 NICU patient EHRs from 2011, we developed two automatic AE/ME detection algorithms, and manually annotated 11 classes of AEs in 3263 clinical notes. Performance of the automatic AE/ME detection algorithms was compared to trigger tool and voluntary incident reporting results. AEs in clinical notes were double annotated and consensus achieved under neonatologist supervision. Sensitivity, positive predictive value (PPV), and specificity are reported.

Results

Twelve severe IV infiltrates were detected. The algorithm identified one more infiltrate than the trigger tool and eight more than incident reporting. One narcotic oversedation was detected demonstrating 100% agreement with the trigger tool. Additionally, 17 narcotic medication MEs were detected, an increase of 16 cases over voluntary incident reporting.

Conclusions

Automated AE/ME detection algorithms provide higher sensitivity and PPV than currently used trigger tools or voluntary incident-reporting systems, including identification of potential dosing and frequency errors that current methods are unequipped to detect.

6.

Background

Electronic health record (EHR) users must regularly review large amounts of data in order to make informed clinical decisions, and such review is time-consuming and often overwhelming. Technologies like automated summarization tools, EHR search engines and natural language processing have been shown to help clinicians manage this information.

Objective

To develop a support vector machine (SVM)-based system for identifying EHR progress notes pertaining to diabetes, and to validate it at two institutions.

Materials and methods

We retrieved 2000 EHR progress notes from patients with diabetes at the Brigham and Women's Hospital (1000 for training and 1000 for testing) and another 1000 notes from the University of Texas Physicians (for validation). We manually annotated all notes and trained an SVM using a bag-of-words approach. We then used the SVM on the testing and validation sets and evaluated its performance with the area under the curve (AUC) and F statistics.
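The bag-of-words featurization step might look like the following sketch (pure Python and illustrative; the study trained an SVM, e.g. scikit-learn's LinearSVC, on vectors of this kind):

```python
def bag_of_words(texts):
    """Build a shared vocabulary and represent each note as a vector
    of term counts over that vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for t in texts:
        v = [0] * len(vocab)
        for w in t.lower().split():
            v[index[w]] += 1
        vectors.append(v)
    return vocab, vectors

# Two toy "progress notes"; real notes would be full clinical text.
vocab, vectors = bag_of_words(["diabetes insulin", "insulin dose dose"])
```

In practice the vocabulary is fixed on the training set and unseen test-set words are dropped, a detail omitted here for brevity.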

Results

The model accurately identified diabetes-related notes in both the Brigham and Women's Hospital testing set (AUC=0.956, F=0.934) and the external University of Texas Faculty Physicians validation set (AUC=0.947, F=0.935).

Discussion

Overall, the model we developed was quite accurate. Furthermore, it generalized, without loss of accuracy, to another institution with a different EHR and a distinct patient and provider population.

Conclusions

It is possible to use an SVM-based classifier to identify EHR progress notes pertaining to diabetes, and the model generalizes well.

7.

Objectives

Natural language processing (NLP) applications typically use regular expressions that have been developed manually by human experts. Our goal is to automate both the creation and utilization of regular expressions in text classification.

Methods

We designed a novel regular expression discovery (RED) algorithm and implemented two text classifiers based on RED. The RED+ALIGN classifier combines RED with an alignment algorithm, and RED+SVM combines RED with a support vector machine (SVM) classifier. Two clinical datasets were used for testing and evaluation: the SMOKE dataset, containing 1091 text snippets describing smoking status; and the PAIN dataset, containing 702 snippets describing pain status. We performed 10-fold cross-validation to calculate accuracy, precision, recall, and F-measure metrics. In the evaluation, an SVM classifier was trained as the control.
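A regular-expression-based snippet classifier can be sketched as below. The patterns here are hand-written stand-ins for illustration only; the RED algorithm in the paper discovers its patterns automatically from labeled snippets:

```python
import re

# Illustrative patterns for smoking status; RED would learn these.
PATTERNS = {
    "current_smoker": [re.compile(r"\bsmokes?\b"),
                       re.compile(r"\bcurrent(ly)? smok")],
    "non_smoker": [re.compile(r"\bdenies smoking\b"),
                   re.compile(r"\bnon-?smoker\b")],
}

def classify(snippet):
    """Return the first label whose patterns match the snippet."""
    text = snippet.lower()
    for label, regexes in PATTERNS.items():
        if any(rx.search(text) for rx in regexes):
            return label
    return "unknown"
```

The RED+SVM variant described above would instead feed pattern-match indicators into an SVM as features rather than deciding by first match.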

Results

The two RED classifiers achieved 80.9–83.0% overall accuracy on the two datasets, 1.3–3% higher than the SVM's accuracy (p<0.001). Similarly, small but consistent improvements were observed in precision, recall, and F-measure when the RED classifiers were compared with SVM alone. More significantly, RED+ALIGN correctly classified many instances that were misclassified by the SVM classifier (8.1–10.3% of the total instances and 43.8–53.0% of the SVM's misclassifications).

Conclusions

Machine-generated regular expressions can be effectively used in clinical text classification. The regular expression-based classifier can be combined with other classifiers, like SVM, to improve classification performance.

8.

Objective

To ascertain if outpatients with moderate chronic kidney disease (CKD) had their condition documented in their notes in the electronic health record (EHR).

Design

Outpatients with CKD were selected based on a reduced estimated glomerular filtration rate and their notes extracted from the Columbia University data warehouse. Two lexical-based classification tools (classifier and word-counter) were developed to identify documentation of CKD in electronic notes.
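The word-counter tool can be approximated by a simple term-counting check (the term list and threshold are illustrative assumptions, not the study's actual lexicon):

```python
# Hypothetical CKD-related terms; the real tool used a curated lexicon.
CKD_TERMS = {"ckd", "chronic kidney disease", "chronic renal insufficiency"}

def documents_ckd(note, threshold=1):
    """Word-counter approach: flag a note as documenting CKD when it
    contains at least `threshold` CKD-related terms."""
    text = note.lower()
    hits = sum(text.count(term) for term in CKD_TERMS)
    return hits >= threshold
```

A patient would then be labeled "appropriately documented" when any of their notes passes this check while the reduced eGFR indicates CKD is present.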

Measurements

The tools categorized patients' individual notes on the basis of the presence of CKD-related terms. Patients were categorized as appropriately documented if their notes contained reference to CKD when CKD was present.

Results

The sensitivities of the classifier and word-count methods were 95.4% and 99.8%, respectively. The specificity of both was 99.8%. Categorization of individual patients as appropriately documented was 96.9% accurate. Of 107 patients with manually verified moderate CKD, 32 (22%) lacked appropriate documentation. Patients whose CKD had not been appropriately documented were significantly less likely to be on renin-angiotensin system inhibitors or have urine protein quantified, and had the illness for half as long (15.1 vs 30.7 months; p<0.01) compared to patients with documentation.

Conclusion

Our studies show that lexical-based classification tools can accurately ascertain whether appropriate documentation of CKD is present in an EHR. Using this method, we demonstrated under-documentation of patients with moderate CKD. Under-documented patients were less likely to receive CKD guideline-recommended care. A tool that prompts providers to document CKD might shorten the time to implementing guideline-based recommendations.

9.

Introduction

Jaundice is the yellowish pigmentation of the skin, sclera, and mucous membranes resulting from bilirubin deposition. Children born to mothers with HIV are more likely to be born premature, with low birth weight, and to become septic—all risk factors for neonatal jaundice. Further, there has been a change in the prevention of mother-to-child transmission (PMTCT) of HIV guidelines from single-dose nevirapine to a six-week course, all of which theoretically put HIV-exposed newborns at greater risk of developing neonatal jaundice.

Aim

We carried out a study to determine the incidence of severe and clinical neonatal jaundice in HIV-exposed neonates admitted to the Chatinkha Nursery (CN) neonatal unit at Queen Elizabeth Central Hospital (QECH) in Blantyre.

Methods

Over a period of four weeks, the incidence among non-exposed neonates was also determined for comparison between the two groups of infants. Clinical jaundice was defined as transcutaneous bilirubin levels greater than 5 mg/dL, and severe jaundice as bilirubin levels above the age-specific treatment threshold according to the QECH guidelines. Case notes of admitted babies were retrieved, and information on birth date, gestational age, birth weight, HIV status of mother, type of feeding, mode of delivery, VDRL status of mother, serum bilirubin, duration of stay in CN, and outcome was extracted.

Results

Of the 149 neonates who were recruited, 17 (11.4%) were HIV-exposed. One (5.88%) of the 17 HIV-exposed and 19 (14.4%) of 132 HIV-non-exposed infants developed severe jaundice requiring therapeutic intervention (p = 0.378). Eight (47%) of the HIV-exposed and 107 (81%) of the non-exposed neonates had clinical jaundice of bilirubin levels greater than 5 mg/dL (p < 0.001).

Conclusions

The study showed a significant difference in the incidence of clinical jaundice between the HIV-exposed and HIV-non-exposed neonates. Contrary to our hypothesis, however, the incidence was greater in HIV-non-exposed than in HIV-exposed infants.

10.

Objective

In this study the authors describe the system submitted by the team of the University of Szeged to the second i2b2 Challenge in Natural Language Processing for Clinical Data. The challenge focused on the development of automatic systems that analyzed clinical discharge summary texts and addressed the following question: “Who's obese and what co-morbidities do they (definitely/most likely) have?”. Target diseases included obesity and its 15 most frequent comorbidities exhibited by patients, while the target labels corresponded to expert judgments based on textual evidence and intuition (separately).

Design

The authors applied statistical methods to preselect the most common and confident terms and evaluated outlier documents by hand to discover infrequent spelling variants. The authors expected a system with dictionaries gathered semi-automatically to have a good performance with moderate development costs (the authors examined just a small proportion of the records manually).

Measurements

Following the standard evaluation method of the second Workshop on challenges in Natural Language Processing for Clinical Data, the authors used both macro- and microaveraged Fβ=1 measure for evaluation.
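Macro- and microaveraged F1 differ in where the averaging happens: macroaveraging computes F1 per class and then averages, while microaveraging pools the per-class counts first. A small sketch (counts are invented for illustration):

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive, false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def macro_micro_f1(per_class_counts):
    """per_class_counts: list of (tp, fp, fn) tuples, one per disease."""
    macro = sum(f1(*c) for c in per_class_counts) / len(per_class_counts)
    tp, fp, fn = (sum(col) for col in zip(*per_class_counts))
    micro = f1(tp, fp, fn)
    return macro, micro
```

With one frequent well-classified class and one rare poorly-classified class, the microaverage exceeds the macroaverage, mirroring the gap between the 97.29% micro and 76.22% macro scores reported below.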

Results

The authors' submission achieved a microaveraged Fβ=1 score of 97.29% for classification based on textual evidence (macroaveraged Fβ=1 = 76.22%) and 96.42% for intuitive judgments (macroaveraged Fβ=1 = 67.27%).

Conclusions

The results demonstrate the feasibility of the authors' approach and show that even very simple systems with shallow linguistic analysis can achieve remarkable accuracy scores for classifying clinical records on a limited set of concepts.

11.

Objectives

To evaluate factors affecting performance of influenza detection, including accuracy of natural language processing (NLP), discriminative ability of Bayesian network (BN) classifiers, and feature selection.

Methods

We derived a testing dataset of 124 influenza patients and 87 non-influenza (shigellosis) patients. To assess NLP finding-extraction performance, we measured the overall accuracy, recall, and precision of Topaz and MedLEE parsers for 31 influenza-related findings against a reference standard established by three physician reviewers. To elucidate the relative contribution of NLP and BN classifier to classification performance, we compared the discriminative ability of nine combinations of finding-extraction methods (expert, Topaz, and MedLEE) and classifiers (one human-parameterized BN and two machine-parameterized BNs). To assess the effects of feature selection, we conducted secondary analyses of discriminative ability using the most influential findings defined by their likelihood ratios.
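The "most influential" findings were selected by their likelihood ratios. For a binary finding, the positive likelihood ratio can be computed from a 2×2 table as sensitivity / (1 − specificity); a sketch with invented counts:

```python
def likelihood_ratio(tp, fn, fp, tn):
    """Positive likelihood ratio of a finding for the target disease."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity / (1 - specificity) if specificity < 1 else float("inf")

# Rank findings (hypothetical counts) and keep the most influential.
findings = {"cough": (80, 20, 10, 90), "fever": (70, 30, 30, 70)}
ranked = sorted(findings, key=lambda f: likelihood_ratio(*findings[f]),
                reverse=True)
```

Keeping only the top-ranked findings is one plausible reading of how the 17-finding subset was chosen from the 31 expert-identified findings.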

Results

The overall accuracy of Topaz was significantly better than that of MedLEE (with post-processing) (0.78 vs 0.71, p<0.0001). Classifiers using human-annotated findings were superior to classifiers using Topaz/MedLEE-extracted findings (average area under the receiver operating characteristic curve (AUROC): 0.75 vs 0.68, p=0.0113), and machine-parameterized classifiers were superior to the human-parameterized classifier (average AUROC: 0.73 vs 0.66, p=0.0059). Classifiers using the 17 ‘most influential’ findings were more accurate than classifiers using all 31 subject-matter-expert-identified findings (average AUROC: 0.76 vs 0.70, p<0.05).

Conclusions

Using a three-component evaluation method, we demonstrated how one could elucidate the relative contributions of components under an integrated framework. To improve classification performance, this study encourages researchers to improve NLP accuracy, use a machine-parameterized classifier, and apply feature selection methods.

12.
13.

Objective

To develop, evaluate, and share: (1) syntactic parsing guidelines for clinical text, with a new approach to handling ill-formed sentences; and (2) a clinical Treebank annotated according to the guidelines. To document the process and findings for readers with similar interest.

Methods

Using random samples from a shared natural language processing challenge dataset, we developed a handbook of domain-customized syntactic parsing guidelines based on iterative annotation and adjudication between two institutions. Special considerations were incorporated into the guidelines for handling ill-formed sentences, which are common in clinical text. Intra- and inter-annotator agreement rates were used to evaluate consistency in following the guidelines. Quantitative and qualitative properties of the annotated Treebank, as well as its use to retrain a statistical parser, were reported.

Results

A supplement to the Penn Treebank II guidelines was developed for annotating clinical sentences. After three iterations of annotation and adjudication on 450 sentences, the annotators reached an F-measure agreement rate of 0.930 (while intra-annotator rate was 0.948) on a final independent set. A total of 1100 sentences from progress notes were annotated that demonstrated domain-specific linguistic features. A statistical parser retrained with combined general English (mainly news text) annotations and our annotations achieved an accuracy of 0.811 (higher than models trained purely with either general or clinical sentences alone). Both the guidelines and syntactic annotations are made available at https://sourceforge.net/projects/medicaltreebank.

Conclusions

We developed guidelines for parsing clinical text and annotated a corpus accordingly. The high intra- and inter-annotator agreement rates showed good consistency in following the guidelines. The corpus was shown to be useful in retraining a statistical parser that achieved moderate accuracy.

14.

Objective

To elucidate the pattern of inheritance and determine the relative magnitude of various genetic effects for maturity and flowering attributes in subtropical maize.

Methods

Four white-grain maize inbred lines from the flint group of corn, two with late maturity and two with early maturity, were used. These contrasting inbred lines were crossed to form four crosses. Six generations (P1, P2, F1, F2, BC1, and BC2) were developed for each individual cross. These were evaluated in a triplicate trial for two consecutive years.

Results

Both dominance gene action and epistatic interaction played a major role in governing the inheritance of days to pollen shedding, days to 50% silking, anthesis-silking interval, and maturity.

Conclusions

The preponderance of dominance gene action for these traits indicates their usefulness in hybrid programs of subtropical maize.

15.

Introduction

Acute pancreatitis (AP) is a common illness with varied mortality and morbidity. Patients with AP complicated with acute renal failure (ARF) have higher mortality than patients with AP alone. Although ARF has been proposed as a leading mortality cause for AP patients admitted to the ICU, few studies have directly analyzed the relationship between AP and ARF.

Methods

We performed a retrospective study using the population-based database from the Taiwan National Health Insurance Research Database (NHIRD). In the period from 1 January 2005 to 31 December 2005, every patient with AP admitted to the ICU was included and assessed for the presence of ARF and mortality risk.

Results

In 2005, there were a total of 221,101 admissions to the ICU. There were 1,734 patients with AP, of whom 261 (15.05%) also had a diagnosis of ARF. Compared to patients with sepsis and other critical illnesses, patients with AP had a higher risk of a diagnosis of ARF, and patients with both diagnoses had a higher mortality rate during the same ICU hospitalization.

Conclusion

AP is associated with a higher risk of ARF and, when both conditions coexist, with a higher risk of mortality.

16.

Objective

To apply the science of networks to quantify the differential impact of the ICD-9-CM to ICD-10-CM transition across clinical specialties.

Materials and Methods

Datasets were the Centers for Medicare and Medicaid Services ICD-9-CM to ICD-10-CM mapping files, general equivalence mappings, and statewide Medicaid emergency department billing. Diagnoses were represented as nodes and their mappings as directional relationships. The complex network was summarized as an aggregate of simpler motifs, tabulated per clinical specialty.
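The motif categories can be recovered from the fan-in/fan-out of a connected group of mappings. A hedged sketch (placeholder codes, not real ICD mappings):

```python
def motif(mappings):
    """Classify the mapping motif of a connected group of codes.
    mappings: set of (icd9_code, icd10_code) pairs."""
    if not mappings:
        return "no mapping"
    n9 = len({a for a, _ in mappings})    # distinct ICD-9 codes
    n10 = len({b for _, b in mappings})   # distinct ICD-10 codes
    if n9 == 1 and n10 == 1:
        return "identity"
    if n9 == 1:
        return "class-to-subclass"   # one ICD-9 code fans out
    if n10 == 1:
        return "subclass-to-class"   # many ICD-9 codes collapse
    return "convoluted"              # many-to-many, entangled
```

Under this reading, "convoluted" is exactly the many-to-many case in which neither side of the mapping can be resolved code-by-code.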

Results

We identified five mapping motif categories: identity, class-to-subclass, subclass-to-class, convoluted, and no mapping. Convoluted mappings indicate that multiple ICD-9-CM and ICD-10-CM codes share complex, entangled, and non-reciprocal mappings. The proportions of convoluted diagnoses mappings (36% overall) range from 5% (hematology) to 60% (obstetrics and injuries). In a case study of 24 008 patient visits in 217 emergency departments, 27% of the costs are associated with convoluted diagnoses, with ‘abdominal pain’ and ‘gastroenteritis’ accounting for approximately 3.5%.

Discussion

Previous qualitative studies report that administrators and clinicians are likely to be challenged in understanding and managing their practice because of the ICD-10-CM transition. We substantiate the complexity of this transition with a thorough quantitative summary per clinical specialty, a case study, and the tools to apply this methodology easily to any clinical practice in the form of a web portal and analytic tables.

Conclusions

Post-transition, successful management of frequent diseases with convoluted mapping network patterns is critical. The http://lussierlab.org/transition-to-ICD10CM web portal provides insight into linking onerous diseases to the ICD-10 transition.

17.

Objective

To develop a system to extract follow-up information from radiology reports. The method may be used as a component in a system which automatically generates follow-up information in a timely fashion.

Methods

A novel method of combining an LSP (labeled sequential pattern) classifier with a CRF (conditional random field) recognizer was devised. The LSP classifier filters out irrelevant sentences, while the CRF recognizer extracts follow-up and time phrases from candidate sentences presented by the LSP classifier.
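The two-stage design can be sketched with simple stand-ins: a keyword filter in place of the LSP classifier and a regular expression in place of the CRF recognizer (the cues and pattern are illustrative assumptions, not the paper's models):

```python
import re

# Stage 1 stand-in for the LSP classifier: cue phrases (hypothetical).
FOLLOWUP_CUES = ("follow-up", "follow up", "recommend repeat")

def is_candidate(sentence):
    """Keep only sentences likely to contain a follow-up recommendation."""
    return any(cue in sentence.lower() for cue in FOLLOWUP_CUES)

# Stage 2 stand-in for the CRF recognizer: a time-phrase pattern.
TIME_RX = re.compile(r"\b(?:in\s+)?(\d+)\s+(day|week|month|year)s?\b", re.I)

def extract_followups(report_sentences):
    """Extract (sentence, time phrase) pairs from candidate sentences."""
    out = []
    for s in filter(is_candidate, report_sentences):
        for m in TIME_RX.finditer(s):
            out.append((s, m.group(0)))
    return out
```

The point of the first stage, as in the paper, is that filtering out irrelevant sentences spares the extractor from producing false positives on the rest of the report.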

Measurements

The standard performance metrics of precision (P), recall (R), and F measure (F) in the exact and inexact matching settings were used for evaluation.

Results

Four experiments conducted using 20 000 radiology reports showed that the CRF recognizer achieved high performance without time-consuming feature engineering and that the LSP classifier further improved the performance of the CRF recognizer. The performance of the current system is P=0.90, R=0.86, F=0.88 in the exact matching setting and P=0.98, R=0.93, F=0.95 in the inexact matching setting.

Conclusion

The experiments demonstrate that the system performs far better than a baseline rule-based system and is worth considering for deployment trials in an alert generation system. The LSP classifier successfully compensated for the inherent weakness of CRF, that is, its inability to use global information.

18.

Objective

To compare the pattern of jaundice resolution among children with severe malaria treated with quinine and artemether.

Methods

Thirty-two children who fulfilled the inclusion criteria were recruited for the study from two hospitals with intensive care facilities. They were divided into two groups, ‘Q’ and ‘A’, receiving quinine and artemether, respectively. Jaundice was assessed by clinical examination.

Results

Sixteen out of 32 children recruited (representing 50%) presented with jaundice on the day of recruitment. The mean age was 7.00 ± 2.56 years. On day 3, four patients in ‘A’ and six patients in ‘Q’ had jaundice. By day 7, no child had jaundice.

Conclusion

The study has shown that both drugs resolve jaundice, although artemether resolves it somewhat faster by the third day.

19.

Objective

This paper presents an automated system for classifying the results of imaging examinations (CT, MRI, positron emission tomography) into reportable and non-reportable cancer cases. This system is part of an industrial-strength processing pipeline built to extract content from radiology reports for use in the Victorian Cancer Registry.

Materials and methods

In addition to traditional supervised learning methods such as conditional random fields and support vector machines, active learning (AL) approaches were investigated to optimize the production of training data and further improve classification performance. The project involved two pilot sites in Victoria, Australia (Lake Imaging (Ballarat) and Peter MacCallum Cancer Centre (Melbourne)) and, in collaboration with the NSW Central Registry, one pilot site at Westmead Hospital (Sydney).
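Pool-based uncertainty sampling is one common AL strategy consistent with this setup (a sketch under the assumption of a binary reportability classifier exposing predicted probabilities):

```python
def uncertainty_sample(pool, predict_proba, batch_size=5):
    """Pick the unlabeled reports whose predicted reportable-probability
    is closest to 0.5, i.e. where the current classifier is least
    certain, and send those for manual annotation."""
    ranked = sorted(pool, key=lambda doc: abs(predict_proba(doc) - 0.5))
    return ranked[:batch_size]
```

Each AL round labels only the selected batch and retrains, which is how the large annotation savings reported below become possible.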

Results

The reportability classifier performance achieved 98.25% sensitivity and 96.14% specificity on the cancer registry''s held-out test set. Up to 92% of training data needed for supervised machine learning can be saved by AL.

Discussion

AL is a promising method for optimizing the production of supervised training data used in the classification of radiology reports. When an AL strategy is applied during the data selection process, the cost of manual classification can be reduced significantly.

Conclusions

The most important practical application of the reportability classifier is that it can dramatically reduce human effort in identifying relevant reports from the large imaging pool for further investigation of cancer. The classifier is built on a large real-world dataset and can achieve high performance in filtering relevant reports to support cancer registries.

20.

Objective

Although electronic notes have advantages compared to handwritten notes, they take longer to write and promote information redundancy in electronic health records (EHRs). We sought to quantify redundancy in clinical documentation by studying collections of physician notes in an EHR.

Design and methods

We implemented a retrospective design to gather all electronic admission, progress, resident signout and discharge summary notes written during 100 randomly selected patient admissions within a 6 month period. We modified and applied a Levenshtein edit-distance algorithm to align and compare the documents written for each of the 100 admissions. We then identified and measured the amount of text duplicated from previous notes. Finally, we manually reviewed the content that was conserved between note types in a subsample of notes.

Measurements

We measured the amount of new information in a document, calculated as the number of words that did not match previous documents divided by the length, in words, of the document. Results are reported as the percentage of information in a document that had been duplicated from previously written documents.
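The measure can be sketched with Python's standard sequence-alignment matcher standing in for the paper's modified Levenshtein algorithm (an approximation, not the authors' implementation):

```python
import difflib

def pct_duplicated(prev_note, new_note):
    """Percentage of the new note's words that align with the previous
    note, using difflib's sequence matcher as the aligner."""
    a, b = prev_note.split(), new_note.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(b) if b else 0.0
```

A progress note that copies a prior note verbatim and appends a few new words scores close to 100%, which is the redundancy pattern quantified below.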

Results

Signout and progress notes proved to be particularly redundant, with an average of 78% and 54% of information duplicated from previous documents, respectively. There was also significant information duplication between document types (eg, from an admission note to a progress note).

Conclusion

The study established the feasibility of exploring redundancy in the narrative record with a known sequence alignment algorithm used frequently in the field of bioinformatics. The findings provide a foundation for studying the usefulness and risks of redundancy in the EHR.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号