Found 20 similar articles (search time: 46 ms)
1.
Objective
Automated, disease-specific classification of textual clinical discharge summaries is of great importance in human life science, as it helps physicians conduct medical studies by providing statistically relevant data for analysis. This can be further facilitated if, when discharge summaries are labeled, semantic labels are also extracted from the text, such as whether a given disease is present, absent, or questionable in a patient, or is unmentioned in the document. The authors present a classification technique that successfully solves the semantic classification task.
Design
The authors introduce a context-aware rule-based semantic classification technique for use on clinical discharge summaries. The classification is performed in successive steps. First, some misleading parts are removed from the text; then the text is partitioned into positive, negative, and uncertain context segments; finally, a sequence of binary classifiers is applied to assign the appropriate semantic labels.
Measurement
For evaluation the authors used the documents of the i2b2 Obesity Challenge and adopted its evaluation measures: F1-macro and F1-micro.
Results
On the two subtasks of the Obesity Challenge (textual and intuitive classification) the system performed very well: it achieved an F1-macro of 0.80 on the textual task and 0.67 on the intuitive task, placing second in the textual and first in the intuitive subtask of the challenge.
Conclusions
The authors show that a simple rule-based classifier can tackle the semantic classification task more successfully than machine learning techniques if the training data are limited and some semantic labels are very sparse.
2.
Objective
Evaluate the effectiveness of a simple rule-based approach in classifying medical discharge summaries according to indicators for obesity and 15 associated co-morbidities as part of the 2008 i2b2 Obesity Challenge.
Methods
The authors applied a rule-based approach that looked for occurrences of morbidity-related keywords and identified the types of assertions in which those keywords occurred. The documents were then classified using a simple scoring algorithm based on a mapping of the assertion types to possible judgment categories.
Measurements
Results for the challenge were evaluated based on macro F-measure. We report micro and macro F-measure results for all morbidities combined and for each morbidity separately.
Results
Our rule-based approach achieved micro and macro F-measures of 0.97 and 0.77, respectively, ranking fifth out of the entries submitted by 28 teams participating in the classification task based on textual judgments and substantially outperforming the average for the challenge.
Conclusions
As shown by its ranking in the challenge results, this approach performed relatively well under conditions in which limited training data existed for some judgment categories. Further, the approach held up well in relation to more complex approaches applied to this classification task. The approach could be enhanced by the addition of expert rules to model more complex medical reasoning.
3.
George Hripcsak, Noémie Elhadad PhD, Yueh-Hsia Chen MS, Li Zhou BMed PhD, Frances P. Morrison MD MPH 《J Am Med Inform Assoc》2009,16(2):220
Objective
To measure the uncertainty of temporal assertions like “3 weeks ago” in clinical texts.
Design
Temporal assertions extracted from narrative clinical reports were compared to facts extracted from a structured clinical database for the same patients.
Measurements
The authors correlated the assertions and the facts to determine how the uncertainty of the assertions depends on their semantic and lexical properties.
Results
The observed deviation between the stated duration and actual duration averaged about 20% of the stated duration. Linear regression revealed that assertions about events further in the past tend to be more uncertain, smaller numeric values tend to be more uncertain (1 mo versus 30 d), and round numbers tend to be more uncertain (10 versus 11 yrs).
Conclusions
The authors empirically derived semantics behind statements of duration using “ago,” and verified intuitions about how numbers are used.
4.
Objective
Free-text clinical reports serve as an important part of patient care management and clinical documentation of patient disease and treatment status. Free-text notes are commonplace in medical practice, but remain an under-used source of information for clinical and epidemiological research, as well as personalized medicine. The authors explore the challenges associated with automatically extracting information from clinical reports using their submission to the Integrating Informatics with Biology and the Bedside (i2b2) 2008 Natural Language Processing Obesity Challenge Task.
Design
A text mining system for classifying patient comorbidity status, based on the information contained in clinical reports. The authors' approach incorporates a variety of automated techniques, including hot-spot filtering, negated concept identification, zero-vector filtering, weighting by inverse class-frequency, and error-correcting output codes with linear support vector machines.
Measurements
Performance was evaluated in terms of the macroaveraged F1 measure.
Results
The automated system performed well against manual expert rule-based systems, finishing fifth in the Challenge's intuitive task, and 13th in the textual task.
Conclusions
The system demonstrates that effective comorbidity status classification by an automated system is possible.
5.
Hui Yang 《J Am Med Inform Assoc》2009,16(4):596-600
Objective
The authors present a system developed for the Challenge in Natural Language Processing for Clinical Data (the i2b2 obesity challenge), whose aim was to automatically identify the status of obesity and 15 related co-morbidities in patients using their clinical discharge summaries. The challenge consisted of two tasks, textual and intuitive. The textual task was to identify explicit references to the diseases, whereas the intuitive task focused on the prediction of the disease status when the evidence was not explicitly asserted.
Design
The authors assembled a set of resources to lexically and semantically profile the diseases and their associated symptoms, treatments, etc. These features were explored in a hybrid text mining approach, which combined dictionary look-up, rule-based, and machine-learning methods.
Measurements
The methods were applied to a set of 507 previously unseen discharge summaries, and the predictions were evaluated against a manually prepared gold standard. The overall ranking of the participating teams was primarily based on the macro-averaged F-measure.
Results
The implemented method achieved a macro-averaged F-measure of 81% for the textual task (the highest achieved in the challenge) and 63% for the intuitive task (ranked 7th out of 28 teams; the highest was 66%). The micro-averaged F-measure showed an average accuracy of 97% for textual and 96% for intuitive annotations.
Conclusions
The performance achieved was in line with the agreement between human annotators, indicating the potential of text mining for accurate and efficient prediction of disease statuses from clinical discharge summaries.
6.
Steven Shea, Ruth S. Weinstock, Jeanne A. Teresi, Walter Palmas, Justin Starren, James J. Cimino, Albert M. Lai, Lesley Field, Philip C. Morin, Robin Goland, Roberto E. Izquierdo, Susana Ebner, Stephanie Silver, Eva Petkova, Jian Kong, Joseph P. Eimicke, IDEATel Consortium 《J Am Med Inform Assoc》2009,16(4):446-456
Context
Telemedicine is a promising but largely unproven technology for providing case management services to patients with chronic conditions and lower access to care.
Objectives
To examine the effectiveness of a telemedicine intervention to achieve clinical management goals in older, ethnically diverse, medically underserved patients with diabetes.
Design, Setting, and Patients
A randomized controlled trial was conducted, comparing telemedicine case management to usual care, with blinded outcome evaluation, in 1,665 Medicare recipients with diabetes, aged ≥ 55 years, residing in federally designated medically underserved areas of New York State.
Interventions
Home telemedicine unit with nurse case management versus usual care.
Main Outcome Measures
The primary endpoints assessed over 5 years of follow-up were hemoglobin A1c (HgbA1c), low density lipoprotein (LDL) cholesterol, and blood pressure levels.
Results
Intention-to-treat mixed models showed that telemedicine achieved net overall reductions over five years of follow-up in the primary endpoints (HgbA1c, p = 0.001; LDL, p < 0.001; systolic and diastolic blood pressure, p = 0.024; p < 0.001). Estimated differences (95% CI) in year 5 were 0.29 (0.12, 0.46)% for HgbA1c, 3.84 (−0.08, 7.77) mg/dL for LDL cholesterol, and 4.32 (1.93, 6.72) mm Hg for systolic and 2.64 (1.53, 3.74) mm Hg for diastolic blood pressure. There were 176 deaths in the intervention group and 169 in the usual care group (hazard ratio 1.01 [0.82, 1.24]).
Conclusions
Telemedicine case management resulted in net improvements in HgbA1c, LDL-cholesterol and blood pressure levels over 5 years in medically underserved Medicare beneficiaries. Mortality was not different between the groups, although power was limited.
Trial Registration
http://clinicaltrials.gov Identifier: NCT00271739.
7.
8.
Objective
The purpose of this study is to reassess the projected rate of Electronic Health Record (EHR) diffusion and examine how the federal government's efforts to promote the use of EHR technology have influenced physicians' willingness to adopt such systems. The study recreates and extends the analyses conducted by Ford et al.¹ The two periods examined come before and after the U.S. Federal Government's concerted activity to promote EHR adoption.
Design
Meta-analysis and Bass modeling are used to compare EHR diffusion rates for two distinct periods of government activity. Very low levels of government activity to promote EHR diffusion marked the first period, before 2004. In 2004, the President of the United States called for “Universal EHR Adoption” by 2014 (10 yrs), creating a major wave of activity and increased awareness of how EHRs will impact physicians' practices.
Measurement
EHR adoption parameters (external and internal coefficients of influence) are estimated using Bass diffusion models, and future adoption rates are projected.
Results
A comparison of EHR adoption rates before and after 2004 (2001-2004 and 2001-2007, respectively) indicates that physicians' resistance to adoption increased during the second period. Based on current levels of adoption, fewer than half of the physicians working in small practices will have implemented an EHR by 2014 (47.3%).
Conclusions
The external forces driving EHR diffusion have grown in importance since 2004 relative to physicians' internal motivation to adopt such systems. Several national forces are likely contributing to the slowing pace of EHR diffusion.
9.
Objective
To use the semantic and structural properties in the Unified Medical Language System (UMLS) Metathesaurus to characterize and discover potential relationships.
Design
The UMLS integrates knowledge from several biomedical terminologies. This knowledge can be used to discover implicit semantic relationships between concepts. In this paper, the authors propose a problem-independent approach for discovering potential terminological relationships that employs semantic abstraction of indirect relationship paths to perform classification and analysis of network theoretical measures such as topological overlap, preferential attachment, graph partitioning, and number of indirect paths. Using different versions of the UMLS, the authors evaluate the proposed approach's ability to predict newly added relationships.
Measurements
Classification accuracy, precision-recall.
Results
Strong discriminative characteristics were observed with a semantic abstraction based classifier (classification accuracy of 91%), the average number of indirect paths, preferential attachment, and graph partitioning to identify potential relationships. The proposed relationship prediction algorithm resulted in 56% recall in top 10 results for new relationships added to subsequent versions of the UMLS between 2005 and 2007.
Conclusions
The UMLS has sufficient knowledge to enable discovery of potential terminological relationships.
10.
James B. Weaver III, Darren Mays, Gregg Lindner, Doğan Eroğlu, Frederick Fridinger, Jay M. Bernhardt 《J Am Med Inform Assoc》2009,16(5):714-722
Objective
The Internet's potential to bolster health promotion and disease prevention efforts has attracted considerable attention. Existing research leaves two things unclear, however: the prevalence of online health and medical information seeking and the distinguishing characteristics of individuals who seek that information.
Design
This study seeks to clarify and extend the knowledge base concerning health and medical information use online by profiling adults using Internet medical information (IMI). Secondary analysis of survey data from a large sample (n = 6,119) representative of the Atlanta, GA, area informed this investigation.
Measurements
Five survey questions were used to assess IMI use and general computer and Internet use during the 30 days before the survey was administered. Five questions were also used to assess respondents' health care system use. Several demographic characteristics were measured.
Results
Contrary to most prior research, this study found relatively low prevalence of IMI-seeking behavior. Specifically, IMI use was reported by 13.2% of all respondents (n = 6,119) and by 21.1% of respondents with Internet access (n = 3,829). Logistic regression models conducted among respondents accessing the Internet in the previous 30 days revealed that, when controlling for several sociodemographic characteristics, home computer ownership, online time per week, and health care system use are all positively linked with IMI-seeking behavior.
Conclusions
The data suggest it may be premature to embrace unilaterally the Internet as an effective asset for health promotion and disease prevention efforts that target the public.
11.
Vivienne J. Zhu, Marc J. Overhage, James Egg, Shaun J. Grannis 《J Am Med Inform Assoc》2009,16(5):738-745
Objective
To incorporate value-based weight scaling into the Fellegi-Sunter (F-S) maximum likelihood linkage algorithm and evaluate the performance of the modified algorithm.
Background
Because healthcare data are fragmented across many healthcare systems, record linkage is a key component of fully functional health information exchanges. Probabilistic linkage methods produce more accurate, dynamic, and robust matching results than rule-based approaches, particularly when matching patient records that lack unique identifiers. Theoretically, the relative frequency of specific data elements can enhance the F-S method, including minimizing the false-positive or false-negative matches. However, to our knowledge, no frequency-based weight scaling modification to the F-S method has been implemented and specifically evaluated using real-world clinical data.
Methods
The authors implemented a value-based weight scaling modification using an information theoretical model, and formally evaluated the effectiveness of this modification by linking 51,361 records from Indiana statewide newborn screening data to 80,089 HL7 registration messages from the Indiana Network for Patient Care, an operational health information exchange. In addition to applying the weight scaling modification to all fields, we examined the effect of selectively scaling common or uncommon field-specific values.
Results
The sensitivity, specificity, and positive predictive value for applying weight scaling to all field-specific values were 95.4, 98.8, and 99.9%, respectively. Compared with nonweight scaling, the modified F-S algorithm demonstrated a 10% increase in specificity with a 3% decrease in sensitivity.
Conclusion
By eliminating false-positive matches, the value-based weight modification can enhance the specificity of the F-S method with minimal decrease in sensitivity.
12.
Objective: We have explored the role of nuclear factor kappa B (NF-κB) in the pathogenesis of chronic glomerulonephritis, and investigated the effect of rhododendron root on the activation of NF-κB. Methods: Thirty-six Wistar rats were randomly divided into three groups: a control group, a glomerulonephritis model group and a therapy group (glomerulonephritis animals treated with the root of rhododendron). Bovine serum albumin (BSA) nephritis was induced by subcutaneous immunization and daily intraperitoneal administration of BSA. Twenty-four-hour urinary protein and serum creatinine values were measured, and renal pathology was assessed histologically by optical microscopy and electron microscopy. NF-κB activity was determined by an electrophoretic mobility shift assay (EMSA). Results: Compared with the control rats, glomerulonephritis model rats exhibited a significant increase in both 24 h urinary protein and serum creatinine, and had abnormal renal histology. The administration of the root of rhododendron ameliorated these changes. NF-κB activity in the glomerulonephritis model group was greater than that in the rhododendron-treated group, and NF-κB activity was greater in both glomerulonephritis groups than in the control group (P<0.01). Conclusion: These observations suggest that NF-κB plays a role in the pathogenesis of chronic glomerulonephritis, and rhododendron root may attenuate renal damage by downregulating the activation of NF-κB in this model.
13.
Joshua C. Denny, Anderson Spickard III, Kevin B. Johnson, Neeraja B. Peterson, Josh F. Peterson, Randolph A. Miller 《J Am Med Inform Assoc》2009,16(6):806-815
Objective
Clinical notes, typically written in natural language, often contain substructure that divides them into sections, such as “History of Present Illness” or “Family Medical History.” The authors designed and evaluated an algorithm (“SecTag”) to identify both labeled and unlabeled (implied) note section headers in “history and physical examination” documents (“H&P notes”).
Design
The SecTag algorithm uses a combination of natural language processing techniques, word variant recognition with spelling correction, terminology-based rules, and naive Bayesian scoring methods to identify note section headers. Eleven physicians evaluated SecTag's performance on 319 randomly chosen H&P notes.
Measurements
The primary outcomes were the algorithm's recall and precision in identifying all document sections and a predefined list of twenty-nine major sections. A secondary outcome was to evaluate the algorithm's ability to recognize the correct start and end boundaries of identified sections.
Results
The SecTag algorithm identified 16,036 total sections and 7,858 major sections. Physician evaluators classified 15,329 as true positives and identified 160 sections omitted by SecTag. The recall and precision of the SecTag algorithm were 99.0 and 95.6% for all sections, 98.6 and 96.2% for major sections, and 96.6 and 86.8% for unlabeled sections. The algorithm determined the correct starting and ending text boundaries for 94.8% of labeled sections and 85.9% of unlabeled sections.
Conclusions
The SecTag algorithm accurately identified both labeled and unlabeled sections in history and physical documents. This type of algorithm may assist in natural language processing applications, such as clinical decision support systems or competency assessment for medical trainees.
14.
Khaled El Emam, Fida Kamal Dankar, Romeo Issa, Elizabeth Jonker, Daniel Amyot, Elise Cogo, Jean-Pierre Corriveau, Mark Walker, Sadrul Chowdhury, Regis Vaillancourt, Tyson Roffey, Jim Bottomley 《J Am Med Inform Assoc》2009,16(5):670-682
Background
Explicit patient consent requirements in privacy laws can have a negative impact on health research, leading to selection bias and reduced recruitment. Often legislative requirements to obtain consent are waived if the information collected or disclosed is de-identified.
Objective
The authors developed and empirically evaluated a new globally optimal de-identification algorithm that satisfies the k-anonymity criterion and that is suitable for health datasets.
Design
The authors compared OLA (Optimal Lattice Anonymization) empirically to three existing k-anonymity algorithms, Datafly, Samarati, and Incognito, on six public, hospital, and registry datasets for different values of k and suppression limits.
Measurement
Three information loss metrics were used for the comparison: precision, discernability metric, and non-uniform entropy. Each algorithm's performance speed was also evaluated.
Results
The Datafly and Samarati algorithms had higher information loss than OLA and Incognito; OLA was consistently faster than Incognito in finding the globally optimal de-identification solution.
Conclusions
For the de-identification of health datasets, OLA is an improvement on existing k-anonymity algorithms in terms of information loss and performance.
15.
Context
The healthcare industry could achieve significant benefits through the adoption of a service-oriented architecture (SOA). The specification and adoption of standard software service interfaces will be critical to achieving these benefits.
Objective
To develop a replicable, collaborative framework for standardizing the interfaces of software services important to healthcare.
Design
Iterative, peer-reviewed development of a framework for generating interoperable service specifications that build on existing and ongoing standardization efforts. The framework was created under the auspices of the Healthcare Services Specification Project (HSSP), which was initiated in 2005 as a joint initiative between Health Level Seven (HL7) and the Object Management Group (OMG). In this framework, known as the HSSP Service Specification Framework, HL7 identifies candidates for service standardization and defines normative Service Functional Models (SFMs) that specify the capabilities and conformance criteria for these services. OMG then uses these SFMs to generate technical service specifications as well as reference implementations.
Measurements
The ability of the framework to support the creation of multiple, interoperable service specifications useful for healthcare.
Results
Functional specifications have been defined through HL7 for four services: the Decision Support Service; the Entity Identification Service; the Clinical Research Filtered Query Service; and the Retrieve, Locate, and Update Service. Technical specifications and commercial implementations have been developed for two of these services within OMG. Furthermore, three additional functional specifications are being developed through HL7.
Conclusions
The HSSP Service Specification Framework provides a replicable and collaborative approach to defining standardized service specifications for healthcare.
16.
17.
The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records presented three tasks: a concept extraction task focused on the extraction of medical concepts from patient reports; an assertion classification task focused on assigning assertion types for medical problem concepts; and a relation classification task focused on assigning relation types that hold between medical problems, tests, and treatments. i2b2 and the VA provided an annotated reference standard corpus for the three tasks. Using this reference standard, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification. These systems showed that machine learning approaches could be augmented with rule-based systems to determine concepts, assertions, and relations. Depending on the task, the rule-based systems can either provide input for machine learning or post-process the output of machine learning. Ensembles of classifiers, information from unlabeled data, and external knowledge sources can help when the training data are inadequate.
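The combination described in this abstract (rules feeding or post-processing a learner, plus classifier ensembles) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from any challenge system: the cue list in `NEGATION_CUES`, the label names `present`, `absent`, and `possible`, and the function names are all hypothetical.

```python
import re
from collections import Counter
from typing import Sequence

# Hypothetical negation cues; a real system would use a far richer lexicon.
NEGATION_CUES = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)


def rule_feature(sentence: str) -> bool:
    """Rule-based signal usable as learner input: is a negation cue present?"""
    return bool(NEGATION_CUES.search(sentence))


def majority_vote(labels: Sequence[str]) -> str:
    """Ensemble step: most frequent label; ties go to the earliest listed label."""
    counts = Counter(labels)
    best = max(counts.values())
    return next(label for label in labels if counts[label] == best)


def post_process(sentence: str, predicted: str) -> str:
    """Rule-based post-processing: override 'present' when a negation cue appears."""
    if predicted == "present" and rule_feature(sentence):
        return "absent"
    return predicted


def classify(sentence: str, classifier_outputs: Sequence[str]) -> str:
    """Combine the (assumed) per-classifier labels, then apply the rule override."""
    return post_process(sentence, majority_vote(classifier_outputs))
```

For example, `classify("Patient denies chest pain.", ["present", "present", "absent"])` would take the majority label `present` and then override it to `absent` because of the cue "denies". Placing the rule after the vote is only one of the two arrangements the abstract mentions; the same `rule_feature` value could instead be supplied to each classifier as an input feature.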
18.
Han Y, Li Y, Song J, Wang Y, Shi Q, Chen C, Zhang B, Guo Y, Li C, Han J, Dong X 《Biomedical and environmental sciences : BES》2011,24(5):523-529
Objective To break immune tolerance to prion (PrP) proteins using DNA vaccines. Methods Four different human prion DNA vaccine candidates were constructed based on the pcDNA3.1 vector: PrP-WT expressing wild-type PrP, Ubiq-PrP expressing PrP fused to ubiquitin, PrP-LII expressing PrP fused to the lysosomal integral membrane protein type II lysosome-targeting signal, and PrP-ER expressing PrP localized to the ER. Using a prime-boost strategy, three doses of DNA vaccine were injected intramuscularly into Balb/c mice, fol...
19.
Objective
A system that translates narrative text in the medical domain into a structured representation is in great demand. The system performs three sub-tasks: concept extraction, assertion classification, and relation identification.
Design
The overall system consists of five steps: (1) pre-processing sentences, (2) marking noun phrases (NPs) and adjective phrases (APs), (3) extracting concepts, using a dosage-unit dictionary to switch dynamically between two models based on Conditional Random Fields (CRF), (4) classifying assertions based on voting of five classifiers, and (5) identifying relations using normalized sentences with a set of effective discriminating features.
Measurements
Macro-averaged and micro-averaged precision, recall and F-measure were used to evaluate results.
Results
The performance is competitive with the state-of-the-art systems, with micro-averaged F-measures of 0.8489 for concept extraction, 0.9392 for assertion classification and 0.7326 for relation identification.
Conclusions
The system exploits an array of common features and achieves state-of-the-art performance. Prudent feature engineering sets the foundation of our systems. In concept extraction, we demonstrated that switching models, one of which is especially designed for telegraphic sentences, significantly improved extraction of the treatment concept. In assertion classification, a set of features derived from a rule-based classifier proved effective for classes such as conditional and possible, which would suffer from data scarcity in conventional machine-learning methods. In relation identification, we use a two-stage architecture, the second stage of which applies pairwise classifiers to possible candidate classes. This architecture significantly improves performance.
20.