Similar articles
1.

Objective

We explored automated concept-based indexing of unstructured figure captions to improve retrieval of images from radiology journals.

Design

The MetaMap Transfer program (MMTx) was used to map the text of 84,846 figure captions from 9,004 peer-reviewed, English-language articles to concepts in three controlled vocabularies from the UMLS Metathesaurus, version 2006AA. Sampling procedures were used to estimate the standard information-retrieval metrics of precision and recall, and to evaluate the degree to which concept-based retrieval improved image retrieval.

Measurements

Precision was estimated based on a sample of 250 concepts. Recall was estimated based on a sample of 40 concepts. The authors measured the impact of concept-based retrieval to improve upon keyword-based retrieval in a random sample of 10,000 search queries issued by users of a radiology image search engine.

Results

Estimated precision was 0.897 (95% confidence interval, 0.857-0.937). Estimated recall was 0.930 (95% confidence interval, 0.838-1.000). In 5,535 of 10,000 search queries (55%), concept-based retrieval found results not identified by simple keyword matching; in 2,086 searches (21%), more than 75% of the results were found by concept-based search alone.

Conclusion

Concept-based indexing of radiology journal figure captions achieved very high precision and recall, and significantly improved image retrieval.
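
As an illustration of how interval estimates like those in the Results can be formed, the sketch below computes a normal-approximation (Wald) interval for a proportion. The abstract does not say which estimator the authors used, so the function name `wald_interval` and the choice of method are assumptions; the resulting intervals are close to the reported ones but need not match them exactly.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion, clipped to [0, 1].

    Assumption: simple random sampling; the study's actual estimator is not
    described in the abstract.
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Point estimates and sample sizes taken from the abstract.
precision_ci = wald_interval(0.897, 250)
recall_ci = wald_interval(0.930, 40)
print(precision_ci, recall_ci)
```

With only 40 concepts in the recall sample, the upper bound runs past 1 and is clipped, which is consistent with the reported upper limit of 1.000.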

2.

Objective

To characterize PubMed usage over a typical day and compare it to previous studies of user behavior on Web search engines.

Design

We performed a lexical and semantic analysis of 2,689,166 queries issued on PubMed over 24 consecutive hours on a typical day.

Measurements

We measured the number of queries, number of distinct users, queries per user, terms per query, common terms, Boolean operator use, common phrases, result set size, and MeSH categories. We also used semantic measurements to group queries into sessions, and studied the addition and removal of terms from consecutive queries to gauge search strategies.

Results

The size of the result sets from a sample of queries showed a bimodal distribution, with peaks at approximately 3 and 100 results, suggesting that a large group of queries was tightly focused and another was broad. Like Web search engine sessions, most PubMed sessions consisted of a single query. However, PubMed queries contained more terms.

Conclusion

PubMed’s usage profile should be considered when educating users, building user interfaces, and developing future biomedical information retrieval systems.

3.
4.

Objective

To develop a generalizable method for identifying patient cohorts from electronic health record (EHR) data—in this case, patients having dialysis—that uses simple information retrieval (IR) tools.

Methods

We used the coded data and clinical notes from the 24 506 adult patients in the Multiparameter Intelligent Monitoring in Intensive Care database to identify patients who had dialysis. We used SQL queries to search the procedure, diagnosis, and coded nursing observations tables based on ICD-9 and local codes. We used a domain-specific search engine to find clinical notes containing terms related to dialysis. We manually validated the available records for a 10% random sample of patients who potentially had dialysis and a random sample of 200 patients who were not identified as having dialysis based on any of the sources.

Results

We identified 1844 patients that potentially had dialysis: 1481 from the three coded sources and 1624 from the clinical notes. Precision for identifying dialysis patients based on available data was estimated to be 78.4% (95% CI 71.9% to 84.2%) and recall was 100% (95% CI 86% to 100%).

Conclusions

Combining structured EHR data with information from clinical notes using simple queries increases the utility of both types of data for cohort identification. Patients identified by more than one source are more likely to meet the inclusion criteria; however, including patients found in any of the sources increases recall. This method is attractive because it is available to researchers with access to EHR data and off-the-shelf IR tools.
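
The counts in the Results imply how much the coded sources and the clinical notes overlap. The sketch below applies inclusion-exclusion to them, assuming the 1844 patients are exactly the union of the two groups; the function name `overlap_size` is illustrative and not from the paper.

```python
def overlap_size(union_size, size_a, size_b):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return size_a + size_b - union_size

# Counts from the abstract; treating 1844 as the union is an assumption.
found_by_both = overlap_size(1844, 1481, 1624)
coded_only = 1481 - found_by_both
notes_only = 1624 - found_by_both
print(found_by_both, coded_only, notes_only)
```

Under that assumption, the patients found by both sources form the higher-precision subset described in the Conclusions, while the full union is the higher-recall set.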

5.

Background

This paper presents a novel approach to searching electronic medical records that is based on concept matching rather than keyword matching.

Aim

The concept-based approach is intended to overcome specific challenges we identified in searching medical records.

Method

Queries and documents were transformed from their term-based originals into medical concepts as defined by the SNOMED-CT ontology.

Results

Evaluation on a real-world collection of medical records showed our concept-based approach outperformed a keyword baseline by 25% in Mean Average Precision.

Conclusion

The concept-based approach provides a framework for further development of inference based search systems for dealing with medical data.

6.

Objectives

To develop mechanisms to formulate queries over the semantic representation of cancer-related data services available through the cancer Biomedical Informatics Grid (caBIG).

Design

The semCDI query formulation uses a view of caBIG semantic concepts, metadata, and data as an ontology, and defines a methodology to specify queries using the SPARQL query language, extended with Horn rules. semCDI enables the joining of data that represent different concepts through associations modeled as object properties, and the merging of data representing the same concept in different sources through Common Data Elements (CDE) modeled as datatype properties, using Horn rules to specify additional semantics indicating conditions for merging data.

Validation

In order to validate this formulation, a prototype has been constructed, and two queries have been executed against currently available caBIG data services.

Discussion

The semCDI query formulation uses the rich semantic metadata available in caBIG to build queries and integrate data from multiple sources. Its promise will be further enhanced as more data services are registered in caBIG, and as more linkages can be achieved between the knowledge contained within caBIG's NCI Thesaurus and the data contained in the Data Services.

Conclusion

semCDI provides a formulation for the creation of queries on the semantic representation of caBIG. This constitutes the foundation to build a semantic data integration system for more efficient and effective querying and exploratory searching of cancer-related data.

7.

Objective

Understanding population-level health trends is essential to effectively monitor and improve public health. The Office of the National Coordinator for Health Information Technology (ONC) Query Health initiative is a collaboration to develop a national architecture for distributed, population-level health queries across diverse clinical systems with disparate data models. Here we review Query Health activities, including a standards-based methodology, an open-source reference implementation, and three pilot projects.

Materials and methods

Query Health defined a standards-based approach for distributed population health queries, using an ontology based on the Quality Data Model and Consolidated Clinical Document Architecture, Health Quality Measures Format (HQMF) as the query language, the Query Envelope as the secure transport layer, and the Quality Reporting Document Architecture as the result language.

Results

We implemented this approach using Informatics for Integrating Biology and the Bedside (i2b2) and hQuery for data analytics and PopMedNet for access control, secure query distribution, and response. We deployed the reference implementation at three pilot sites: two public health departments (New York City and Massachusetts) and one pilot designed to support Food and Drug Administration post-market safety surveillance activities. The pilots were successful, although improved cross-platform data normalization is needed.

Discussions

This initiative resulted in a standards-based methodology for population health queries, a reference implementation, and revision of the HQMF standard. It also informed future directions regarding interoperability and data access for ONC's Data Access Framework initiative.

Conclusions

Query Health was a test of the learning health system that supplied a functional methodology and reference implementation for distributed population health queries that has been validated at three sites.

8.

Background

Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their archiving and retrieval.

Methods

Unlike previous methods for automatic MeSH term assignment, we reformulate the indexing task as a ranking problem such that relevant MeSH headings are ranked higher than irrelevant ones. Specifically, for each document we retrieve 20 neighbor documents, obtain a list of MeSH main headings from the neighbors, and rank the MeSH main headings using ListNet, a learning-to-rank algorithm. We trained our algorithm on 200 documents and tested it on a previously used benchmark set of 200 documents and a larger dataset of 1000 documents.

Results

Tested on the benchmark dataset, our method achieved a precision of 0.390, recall of 0.712, and mean average precision (MAP) of 0.626. In comparison to the state of the art, we observe statistically significant improvements as large as 39% in MAP (p-value <0.001). Similar significant improvements were also obtained on the larger document set.

Conclusion

Experimental results show that our approach makes the most accurate MeSH predictions to date, which suggests its great potential in making a practical impact on MeSH indexing. Furthermore, as discussed, the proposed learning framework is robust and can be adapted to many other similar tasks beyond MeSH indexing in the biomedical domain. All data sets are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/indexing.
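
Because the task is framed as ranking, mean average precision (MAP) is the natural summary metric. The sketch below shows the standard computation on a toy ranked list; the heading names are hypothetical placeholders, and this is not the authors' evaluation code.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Mean of average precision over (ranked list, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical ranked headings for one document; two of them are correct.
ranked_headings = ["Heading A", "Heading B", "Heading C", "Heading D"]
gold_headings = {"Heading A", "Heading C"}
ap = average_precision(ranked_headings, gold_headings)  # (1/1 + 2/3) / 2
map_score = mean_average_precision([(ranked_headings, gold_headings)])
print(ap, map_score)
```

Average precision rewards placing correct headings near the top, which is why a ranking formulation can raise MAP even when precision at a fixed cutoff stays modest.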

9.

Background

The constantly growing publication rate of medical research articles puts increasing pressure on medical specialists who need to be aware of the recent developments in their field. The currently used literature retrieval systems allow researchers to find specific papers; however the search task is still repetitive and time-consuming.

Aim

In this paper we describe a system that retrieves medical publications by automatically generating queries based on data from an electronic patient record. This allows the doctor to focus on medical issues and provide an improved service to the patient, with higher confidence that it is underpinned by current research.

Method

Our research prototype automatically generates query terms based on the patient record and adds weight factors for each term. Currently the patient’s age is taken into account with a fuzzy logic derived weight, and terms describing blood-related anomalies are derived from recent blood test results. Conditionally selected homonyms are used for query expansion. The query retrieves matching records from a local index of PubMed publications and displays results in descending relevance for the given patient. Recent publications are clearly highlighted for instant recognition by the researcher.

Results

Nine medical specialists from the Royal Adelaide Hospital evaluated the system and submitted pre-trial and post-trial questionnaires. Throughout the study we received positive feedback: doctors felt the support provided by the prototype was useful and said they would like to use it in their daily routine.

Conclusion

By supporting the time-consuming task of query formulation and iterative modification as well as by presenting the search results in order of relevance for the specific patient, literature retrieval becomes part of the daily workflow of busy professionals.

10.

Objective

Clinical Queries filters were developed to improve the retrieval of high-quality studies in searches on clinical matters. The study objective was to determine the yield of relevant citations and physician satisfaction while searching for diagnostic and treatment studies using the Clinical Queries page of PubMed compared with searching PubMed without these filters.

Materials and methods

Forty practicing physicians, presented with standardized treatment and diagnosis questions and one question of their choosing, entered search terms which were processed in a random, blinded fashion through PubMed alone and PubMed Clinical Queries. Participants rated search retrievals for applicability to the question at hand and satisfaction.

Results

For treatment, the primary outcome of retrieval of relevant articles was not significantly different between the groups, but a higher proportion of articles from the Clinical Queries searches met methodologic criteria (p=0.049), and more articles were published in core internal medicine journals (p=0.056). For diagnosis, the filtered results returned more relevant articles (p=0.031) and fewer irrelevant articles (overall retrieval less, p=0.023); participants needed to screen fewer articles before arriving at the first relevant citation (p<0.05). Relevance was also influenced by content terms used by participants in searching. Participants varied greatly in their search performance.

Discussion

Clinical Queries filtered searches returned more high-quality studies, though the retrieval of relevant articles was only statistically different between the groups for diagnosis questions.

Conclusion

Retrieving clinically important research studies from Medline is a challenging task for physicians. Methodological search filters can improve search retrieval.

11.

Objective

We explore relationships between health information seeking activities and engagement with healthcare professionals via a privacy-sensitive analysis of geo-tagged data from mobile devices.

Materials and methods

We analyze logs of mobile interaction data stripped of individually identifiable information and location data. The data analyzed consist of time-stamped search queries and distances to medical care centers. We examine search activity that precedes the observation of salient evidence of healthcare utilization (EHU) (ie, data suggesting that the searcher is using healthcare resources), in our case taken as queries occurring at or near medical facilities.

Results

We show that the time between symptom searches and observation of salient evidence of seeking healthcare utilization depends on the acuity of symptoms. We construct statistical models that make predictions of forthcoming EHU based on observations about the current search session, prior medical search activities, and prior EHU. The predictive accuracy of the models varies (65%–90%) depending on the features used and the timeframe of the analysis, which we explore via a sensitivity analysis.

Discussion

We provide a privacy-sensitive analysis that can be used to generate insights about the pursuit of health information and healthcare. The findings demonstrate how large-scale studies of mobile devices can provide insights on how concerns about symptomatology lead to the pursuit of professional care.

Conclusion

We present new methods for the analysis of mobile logs and describe a study that provides evidence about how people transition from mobile searches on symptoms and diseases to the pursuit of healthcare in the world.

12.

Objectives

Large databases of published medical research can support clinical decision making by providing physicians with the best available evidence. The time required to obtain optimal results from these databases using traditional systems often makes accessing the databases impractical for clinicians. This article explores whether a hybrid approach of augmenting traditional information retrieval with knowledge-based methods facilitates finding practical clinical advice in the research literature.

Design

Three experimental systems were evaluated for their ability to find MEDLINE citations providing answers to clinical questions of different complexity. The systems (SemRep, Essie, and CQA-1.0), which rely on domain knowledge and semantic processing to varying extents, were evaluated separately and in combination. Fifteen therapy and prevention questions in three categories (general, intermediate, and specific questions) were searched. The first 10 citations retrieved by each system were randomized, anonymized, and evaluated on a three-point scale. The reasons for ratings were documented.

Measurements

Metrics evaluating the overall performance of a system (mean average precision, binary preference) and metrics evaluating the number of relevant documents in the first several presented to a physician were used.

Results

Scores (mean average precision = 0.57, binary preference = 0.71) for fusion of the retrieval results of the three systems are significantly (p < 0.01) better than those for any individual system. All three systems present three to four relevant citations in the first five for any question type.

Conclusion

The improvements in finding relevant MEDLINE citations due to knowledge-based processing show promise in assisting physicians to answer questions in clinical practice.
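
The abstract reports that fusing the three systems' results beat each system alone but does not say how the fusion was done. As one possible illustration, the sketch below uses reciprocal rank fusion, a common technique substituted here for the unspecified method; the document IDs and the constant `k=60` are assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists by summing 1 / (k + rank) for each document.

    Ties are broken alphabetically by document ID so the order is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical top-3 lists from three retrieval systems.
system_a = ["d1", "d2", "d3"]
system_b = ["d2", "d1", "d4"]
system_c = ["d2", "d5", "d1"]
fused = reciprocal_rank_fusion([system_a, system_b, system_c])
print(fused)
```

Documents that several systems rank highly accumulate the largest scores, which matches the intuition that agreement between independent systems signals relevance.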

13.

Objective

To evaluate: (1) the effectiveness of wireless handheld computers for online information retrieval in clinical settings; (2) the role of MEDLINE® in answering clinical questions raised at the point of care.

Design

A prospective single-cohort study: accompanying medical teams on teaching rounds, five internal medicine residents used and evaluated MD on Tap, an application for handheld computers, to seek answers in real time to clinical questions arising at the point of care.

Measurements

All transactions were stored by an intermediate server. Evaluators recorded clinical scenarios and questions, identified MEDLINE citations that answered the questions, and submitted daily and summative reports of their experience. A senior medical librarian corroborated the relevance of the selected citation to each scenario and question.

Results

Evaluators answered 68% of 363 background and foreground clinical questions during rounding sessions using a variety of MD on Tap features in an average session length of less than four minutes. The evaluator, the number and quality of query terms, the total number of citations found for a query, and the use of auto-spellcheck significantly contributed to the probability of query success.

Conclusion

Handheld computers with Internet access are useful tools for healthcare providers to access MEDLINE in real time. MEDLINE citations can answer specific clinical questions when several medical terms are used to form a query. The MD on Tap application is an effective interface to MEDLINE in clinical settings, allowing clinicians to quickly find relevant citations.

14.

Objective

To better understand the relationship between online health-seeking behaviors and in-world healthcare utilization (HU) by studies of online search and access activities before and after queries that pursue medical professionals and facilities.

Materials and methods

We analyzed data collected from logs of online searches gathered from consenting users of a browser toolbar from Microsoft (N=9740). We employed a complementary survey (N=489) to seek a deeper understanding of information-gathering, reflection, and action on the pursuit of professional healthcare.

Results

We provide insights about HU through the survey, breaking out its findings by different respondent marginalizations as appropriate. Observations made from search logs may be explained by trends observed in our survey responses, even though the user populations differ.

Discussion

The results provide insights about how users decide if and when to utilize healthcare resources, and how online health information seeking transitions to in-world HU. The findings from both the survey and the logs reveal behavioral patterns and suggest a strong relationship between search behavior and HU. Although the diversity of our survey respondents is limited and we cannot be certain that users visited medical facilities, we demonstrate that it may be possible to infer HU from long-term search behavior by the apparent influence that health concerns and professional advice have on search activity.

Conclusions

Our findings highlight different phases of online activities around queries pursuing professional healthcare facilities and services. We also show that it may be possible to infer HU from logs without tracking people's physical location, based on the effect of HU on pre- and post-HU search behavior. This allows search providers and others to develop more robust models of interests and preferences by modeling utilization rather than simply the intention to utilize that is expressed in search queries.

15.

Objective

To describe a new medication information extraction system—Textractor—developed for the ‘i2b2 medication extraction challenge’. The development, functionalities, and official evaluation of the system are detailed.

Design

Textractor is based on the Apache Unstructured Information Management Architecture (UIMA) framework, and uses methods that are a hybrid between machine learning and pattern matching. Two modules in the system are based on machine learning algorithms, while other modules use regular expressions, rules, and dictionaries, and one module embeds MetaMap Transfer.

Measurements

The official evaluation was based on a reference standard of 251 discharge summaries annotated by all teams participating in the challenge. The metrics used were recall, precision, and the F1-measure. They were calculated with exact and inexact matches, and were averaged at the level of systems and documents.

Results

The reference metric for this challenge, the system-level overall F1-measure, reached about 77% for exact matches, with a recall of 72% and a precision of 83%. Performance was the best with route information (F1-measure about 86%), and was good for dosage and frequency information, with F1-measures of about 82–85%. Results were not as good for durations, with F1-measures of 36–39%, and for reasons, with F1-measures of 24–27%.

Conclusion

The official evaluation of Textractor for the i2b2 medication extraction challenge demonstrated satisfactory performance. This system was among the 10 best performing systems in this challenge.
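
The reported overall F1-measure can be checked against the reported recall and precision, since F1 is their harmonic mean. The short sketch below does this with the rounded figures from the Results; the function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded system-level values for exact matches, taken from the abstract.
overall_f1 = f1_score(0.83, 0.72)
print(round(overall_f1, 3))
```

The result agrees with the "about 77%" stated in the abstract, within the rounding of the inputs.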

16.

Background

Visual information is a crucial aspect of medical knowledge. Building a comprehensive medical image base, in the spirit of the Unified Medical Language System (UMLS), would greatly benefit patient education and self-care. However, collection and annotation of such a large-scale image base is challenging.

Objective

To combine visual object detection techniques with medical ontology to automatically mine web photos and retrieve a large number of disease manifestation images with minimal manual labeling effort.

Methods

As a proof of concept, we first learnt five organ detectors on three detection scales for eyes, ears, lips, hands, and feet. Given a disease, we used information from the UMLS to select affected body parts, ran the pretrained organ detectors on web images, and combined the detection outputs to retrieve disease images.

Results

Compared with a supervised image retrieval approach that requires training images for every disease, our ontology-guided approach exploits shared visual information of body parts across diseases. In retrieving 2220 web images of 32 diseases, we reduced manual labeling effort to 15.6% while improving the average precision by 3.9% from 77.7% to 81.6%. For 40.6% of the diseases, we improved the precision by 10%.

Conclusions

The results confirm the concept that the web is a feasible source for automatic disease image retrieval for health image database construction. Our approach requires a small amount of manual effort to collect complex disease images, and to annotate them by standard medical ontology terms.

17.

Objective

An accurate computable representation of food and drug allergy is essential for safe healthcare. Our goal was to develop a high-performance, easily maintained algorithm to identify medication and food allergies and sensitivities from unstructured allergy entries in electronic health record (EHR) systems.

Materials and methods

An algorithm was developed in Transact-SQL to identify ingredients to which patients had allergies in a perioperative information management system. The algorithm used RxNorm and natural language processing techniques developed on a training set of 24 599 entries from 9445 records. Accuracy, specificity, precision, recall, and F-measure were determined for the training dataset and repeated for the testing dataset (24 857 entries from 9430 records).

Results

Accuracy, precision, recall, and F-measure for medication allergy matches were all above 98% in the training dataset and above 97% in the testing dataset for all allergy entries. Corresponding values for food allergy matches were above 97% and above 93%, respectively. Specificities of the algorithm were 90.3% and 85.0% for drug matches and 100% and 88.9% for food matches in the training and testing datasets, respectively.

Discussion

The algorithm had high performance for identification of medication and food allergies. Maintenance is practical, as updates are managed through upload of new RxNorm versions and additions to companion database tables. However, direct entry of codified allergy information by providers (through autocompleters or drop lists) is still preferred to post-hoc encoding of the data. Data tables used in the algorithm are available for download.

Conclusions

A high performing, easily maintained algorithm can successfully identify medication and food allergies from free text entries in EHR systems.

18.

Objective

A common measure of Internet search engine effectiveness is its ability to find documents that a user perceives as ‘relevant’. This study sought to test whether user provided relevance ratings for documents retrieved by an Internet search engine correlate with the decision outcome after use of a search engine.

Design

227 university students were asked to answer four randomly assigned consumer health questions, then to conduct an Internet search on one of two randomly assigned search engines of different performance, and to again answer the question.

Measurements

Participants were asked to provide a relevance score for each document retrieved as well as a pre and post search answer to each question.

Results

User relevance rankings had little or no predictive power. Relevance rankings were unable to predict whether the user of a search engine could correctly answer a question after search and could not differentiate between two search engines with statistically different performance in the hands of users. Only when users had strong prior knowledge of the questions, and the decision task was of low complexity, did relevance appear to have modest predictive power.

Conclusions

User provided relevance rankings taken in isolation seem to be of limited to no value when designing a search engine that will be used in a general-purpose setting. Relevance rankings may have a place in situations in which experts provide rankings, and decision tasks are of complexity commensurate with the abilities of the raters. A more natural metric of search engine performance may be a user's ability to accurately complete a task, as this removes the inherent subjectivity of relevance rankings, and provides a direct and repeatable outcome measure which directly correlates with the performance of the search technology in the hands of users.

19.

Context

TimeText is a temporal reasoning system designed to represent, extract, and reason about temporal information in clinical text.

Objective

To measure the accuracy of TimeText in processing clinical discharge summaries.

Design

Six physicians with biomedical informatics training served as domain experts. Twenty discharge summaries were randomly selected for the evaluation. For each of the first 14 reports, 5 to 8 clinically important medical events were chosen. The temporal reasoning system generated temporal relations about the endpoints (start or finish) of pairs of medical events. Two experts (subjects) manually generated temporal relations for these medical events. The system and expert-generated results were assessed by four other experts (raters). All of the twenty discharge summaries were used to assess the system’s accuracy in answering time-oriented clinical questions. For each report, five to ten clinically plausible temporal questions about events were generated. Two experts generated answers to the questions to serve as the gold standard. We wrote queries to retrieve answers from the system’s output.

Measurements

Correctness of generated temporal relations, recall of clinically important relations, and accuracy in answering temporal questions.

Results

The raters determined that 97% of subjects’ 295 generated temporal relations were correct and that 96.5% of the system’s 995 generated temporal relations were correct. The system captured 79% of 307 temporal relations determined to be clinically important by the subjects and raters. The system answered 84% of the temporal questions correctly.

Conclusion

The system encoded the majority of information identified by experts, and was able to answer simple temporal questions.

20.