Similar literature
 20 similar documents found (search time: 15 ms)
1.
MEDSYNDIKATE is a natural language processor that automatically acquires medical information from findings reports. In the course of text analysis, their content is transferred to conceptual representation structures, which constitute a corresponding text knowledge base. MEDSYNDIKATE is particularly adapted to deal properly with text structures, such as various forms of anaphoric reference relations spanning several sentences. The strong demands MEDSYNDIKATE poses on the availability of expressive knowledge sources are accounted for by two alternative approaches to acquiring medical domain knowledge (semi)automatically. We also present data for the information extraction performance of MEDSYNDIKATE in terms of the semantic interpretation of three major syntactic patterns in medical documents.

2.
Information extraction for enhanced access to disease outbreak reports   (Total citations: 1; self-citations: 0; citations by others: 1)
Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.
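
The idea of turning outbreak reports into a sortable table through patterns and word classes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `DISEASES` word class, the single regular expression, the `Outbreak` row type, and the `extract_outbreaks` function are hypothetical and are not taken from Proteus-BIO, whose actual patterns are not given in the abstract.

```python
import re
from typing import List, NamedTuple


class Outbreak(NamedTuple):
    """One row of the (hypothetical) outbreak table, linked back to its document."""
    disease: str
    location: str
    cases: int
    doc_id: str


# Hypothetical word class: the disease names the pattern is allowed to match.
DISEASES = ("cholera", "measles", "ebola", "dengue")

# Hypothetical pattern: "<N> [new] cases of <disease> [were|was] reported in <Location>".
PATTERN = re.compile(
    r"(?P<cases>\d+) (?:new )?cases of (?P<disease>" + "|".join(DISEASES) + r") "
    r"(?:were |was )?reported in (?P<location>[A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*)"
)


def extract_outbreaks(doc_id: str, text: str) -> List[Outbreak]:
    """Return one table row per pattern match found in the document text."""
    return [
        Outbreak(m["disease"], m["location"], int(m["cases"]), doc_id)
        for m in PATTERN.finditer(text)
    ]


# Build a small table from two made-up documents, then sort it by case count.
table = extract_outbreaks(
    "doc1", "Officials said 120 cases of cholera were reported in Port Harcourt last week."
) + extract_outbreaks("doc2", "In total, 35 new cases of measles reported in Lyon.")
by_cases = sorted(table, key=lambda row: row.cases, reverse=True)
```

Because each row keeps its `doc_id`, selecting or sorting rows leads straight back to the source documents, which is the access path the abstract describes.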

3.
This paper describes an information extraction system that extracts and converts the available information in free-text Turkish radiology reports into a structured information model using manually created extraction rules and a domain ontology. The ontology provides flexibility in the design of extraction rules, and determines the information model for the extracted semantic information. Although our information extraction system mainly concentrates on abdominal radiology reports, the system can be used in another field of medicine by adapting its ontology and extraction rule set. We achieved very high precision and recall results during the evaluation of the developed system with unseen radiology reports.

4.
Objectives: Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithm. Our goal is to categorize PDF texts for strategic use by IE systems.
Methods: We used an open-source tool to extract raw texts from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, comparing it with a machine learning classifier, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions.
Results: The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p < 0.001) higher than the best performing machine learning classifier, which used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and METADATA (+14.2%). In addition, use of the algorithm to filter semi-structured texts and publication metadata improved performance of the outcome extraction system (F-measure +4.1%, p = 0.002). It also reduced the number of sentences to be processed by 44.9% (p < 0.001), which corresponds to a processing time reduction of 50% (p = 0.005).
Conclusions: The rule-based multi-pass sieve framework can be used effectively in categorizing texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents.
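
A multi-pass sieve applies an ordered series of rule sets, and each pass labels only the snippets that earlier passes left undecided. The Python sketch below shows that control flow under stated assumptions: the five category names come from the abstract, but every individual rule (the regular expressions, the 50% numeric-token threshold, the 25-word title limit) and the order of the passes are hypothetical stand-ins, not the rules used in the study.

```python
import re
from typing import Callable, List, Optional

# A pass takes (snippet index, snippet text) and returns a label or None (undecided).
Pass = Callable[[int, str], Optional[str]]


def metadata_pass(index: int, text: str) -> Optional[str]:
    # Hypothetical rule: publication residue such as DOIs, copyright marks, e-mail addresses.
    if re.search(r"(doi:|©|copyright|issn|@)", text, re.IGNORECASE):
        return "METADATA"
    return None


def semistructure_pass(index: int, text: str) -> Optional[str]:
    # Hypothetical rule: mostly numeric tokens suggest a table row or similar fragment.
    tokens = text.split()
    if not tokens:
        return None
    numeric = sum(1 for t in tokens if re.fullmatch(r"[\d.,%()<>=+-]+", t))
    return "SEMISTRUCTURE" if numeric / len(tokens) > 0.5 else None


def title_pass(index: int, text: str) -> Optional[str]:
    # Hypothetical rule: the first snippet, short, and not ending in a full stop.
    if index == 0 and len(text.split()) <= 25 and not text.rstrip().endswith("."):
        return "TITLE"
    return None


def abstract_pass(index: int, text: str) -> Optional[str]:
    # Hypothetical rule: snippet opens with a typical abstract heading.
    if re.match(r"\s*(abstract|objectives?|background)\b", text, re.IGNORECASE):
        return "ABSTRACT"
    return None


def default_pass(index: int, text: str) -> Optional[str]:
    # Final catch-all pass: anything still unlabeled is treated as body text.
    return "BODYTEXT"


SIEVE: List[Pass] = [metadata_pass, semistructure_pass, title_pass, abstract_pass, default_pass]


def classify(snippets: List[str]) -> List[Optional[str]]:
    """Run the passes in order; a pass may only label snippets that are still unlabeled."""
    labels: List[Optional[str]] = [None] * len(snippets)
    for sieve_pass in SIEVE:
        for i, text in enumerate(snippets):
            if labels[i] is None:
                labels[i] = sieve_pass(i, text)
    return labels
```

The ordering is the design choice that matters: earlier passes are trusted over later ones, so a decision made early is never overwritten, and a broad fallback can safely sit at the end.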

5.
Background: Anaphoric references occur ubiquitously in clinical narrative text. However, the problem, still very much an open challenge, typically receives less attention in clinical text domain applications. Furthermore, existing research on reference resolution is often conducted disjointly from real-world motivating tasks.
Objective: In this paper, we present our machine-learning system that automatically performs reference resolution and a rule-based system to extract tumor characteristics, with component-based and end-to-end evaluations. Specifically, our goal was to build an algorithm that takes in tumor templates and outputs tumor characteristics, e.g. tumor number and largest tumor sizes, necessary for identifying patient liver cancer stage phenotypes.
Results: Our reference resolution system reached a modest performance of 0.66 F1 for the averaged MUC, B-cubed, and CEAF scores for coreference resolution and 0.43 F1 for particularization relations. However, even this modest performance substantially increased automatic tumor characteristics annotation compared with no reference resolution.
Conclusion: Experiments revealed the benefit of reference resolution even for relatively simple tumor characteristics variables such as largest tumor size. However, we found that different overall variables had different tolerances to upstream reference resolution errors, highlighting the need to characterize systems by end-to-end evaluations.

6.
Natural language processing (NLP) is critical for improvement of the healthcare process because it can encode clinical data in patient documents. Many clinical applications such as decision support require coded data to function appropriately. However, in order to be applicable for healthcare, performance must be adequate. A valuable automated application is the detection of infectious diseases, such as surveillance of pneumonia in newborns (i.e., neonates), because the disease produces significant rates of morbidity and mortality, and manual surveillance is challenging. Studies have demonstrated that automated surveillance using NLP is a useful adjunct to manual surveillance and an effective tool for infection control practitioners. This paper presents a study evaluating the feasibility of an NLP-based monitoring system to screen for healthcare-associated pneumonia in neonates. We estimated sensitivity, specificity, and positive predictive value by comparing results with clinicians' judgments. Sensitivity was 71% and specificity was 99%. Our results demonstrated that the automated method was feasible.
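
For reference, sensitivity, specificity, and positive predictive value are all ratios over the four cells of a confusion matrix built by comparing system output with the clinicians' judgments. The Python sketch below shows the arithmetic; the function name `screening_metrics` and the counts in the usage lines are hypothetical and are not the study's data.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute screening metrics from confusion-matrix counts.

    tp: cases flagged by the system and confirmed by clinicians
    fp: cases flagged by the system but not confirmed
    tn: cases not flagged and not confirmed
    fn: confirmed cases the system missed
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that were detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly left unflagged
        "ppv": tp / (tp + fp),          # share of flagged cases that were true cases
    }


# Hypothetical counts, for illustration only.
metrics = screening_metrics(tp=40, fp=10, tn=940, fn=10)
```

When the condition is rare, specificity can be very high while positive predictive value stays moderate, which is one reason to report all three measures together.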

7.
8.
9.
Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. The absence of an automated system to identify and track radiology recommendations is an important barrier to ensuring timely follow-up of patients, especially those with non-acute incidental findings on imaging examinations. In this paper, we present a text processing pipeline to automatically identify clinically important recommendation sentences in radiology reports. Our extraction pipeline is based on natural language processing (NLP) and supervised text classification methods. To develop and test the pipeline, we created a corpus of 800 radiology reports double-annotated for recommendation sentences by a radiologist and an internist. We ran several experiments to measure the impact of different feature types and the data imbalance between positive and negative recommendation sentences. Our fully statistical approach achieved the best F-score, 0.758, in identifying the critical recommendation sentences in radiology reports.

10.
11.
More than 10 years have passed since the concept of picture archiving and communication systems (PACS) was first proposed. A great deal of effort has been expended to make PACS suitable for routine use in clinical settings, but only a few systems are currently used in this manner. A major reason is the lack of assurance of throughput equivalent to that of a conventional system based on order sheets and analog films. In this report, two techniques to increase throughput are introduced and studied. The first is the preloading of data elements from the various information systems and the PACS. The second is the use of priority information to rank-order the examinations placed on the list for interpretation. We have applied these techniques to an actual system and have measured the distribution of time for processing examinations. These two techniques appear to make PACS useful in routine practice, because most of the urgent cases were interpreted within the target time of 40 minutes.
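
Rank-ordering an interpretation list by priority can be sketched with a priority queue. The Python example below is only an illustration of the idea: the `Worklist` class, the numeric priority levels (lower means more urgent), and the examination identifiers are all hypothetical, and the abstract does not say how the described system implements its ordering.

```python
import heapq
from itertools import count
from typing import List, Tuple


class Worklist:
    """Interpretation list ordered by priority, then by arrival order within a priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = count()  # tie-breaker so equal priorities stay first-come, first-served

    def add(self, exam_id: str, priority: int) -> None:
        # Assumption: a lower number means a more urgent examination.
        heapq.heappush(self._heap, (priority, next(self._arrival), exam_id))

    def next_exam(self) -> str:
        """Remove and return the most urgent examination still waiting."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


# Hypothetical usage: 0 = urgent, 1 = expedited, 2 = routine.
worklist = Worklist()
worklist.add("CT-001", 2)
worklist.add("XR-002", 0)
worklist.add("MR-003", 2)
worklist.add("CT-004", 1)
reading_order = [worklist.next_exam() for _ in range(len(worklist))]
```

The arrival counter keeps routine examinations in the order they were placed on the list, so a steady stream of urgent cases delays them but does not reshuffle them.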

12.
13.
Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.
Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout: ‘semi-structured’ and ‘unstructured’. The second technique was developed using the open source language engineering framework GATE and aimed at the prediction of chunks of the report text containing information pertaining to the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained and tested on sets of 635 and 163 manually classified or annotated reports, respectively, from the Northern Ireland Cancer Registry.
Results: The best result of 99.4% accuracy, which included only one semi-structured report predicted as unstructured, was produced by the layout classifier with the k-nearest-neighbour algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports the performance ranged from 0.97 to 0.94 in precision and from 0.92 to 0.83 in recall, while for unstructured reports performance ranged from 0.91 to 0.64 in precision and from 0.68 to 0.41 in recall. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured ones.
Conclusions: These results show that it is possible and beneficial to predict the layout of reports, and that the accuracy of predicting which segments of a report may contain certain information is sensitive to the report layout and the type of information sought.

14.
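Background: With the development of medical imaging technology, the early information object definitions of the Digital Imaging and Communications in Medicine (DICOM) standard cannot bring out the sequence characteristics of new image types.
Objective: To analyze the new techniques, structures, and mechanisms of enhanced information objects, and to find a method for browsing enhanced information objects normally and obtaining the related information.
Methods: The structure of enhanced information objects was analyzed according to the DICOM standard, and image browsing and retrieval of detailed image-object information were implemented using object-oriented programming.
Results and conclusion: The program can browse enhanced information objects conveniently and smoothly and obtain their tag information. It can be used to improve and optimize existing medical image processing software, and the information obtained provides strong technical support for applications such as three-dimensional reconstruction and hanging protocols.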

15.
A bar-code terminal network under software control of a microcomputer was added to the minicomputer-based radiology information system at the Medical College of Georgia in Augusta. The bar-code network was specifically installed to address the inherent inaccuracies occurring when procedure information was entered at the time of registration before procedures were actually performed. Technologists now enter procedure data into bar-code terminals after procedures are performed, substantially reducing database errors. This approach allowed us to take advantage of a microcomputer product without the necessity of completely converting our highly customized information system software from mini- to microcomputer.

16.
Installation of a radiology information management system (RIS) is usually justified on the basis of improved departmental efficiency and improved charge capture. However, evaluation of the success of these expected improvements is often difficult. The installation and operation of such a system in a medium-sized tertiary care hospital has permitted the effects of the RIS on the operation of the department to be studied and the improvements in charge capture provided by the system to be quantitatively assessed. As a result of a side-by-side comparison with a conventional check-sheet manual billing system, it is apparent that the RIS reduces the errors inherent in manual systems. Subjectively, it is also apparent that personnel prefer the computerized system to the manual charge sheets.

17.
An information system has been established at the National Institute for Biological Standards & Control for the exchange of knowledge in AIDS research, particularly in relation to vaccine design. This system, the AIDS information exchange link (Ariel), is designed to act as a central store of relevant information for scientists in the UK and abroad and was set up under the auspices of the Medical Research Council. It holds information on research materials (reagents), on genetic sequences and on projects. Several computers and database systems are involved. Access is obtained through Janet, the UK academic network, or PSS, the British Telecom public network. Both these networks are linked to international network systems, e.g. Internet, Earn, IPSS. Ariel has been in operation for 18 months and is accessed internationally.

18.
19.

Background: This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined.

20.
Unknown to most radiology professionals, the Veterans Administration (VA) is implementing an automated radiology information system as an integrated component of its Decentralized Hospital Computer Program. The basic design has been evaluated and refined over the past 5 years. It is now becoming available in all 172 VA medical facilities. Radiology services are provided in a complex management and fiscal environment. The primary purpose of the information system is to improve the efficient processing, performance, and reporting of requests for radiologic consultations and procedures. The automatic capturing of demographic and medical statistics will provide local and national managers more complete data with which to plan future financial, equipment, and personnel requirements. The VA radiology module has the potential to influence the shape of all future systems, commercial and public. This report describes the development of this radiology information system, its current status, and its potential impact on the largest health care system in the country. The module serves as an example of what can or should be expected from the radiology portion of a comprehensive medical information management system.
