17 similar articles found.
1.
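Systematically describes the workflow of biomedical text mining and the applications of text mining techniques in the biomedical domain, with a focus on natural language processing and ontologies, named entity recognition, relation extraction, text classification and clustering, co-occurrence analysis, system tools and their evaluation, and visualization.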
2.
3.
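Cui Lei 《中华医学图书情报杂志》2017,26(3):1-5
Describes the steps of text mining research in the biomedical domain and the methods used at each step, focusing on the tools and case studies for each step, with the aim of promoting biomedical text mining research.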
4.
5.
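Objective: To automatically extract open concept relations from Chinese biomedical text using deep learning, in order to improve biomedical text understanding and the construction of medical knowledge networks. Methods: A BiLSTM-CRF model was used to extract open concept relations, expressed as phrases in the sentence context, from Chinese biomedical literature data, and was compared with methods based on conditional random fields (CRF) and on long short-term memory (LSTM) networks. Results: The BiLSTM-CRF-based method for Chinese biomedical open concept relation extraction achieved an F1 score of 0.5221, significantly higher than the CRF-based method (F1 = 0.2353) and the LSTM-based method (F1 = 0.3355). Conclusion: Compared with using a CRF or LSTM model alone, the BiLSTM-CRF-based open concept relation extraction method is more robust and generalizes better, and offers a useful reference for research on biomedical text understanding and medical knowledge network construction.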
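The F1 scores reported in this abstract compare predicted relation phrases against gold annotations. The following is a minimal sketch of how span-level precision, recall, and F1 could be computed from tag sequences. The `B-REL`/`I-REL`/`O` tag scheme and the function names are assumptions made for illustration; the abstract does not state the tagging scheme or the exact matching criterion the authors used.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, int]  # (sentence index, start, end-exclusive)


def bio_to_spans(tags: List[str], sent_id: int = 0) -> Set[Span]:
    """Convert one BIO tag sequence into a set of relation-phrase spans."""
    spans: Set[Span] = set()
    start = None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        if tag in ("B-REL", "O"):
            if start is not None:
                spans.add((sent_id, start, i))
                start = None
            if tag == "B-REL":
                start = i
        elif tag == "I-REL" and start is None:
            # Tolerate an I- tag that has no preceding B- tag.
            start = i
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Exact-match span precision, recall, and F1 over a corpus."""
    gold_spans: Set[Span] = set()
    pred_spans: Set[Span] = set()
    for k, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= bio_to_spans(g, k)
        pred_spans |= bio_to_spans(p, k)
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Under exact matching, a predicted phrase that overlaps a gold phrase but misses its boundary counts as both a false positive and a false negative, which is one reason span-level F1 for open relation phrases tends to be low.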
6.
7.
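Gives an overview of the research background of biomedical text and introduces two biomedical text mining tools, COREMINE medical and Chilibot. It then uses both tools to explore the interactions between leukemia and genes, and arrives at specific conclusions about those interactions.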
8.
9.
10.
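Introduces research results on three categories of ontology network integration tools in the biomedical domain, in China and abroad: biomedical ontology network integration platforms, disease-drug ontology knowledge discovery tools, and gene-protein ontology integrated analysis tools. It analyzes their characteristics and shortcomings and summarizes the issues to watch for when developing ontology integration tools, in the hope of providing a reference for researchers in related fields.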
11.
Tiantian Zhu, Yang Qin, Yang Xiang, Baotian Hu, Qingcai Chen, Weihua Peng 《J Am Med Inform Assoc》2021,28(12):2571
Objective: Various methods have been proposed to deal with erroneous training data in distantly supervised relation extraction (RE); however, their performance is still far from satisfactory. We aimed to address the insufficient modeling of instance-label correlations when predicting biomedical relations, using deep learning and reinforcement learning. Materials and Methods: In this study, a new computational model called piecewise attentive convolutional neural network and reinforcement learning (PACNN+RL) was proposed to perform RE on distantly supervised data generated from Unified Medical Language System with MEDLINE abstracts and benchmark datasets. In PACNN+RL, PACNN was introduced to encode semantic information of biomedical text, and the RL method with a memory backtracking mechanism was leveraged to alleviate the erroneous data issue. Extensive experiments were conducted on 4 biomedical RE tasks. Results: The proposed PACNN+RL model achieved competitive performance on 8 biomedical corpora, outperforming most baseline systems. Specifically, PACNN+RL outperformed all baseline methods in F1-score on the may-prevent dataset, the may-treat dataset, and the DDI corpus, 2011. For the protein-protein interaction RE task, we obtained new state-of-the-art performance on 4 out of 5 benchmark datasets. Conclusions: The performance on many distantly supervised biomedical RE tasks was substantially improved, primarily owing to the denoising effect of the proposed model. It is anticipated that PACNN+RL will become a useful tool for large-scale RE and other downstream tasks to facilitate biomedical knowledge acquisition. We also made the demonstration program and source code publicly available at http://112.74.48.115:9000/.
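Distant supervision, as referred to in this abstract, usually means labeling every sentence that mentions both entities of a known knowledge-base pair with that pair's relation, which is also where the erroneous training labels come from. The sketch below illustrates only that labeling step. The toy knowledge base, the sentences, and the function name `distant_label` are hypothetical; this is not the authors' PACNN+RL pipeline and does not use UMLS or MEDLINE data.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]
Instance = Tuple[str, str, str, str]  # (sentence, head, tail, relation)


def distant_label(sentences: List[str], kb: Dict[Pair, str]) -> List[Instance]:
    """Label a sentence with a KB relation whenever both entities occur in it."""
    labeled: List[Instance] = []
    for sent in sentences:
        lowered = sent.lower()
        for (head, tail), relation in kb.items():
            # Naive substring matching; real systems use entity linking.
            if head.lower() in lowered and tail.lower() in lowered:
                labeled.append((sent, head, tail, relation))
    return labeled


# Hypothetical toy data for illustration only.
kb = {("aspirin", "stroke"): "may_prevent"}
sentences = [
    "Low-dose aspirin reduced the risk of recurrent stroke.",
    "Patients with stroke were excluded if they were allergic to aspirin.",
    "Ibuprofen was given for pain.",
]

instances = distant_label(sentences, kb)
```

Both of the first two sentences receive the `may_prevent` label, although the second does not express that relation. Noisy instances of this kind are what denoising approaches such as the one in the abstract aim to handle.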
12.
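With the development of information technology, the means of collecting, storing, and managing data have steadily improved, and the discipline of data mining has emerged in response. The article explains the concept of data mining, gives application examples of various data mining methods in biomedical research, analyzes the relationship between data mining and statistics in the biomedical domain, and reviews the current state of biomedical data mining applications in China, the problems to be solved, and directions for future research.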
13.
14.
Aleksandar Kovačević, Azad Dehghan, Michele Filannino, John A Keane, Goran Nenadic 《J Am Med Inform Assoc》2013,20(5):859-866
Objective
Identification of clinical events (eg, problems, tests, treatments) and associated temporal expressions (eg, dates and times) are key tasks in extracting and managing data from electronic health records. As part of the i2b2 2012 Natural Language Processing for Clinical Data challenge, we developed and evaluated a system to automatically extract temporal expressions and events from clinical narratives. The extracted temporal expressions were additionally normalized by assigning type, value, and modifier.
Materials and methods
The system combines rule-based and machine learning approaches that rely on morphological, lexical, syntactic, semantic, and domain-specific features. Rule-based components were designed to handle the recognition and normalization of temporal expressions, while conditional random fields models were trained for event and temporal recognition.
Results
The system achieved micro F scores of 90% for the extraction of temporal expressions and 87% for clinical event extraction. The normalization component for temporal expressions achieved accuracies of 84.73% (expression's type), 70.44% (value), and 82.75% (modifier).
Discussion
Compared to the initial agreement between human annotators (87–89%), the system provided comparable performance for both event and temporal expression mining. While (lenient) identification of such mentions is achievable, finding the exact boundaries proved challenging.
Conclusions
The system provides a state-of-the-art method that can be used to support automated identification of mentions of clinical events and temporal expressions in narratives either to support the manual review process or as a part of a large-scale processing of electronic health databases.
15.
16.
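Objective: To build a platform for the automatic recognition and annotation of Chinese biomedical entities and relations, providing a reference for Chinese biomedical corpus annotation, precision medicine corpus accumulation, and knowledge services. Methods: Automatic entity recognition in Chinese biomedical text was implemented using dictionaries and the CRF algorithm, and a Chinese biomedical entity auto-annotation platform was built with programming languages such as Python, JavaScript, and CSS and related tools such as the Query framework. Results: A Chinese auto-annotation platform was built that recognizes Chinese entities automatically and supports uploading, annotating, reviewing, and finally storing text. The platform recognizes text content efficiently and accurately and performs automatic annotation. Conclusion: The platform supports manual import of literature, annotation, and administrator review and settlement, and can support researchers in the biomedical domain in data mining and in building Chinese corpora.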
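The dictionary-based part of the entity recognition described in this abstract can be illustrated with forward maximum matching, a common lexicon lookup strategy for unsegmented Chinese text. This is a minimal sketch under assumptions: the abstract does not say which matching strategy the platform uses, and the lexicon, the entity types, and the function name `dictionary_ner` are hypothetical. The CRF component is not shown.

```python
from typing import Dict, List, Optional, Tuple

Entity = Tuple[int, int, str, str]  # (start, end-exclusive, surface form, type)


def dictionary_ner(text: str, lexicon: Dict[str, str],
                   max_len: Optional[int] = None) -> List[Entity]:
    """Forward maximum matching: at each position, take the longest lexicon hit."""
    if max_len is None:
        max_len = max((len(term) for term in lexicon), default=0)
    entities: List[Entity] = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one character at a time.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in lexicon:
                match = (i, i + length, candidate, lexicon[candidate])
                break
        if match:
            entities.append(match)
            i = match[1]  # continue after the matched entity
        else:
            i += 1
    return entities


# Hypothetical toy lexicon: diabetes, type 2 diabetes, metformin.
lexicon = {"糖尿病": "Disease", "2型糖尿病": "Disease", "二甲双胍": "Drug"}
# "Metformin is used to treat type 2 diabetes."
text = "二甲双胍用于治疗2型糖尿病"

entities = dictionary_ner(text, lexicon)
```

Preferring the longest match means "2型糖尿病" is tagged as a single entity rather than as the shorter "糖尿病" nested inside it. A statistical model such as a CRF can then cover mentions that the dictionary misses.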
17.
Gregory F Cooper, Ivet Bahar, Michael J Becich, Panayiotis V Benos, Jeremy Berg, Jeremy U Espino, Clark Glymour, Rebecca Crowley Jacobson, Michelle Kienholz, Adrian V Lee, Xinghua Lu, Richard Scheines, and the Center for Causal Discovery team 《J Am Med Inform Assoc》2015,22(6):1132-1136
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers.