Electronic medical records-based selection of gastric cancer treatment plans
Cite this article: XIA Dong, LI Guo-lei, CHEN Xian-lai. Electronic medical records-based selection of gastric cancer treatment plans[J]. Chinese Journal of Medical Library and Information Science, 2018, 27(2): 63-68
Authors: XIA Dong, LI Guo-lei, CHEN Xian-lai
Affiliations:
XIA Dong: Chengdu Literature and Information Center, Chinese Academy of Sciences, Chengdu 610000, Sichuan Province, China; Institute of Information Security and Big Data, Central South University, Changsha 410083, Hunan Province, China
LI Guo-lei: Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China; Institute of Information Security and Big Data, Central South University, Changsha 410083, Hunan Province, China
CHEN Xian-lai: Institute of Information Security and Big Data, Central South University, Changsha 410083, Hunan Province, China; Key Laboratory of Medical Information Research, Hunan General Colleges and Universities, Changsha 410013, Hunan Province, China
Funding: A research output of the National Social Science Fund of China project "Latent semantic analysis of electronic medical records for clinical decision-making and its application" (13BTQ052)
Received: 2017-12-13

Abstract: Objective: To explore effective text mining methods by mining the information in electronic medical record (EMR) text, with the aim of realizing the decision-support value of EMRs. Methods: A total of 2,500 EMRs of gastric cancer patients were randomly divided into a training group (n=1500) and a testing group (n=1000). The text of the training-group records was segmented into words using a dictionary combined with statistical methods. The segmented words were clustered according to the co-occurrence frequency of each word with the treatment plans extracted from the records, and the number of words in each training-group record that matched each cluster was counted. A Bayes discriminant function was then built from these per-cluster match counts and the treatment plans to serve as the decision-support model. The model was validated on the testing-group records, and both the word segmentation method and the discriminant model were evaluated. Results: In 50 randomly selected EMRs, word segmentation achieved a recall of 74.24%, a precision of 82.30%, and an F-1 value of 78.06%. With the segmented words clustered into five categories, the discriminant model classified the testing-group records with an accuracy of 62%. Conclusion: Word segmentation that combines a dictionary with statistical methods performs well on EMR text. Cluster-based text mining of EMRs can realize their decision-support value, but the accuracy of the resulting decision-support model is not high, and further study is needed on the segmentation of EMR text and on the processing of segmented words during modeling.

Keywords: Word segmentation; Cluster analysis; Bayes discrimination; Electronic medical records (EMR); Clinical decision support; Gastric cancer
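
The modeling steps summarized in the abstract (count how many words of a record fall into each word cluster, then assign a treatment plan with a Bayes rule) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the three toy clusters, the example words, the two treatment labels, and the use of a diagonal Gaussian Bayes classifier as a stand-in for the Bayes discriminant function, whose exact form the abstract does not give. The final helper only checks that the reported F-1 value follows from the reported precision and recall.

```python
import math
from collections import defaultdict

# Hypothetical word clusters. The paper clusters segmented words by their
# co-occurrence with treatment plans; these toy clusters are invented.
CLUSTERS = [
    {"nausea", "vomiting"},
    {"gastrectomy", "resection"},
    {"chemotherapy", "oxaliplatin"},
]

def match_counts(words, clusters=CLUSTERS):
    """Count how many words of one record fall into each cluster."""
    return [sum(1 for w in words if w in c) for c in clusters]

def fit(features, labels):
    """Estimate per-class priors, means, and variances (diagonal Gaussian).

    This is a stand-in classifier, not the paper's discriminant function.
    """
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # The small constant keeps the variance positive for constant features.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-3
                     for c, m in zip(cols, means)]
        model[y] = (len(rows) / len(features), means, variances)
    return model

def predict(model, x):
    """Return the class with the largest log-posterior score."""
    def score(params):
        prior, means, variances = params
        s = math.log(prior)
        for v, m, var in zip(x, means, variances):
            s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return s
    return max(model, key=lambda y: score(model[y]))

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Toy training records (already segmented) with invented treatment labels.
train = [
    (["gastrectomy", "resection", "nausea"], "surgery"),
    (["resection", "gastrectomy", "gastrectomy"], "surgery"),
    (["chemotherapy", "oxaliplatin", "vomiting"], "chemotherapy"),
    (["oxaliplatin", "chemotherapy", "chemotherapy"], "chemotherapy"),
]
X = [match_counts(words) for words, _ in train]
y = [label for _, label in train]
model = fit(X, y)

new_record = ["resection", "nausea", "gastrectomy"]
print(predict(model, match_counts(new_record)))

# Reported segmentation precision 82.30% and recall 74.24%.
print(round(f1(0.8230, 0.7424) * 100, 2))  # 78.06, matching the abstract
```

In this sketch each record is reduced to one count per cluster, so with five clusters (as in the reported result) the classifier would see a five-dimensional feature vector per record.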