首页 | 本学科首页   官方微博 | 高级检索  
检索        


Reuse of termino-ontological resources and text corpora for building a multilingual domain ontology: An application to Alzheimer’s disease
Institution:Univ. Bordeaux, ISPED, Centre INSERM U897-Epidemiologie-Biostatistique, F-33000 Bordeaux, France
Abstract:Ontologies are useful tools for sharing and exchanging knowledge. However ontology construction is complex and often time consuming. In this paper, we present a method for building a bilingual domain ontology from textual and termino-ontological resources intended for semantic annotation and information retrieval of textual documents. This method combines two approaches: ontology learning from texts and the reuse of existing terminological resources. It consists of four steps: (i) term extraction from domain specific corpora (in French and English) using textual analysis tools, (ii) clustering of terms into concepts organized according to the UMLS Metathesaurus, (iii) ontology enrichment through the alignment of French and English terms using parallel corpora and the integration of new concepts, (iv) refinement and validation of results by domain experts. These validated results are formalized into a domain ontology dedicated to Alzheimer’s disease and related syndromes which is available online (http://lesim.isped.u-bordeaux2.fr/SemBiP/ressources/ontoAD.owl). The latter currently includes 5765 concepts linked by 7499 taxonomic relationships and 10,889 non-taxonomic relationships. Among these results, 439 concepts absent from the UMLS were created and 608 new synonymous French terms were added. The proposed method is sufficiently flexible to be applied to other domains.
Keywords:Ontology development  Alzheimer’s disease  Ontological resource reuse  Term alignment  Parallel corpus
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号