PhenDisco: phenotype discovery system for the database of genotypes and phenotypes
Authors: Son Doan, Ko-Wei Lin, Mike Conway, Lucila Ohno-Machado, Alex Hsieh, Stephanie Feudjio Feupe, Asher Garland, Mindy K Ross, Xiaoqian Jiang, Seena Farzaneh, Rebecca Walker, Neda Alipanah, Jing Zhang, Hua Xu, Hyeon-Eui Kim
Affiliation: 1. Division of Biomedical Informatics, University of California San Diego, La Jolla, California, USA; 2. School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
Abstract: The database of genotypes and phenotypes (dbGaP) developed by the National Center for Biotechnology Information (NCBI) is a resource that contains information on various genome-wide association studies (GWAS) and is currently available via NCBI's dbGaP Entrez interface. The database is an important resource, providing GWAS data that can be used for new exploratory research or cross-study validation by authorized users. However, finding studies relevant to a particular phenotype of interest is challenging, as phenotype information is presented in a non-standardized way. To address this issue, we developed PhenDisco (phenotype discoverer), a new information retrieval system for dbGaP. PhenDisco consists of two main components: (1) text processing tools that standardize phenotype variables and study metadata, and (2) information retrieval tools that support queries from users and return ranked results. In a preliminary comparison involving 18 search scenarios, PhenDisco showed promising performance for both unranked and ranked search comparisons with dbGaP's search engine Entrez. The system can be accessed at http://pfindr.net.
Keywords: dbGaP; Phenotype Standardization; Information Retrieval; Text Mining; GWAS; Natural Language Processing
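The two-component design described in the abstract (standardize phenotype text, then answer queries with ranked results) can be illustrated with a toy sketch. This is not PhenDisco's implementation: the synonym map, the study identifiers and descriptions, the function names `standardize` and `rank`, and the choice of a simple TF-IDF score are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical synonym map: raw phenotype wording -> one standardized term.
SYNONYMS = {
    "bmi": "body_mass_index",
    "body mass index": "body_mass_index",
    "sbp": "systolic_blood_pressure",
    "systolic bp": "systolic_blood_pressure",
}

# Hypothetical study descriptions (not real dbGaP records).
STUDIES = {
    "study_A": "Measured BMI and waist circumference in adults",
    "study_B": "Systolic BP and body mass index in a cohort; BMI recorded yearly",
    "study_C": "Asthma severity in children",
}


def standardize(text):
    """Lowercase, map known phrases to standard terms, then tokenize."""
    text = text.lower()
    # Replace longer phrases first so multi-word variants are matched whole.
    for phrase in sorted(SYNONYMS, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, SYNONYMS[phrase], text)
    return re.findall(r"[a-z_]+", text)


def rank(query, studies):
    """Return (study_id, score) pairs sorted by a TF-IDF score, best first."""
    docs = {sid: Counter(standardize(desc)) for sid, desc in studies.items()}
    n = len(docs)
    query_terms = set(standardize(query))
    scores = {}
    for sid, counts in docs.items():
        score = 0.0
        for term in query_terms:
            # Document frequency: how many studies mention this term.
            df = sum(1 for c in docs.values() if term in c)
            if df:
                score += counts[term] * math.log(1 + n / df)
        if score > 0:
            scores[sid] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because both the study text and the query pass through the same `standardize` step, a query for "BMI" and a query for "body mass index" retrieve the same studies, which is the kind of mismatch that non-standardized phenotype descriptions would otherwise cause.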