

Design of association studies with pooled or un‐pooled next‐generation sequencing data
Authors: Su Yeon Kim, Yingrui Li, Yiran Guo, Ruiqiang Li, Johan Holmkvist, Torben Hansen, Oluf Pedersen, Jun Wang, Rasmus Nielsen
Institution: 1. Departments of Integrative Biology and Statistics, UC Berkeley, Berkeley, California; 2. Beijing Genomics Institute, Shenzhen, China; 3. Hagedorn Research Institute, Gentofte, Denmark; 4. Faculty of Health Science, University of Southern Denmark, Odense, Denmark; 5. Faculty of Health Science, University of Aarhus, Aarhus, Denmark; 6. Institute of Biomedical Science, University of Copenhagen, Denmark; 7. Department of Biology, University of Copenhagen, Copenhagen, Denmark
Abstract: Most common hereditary diseases in humans are complex and multifactorial. Large‐scale genome‐wide association studies based on SNP genotyping have identified only a small fraction of the heritable variation of these diseases. One explanation may be that many rare variants (minor allele frequency, MAF < 5%), which are not included in the common genotyping platforms, contribute substantially to the genetic variation of these diseases. Next‐generation sequencing, which allows the analysis of rare variants, is now becoming so cheap that it provides a viable alternative to SNP genotyping. In this paper, we present cost‐effective protocols for using next‐generation sequencing in association mapping studies based on pooled and un‐pooled samples, and identify optimal designs with respect to the total number of individuals, the number of individuals per pool, and the sequencing coverage. We perform a small empirical study to evaluate the pooling variance in a realistic setting where pooling is combined with exon‐capturing. To test for associations, we develop a likelihood ratio statistic that accounts for the high error rate of next‐generation sequencing data. We also perform extensive simulations to determine the power and accuracy of this method. Overall, our findings suggest that, at a fixed cost, sequencing many individuals at a shallower depth with a larger pool size achieves higher power than sequencing a small number of individuals at a higher depth with a smaller pool size, even in the presence of high error rates. Our results provide guidelines for researchers who are developing association mapping studies based on next‐generation sequencing. Genet. Epidemiol. 34: 479–491, 2010. © 2010 Wiley‐Liss, Inc.
Keywords: pooled samples, association mapping, rare allele, optimal design, next‐generation sequencing
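
The abstract mentions a likelihood ratio statistic that accounts for sequencing error, without giving its form. The sketch below is not the authors' statistic; it is a deliberately simplified illustration of the general idea under stated assumptions: one case pool and one control pool, reads treated as independent binomial draws, a known symmetric per-read error rate `err`, and no pooling variance. Under these assumptions a read shows the minor allele with probability q = p(1 - err) + (1 - p)err, so q is restricted to [err, 1 - err]. All function and parameter names are hypothetical.

```python
import math


def binom_loglik(k, n, q):
    """Binomial log-likelihood of k minor-allele reads out of n (constant term dropped)."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(q)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - q)
    return ll


def observed_freq_mle(k, n, err):
    """Constrained MLE of q = p(1 - err) + (1 - p)err, which must lie in [err, 1 - err]."""
    return min(max(k / n, err), 1.0 - err)


def pooled_lrt(k_case, n_case, k_ctrl, n_ctrl, err=0.01):
    """Likelihood ratio test of equal allele frequency in case and control pools.

    Assumptions (illustrative only): known symmetric error rate 0 <= err < 0.5,
    independent reads, no pooling variance. Returns (statistic, p_value), where
    the p-value uses a chi-square approximation with 1 degree of freedom; that
    approximation is rough when an estimate sits on the [err, 1 - err] boundary.
    """
    if not 0.0 <= err < 0.5:
        raise ValueError("err must be in [0, 0.5)")
    # Alternative hypothesis: separate frequencies for cases and controls.
    q_case = observed_freq_mle(k_case, n_case, err)
    q_ctrl = observed_freq_mle(k_ctrl, n_ctrl, err)
    # Null hypothesis: one shared frequency for both pools.
    q_null = observed_freq_mle(k_case + k_ctrl, n_case + n_ctrl, err)
    ll_alt = binom_loglik(k_case, n_case, q_case) + binom_loglik(k_ctrl, n_ctrl, q_ctrl)
    ll_null = binom_loglik(k_case, n_case, q_null) + binom_loglik(k_ctrl, n_ctrl, q_null)
    stat = max(0.0, 2.0 * (ll_alt - ll_null))  # guard against rounding below zero
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    stat, p = pooled_lrt(60, 1000, 20, 1000, err=0.01)
    print(f"LRT statistic = {stat:.2f}, p-value = {p:.3g}")
```

In this toy model, minor-allele read fractions that fall below the error rate in both pools are clipped to `err`, so differences that sequencing error alone could explain produce a statistic of zero. A realistic analysis would also need to model pooling variance and per-base quality, which this sketch ignores.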