Quantifying the amount of missing information in genetic association studies
Authors: Dan L. Nicolae
Affiliation:Departments of Medicine and Statistics, The University of Chicago, Chicago, Illinois 60637, USA. nicolae@galton.uchicago.edu
Abstract:
Many genetic analyses are done with incomplete information; for example, phase is unknown in haplotype-based association studies. Measures of the amount of available information can be used for efficient planning of studies and/or analyses. In particular, the linkage disequilibrium (LD) between two sets of markers can be interpreted as the amount of information one set of markers contains for testing allele frequency differences in the second set, and measuring LD can be viewed as quantifying information in a missing data problem. We introduce a framework for measuring the association between two sets of variables; for example, genotype data for two distinct groups of markers, or haplotype and genotype data for a given set of polymorphisms. The goal is to quantify how much information one data set (e.g. genotype data for a set of SNPs) contains for estimating parameters that are functions of frequencies in the second data set (e.g. haplotype frequencies), relative to the ideal case of actually observing the complete data (e.g. haplotypes). In the case of genotype data on two mutually exclusive sets of markers, the measure determines the amount of multi-locus LD, and is equal to the classical measure r^2 if each set consists of a single bi-allelic marker. In general, the measures are interpreted as the asymptotic ratio of sample sizes necessary to achieve the same power in case-control testing. The focus of this paper is on case-control allele/haplotype tests, but the framework can be extended easily to other settings such as regressing quantitative traits on allele/haplotype counts, or tests on genotypes or diplotypes. We highlight applications of the approach, including tools for navigating the HapMap database [The International HapMap Consortium, 2003], and genotyping strategies for positional cloning studies.
Keywords: information content; multi‐locus linkage disequilibrium; asymptotic relative efficiency; association testing; case‐control design
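
The abstract notes that, for two single bi-allelic markers, the proposed measure reduces to the classical r^2, and that the measures can be read as an asymptotic ratio of sample sizes. A minimal Python sketch of that special case only is given below; it does not reproduce the paper's general multi-locus measure. The function names (`r_squared`, `required_sample_size`) and the example frequencies are hypothetical, chosen for illustration.

```python
import math


def r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Classical r^2 between two bi-allelic markers.

    p_ab : frequency of the haplotype carrying allele A at marker 1
           and allele B at marker 2
    p_a  : frequency of allele A at marker 1
    p_b  : frequency of allele B at marker 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic (0 < p < 1)")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def required_sample_size(n_direct: int, r2: float) -> int:
    """Sample size needed when testing through a proxy marker.

    Assumes the asymptotic sample-size-ratio reading of r^2:
    n_direct subjects typed at the target marker carry roughly the
    same information as n_direct / r2 subjects typed at the proxy.
    """
    if not (0.0 < r2 <= 1.0):
        raise ValueError("r2 must lie in (0, 1]")
    return math.ceil(n_direct / r2)


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    r2 = r_squared(p_ab=0.4, p_a=0.5, p_b=0.5)
    print(f"r^2 = {r2:.2f}")
    print("proxy sample size:", required_sample_size(1000, r2))
```

With these illustrative frequencies, D = 0.4 - 0.25 = 0.15 and r^2 = 0.15^2 / 0.0625 = 0.36, so under the sample-size-ratio reading a study needing 1000 subjects at the target marker would need about 1000 / 0.36, or 2778, subjects at the proxy marker.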
This article is indexed in PubMed and other databases.