Similar literature
1.
Association studies depend on linkage disequilibrium (LD) between a causative mutation and linked marker loci. Selecting markers that give the best chance of showing useful levels of LD with the causative mutation will increase the chances of successfully detecting an association. This report examines the variation in the extent of LD between a disease locus and one or two diallelic marker loci (termed single nucleotide polymorphisms or SNPs). We use a simulation method based on the neutral coalescent in a population of variable size to find the distribution of LD as a function of allele frequencies, the recombination rate, and the population history. Given that LD exists, the allele frequencies determine if a site will be useful for detecting an association with the disease mutation. We show that there is extensive variation in LD even for closely linked loci, implying that several markers may be needed to detect a disease locus. The distribution of LD between common variants is strongly influenced by ancestral population size. We show that in general, best results will be obtained if the frequencies of marker alleles are at least as large as the frequency of the causative mutation. Haplotypes of two or more SNPs generally have a higher probability than individual SNPs of showing useful LD with a disease mutation, although exceptions are described.
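
As a rough illustration of how allele frequencies limit the LD a marker can show with a disease allele, the sketch below computes the standard pairwise measures D, |D'| and r² from haplotype and allele frequencies. The function name `ld_measures` and the example frequencies are hypothetical and are not taken from the paper, which uses coalescent simulation rather than this direct calculation.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two diallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A
    p_b  : frequency of allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    # Raw disequilibrium coefficient.
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max
    # Squared correlation between the two loci.
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2


# Hypothetical case: a disease allele at frequency 0.1 that always sits on a
# marker allele of frequency 0.3.
d, d_prime, r2 = ld_measures(p_ab=0.1, p_a=0.1, p_b=0.3)
print(f"D={d:.3f}  |D'|={d_prime:.3f}  r2={r2:.3f}")
```

In this made-up case |D'| is at its maximum while r² stays well below 1, because the two allele frequencies differ; when the frequencies match and the alleles always co-occur, r² reaches 1.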

2.
Wu S, Yang J, Wu R. Statistics in Medicine, 2006, 25(22): 3826-3849
The time-dependent change of HIV particle load, i.e. HIV dynamics, is likely to be controlled by a multitude of quantitative trait loci (QTL) that interact with each other as well as with various developmental and environmental factors in a coordinated manner. In this article, we have derived a new statistical model for mapping the epistatic QTL responsible for HIV dynamics in a natural human population. This model, constructed on the integrated theme of functional mapping and linkage disequilibrium (LD) mapping, can make use of information from multiple markers genotyped from the human genome. It allows for the test and estimation of genetic actions and interactions involved in the control of HIV progression and provides a general platform to identify the detailed genetic architecture of resistance or susceptibility of humans to HIV on a dynamic scale. We have generalized this model to accommodate various complicated clinical designs for AIDS studies. Simulation studies with different scenarios are performed to examine the statistical behaviour of the model. The genetic and statistical extensions of this mapping model to HIV/AIDS genomic research are discussed.

3.
Case-control designs are commonly adopted in genetic epidemiological studies because they are cost effective and offer powerful tests for genetic and environmental risk factors, as well as their interactions. Previously, we proposed an association mapping approach to estimate the position of an unobserved disease locus as well as measuring its genetic effect on risk. The method provides a confidence interval for the estimated map position to help narrow the chromosomal region potentially harboring a disease locus. However, concerns often arise about case-control designs, including possible false positives or bias due to confounders, heterogeneity or interactions among genes and between genes and environments. In the present work, we extended the multipoint linkage disequilibrium mapping approach for case-control studies to incorporate information about factors influencing the effect of causal genes to improve precision and efficiency of the estimated location. The efficiency, bias and coverage probability of this extended approach for locating a disease locus using case-control data with and without additional information on a covariate were compared through simulation. An example of a case-control study for type 2 diabetes was used to illustrate this extended method. In this study, a strong association between diabetes and a candidate gene, SLC2A10, was detected among nonobese subjects, whereas no evidence of association was found for either obese subjects or the whole sample when obesity was ignored. Simulation studies and these diabetes data both demonstrate how the efficiency of the estimated location of a disease gene can be improved substantially by incorporating information on covariates.

4.
Jung J, Zhong M, Liu L, Fan R. Genetic Epidemiology, 2008, 32(5): 396-412
In this paper, bivariate/multivariate variance component models are proposed for high-resolution combined linkage and association mapping of quantitative trait loci (QTL), based on combinations of pedigree and population data. Suppose that a quantitative trait locus is located in a chromosome region that exerts pleiotropic effects on multiple quantitative traits. In the region, multiple markers such as single nucleotide polymorphisms are typed. Two regression models, "genotype effect model" and "additive effect model", are proposed to model the association between the markers and the trait locus. The linkage information, i.e., recombination fractions between the QTL and the markers, is modeled in the variance and covariance matrix. By analytical formulae, we show that the "genotype effect model" can be used to model the additive and dominance effects simultaneously; the "additive effect model" accounts only for the additive effect. Based on the two models, F-test statistics are proposed to test association between the QTL and markers. By analytical power analysis, we show that bivariate models can be more powerful than univariate models. For moderate-sized samples, the proposed models lead to correct type I error rates, and so the models are reasonably robust. As a practical example, the method is applied to analyze the genetic inheritance of rheumatoid arthritis for the data of The North American Rheumatoid Arthritis Consortium, Problem 2, Genetic Analysis Workshop 15, which confirms the advantage of the proposed bivariate models.

5.
The case-control study has been and continues to be one of the most popular designs in epidemiology. More recently, this design has been adopted to test for candidate genes when searching for disease genetic etiology. In this report, we present a multipoint linkage disequilibrium (LD) mapping approach with the focus on estimating the location of the target trait locus. It builds upon a representation which shows that the difference between a case and a control in probabilities of carrying the target allele of a marker is proportional to that of the trait locus, and that the proportionality factor is simply a measure of LD between the trait locus and the marker. Our method has the desired properties that (1) there is no need to specify phases of genotypic data with multiple markers, (2) it provides an estimate of location of the disease locus along with sampling uncertainty to help investigators narrow chromosomal regions, and (3) a single test statistic is provided to test for LD in the framed region rather than testing the hypothesis one marker at a time. Our simulation work suggests that the proposed method performs well in terms of bias and coverage probability. Extension of the proposed method to account for confounding and genetic heterogeneity is discussed. We apply the proposed method to a published case-control data set for cystic fibrosis.

6.
Linkage disequilibrium (LD) or association studies using case-parent trios have become a common approach to locate unobserved susceptibility genes underlying complex diseases. With the availability of ever more dense marker maps, how to utilize the information carried by multiple markers simultaneously remains challenging. Recently, Liang et al. ([2001a] Am. J. Hum. Genet. 68: 937-950) proposed a multipoint LD method to estimate the location of a susceptibility gene within a framework map along with its sampling uncertainty. Two important features of this method are that 1) it uses all trios whether parents are heterozygous for a given marker or not, and 2) it provides a single test statistic for the null hypothesis of no linkage or no LD to the region, avoiding the multiple testing problem encountered when performing individual transmission disequilibrium tests (TDT) for each marker individually. In this paper, we discuss how this method can be expanded to address important issues pertaining to complex diseases in a unified fashion. These issues include, among others, gene-gene and gene-environment interactions, genetic heterogeneity, phenotypic refinement, and paternal vs. maternal transmission. We applied this method to asthmatic case-parent trios from the Collaborative Study on the Genetics of Asthma (CSGA), and found that the previous evidence for linkage and LD in a 13.6 cM region of chromosome 11 can be attributed to maternal transmission, while there was no evidence of excess paternal transmission. Furthermore, such discrepancy in preferential transmission was most evident among probands with early onset age (6 years old or younger).

7.
Knowledge of the extent and distribution of linkage disequilibrium (LD) is critical to the design and interpretation of gene mapping studies. Because the demographic history of each population varies and is often not accurately known, it is necessary to empirically evaluate LD on a population-specific basis. Here we present the first genome-wide survey of LD in the Old Order Amish (OOA) of Lancaster County, Pennsylvania, a closed population derived from a modest number of founders. Specifically, we present a comparison of LD between OOA individuals and US Utah participants in the International HapMap project (abbreviated CEU) using a high-density single nucleotide polymorphism (SNP) map. Overall, the allele (and haplotype) frequency distributions and LD profiles were remarkably similar between these two populations. For example, the median absolute allele frequency difference for autosomal SNPs was 0.05, with an inter-quartile range of 0.02–0.09, and for autosomal SNPs 10–20 kb apart with common alleles (minor allele frequency ≥ 0.05), the LD measure r² was at least 0.8 for 15 and 14% of SNP pairs in the OOA and CEU, respectively. Moreover, tag SNPs selected from the HapMap CEU sample captured a substantial portion of the common variation in the OOA (~88%) at r² ≥ 0.8. These results suggest that the OOA and CEU may share similar LD profiles for other common but untyped SNPs. Thus, in the context of the common variant-common disease hypothesis, genetic variants discovered in gene mapping studies in the OOA may generalize to other populations. Genet. Epidemiol. 34: 146–150, 2010. © 2009 Wiley-Liss, Inc.

8.
The probabilities that two individuals share 0, 1, or 2 alleles identical by descent (IBD) at a given genotyped marker locus are quantities of fundamental importance for disease gene and quantitative trait mapping and in family-based tests of association. Until recently, genotyped markers were sufficiently sparse that founder haplotypes could be modelled as having been drawn from a population in linkage equilibrium for the purpose of estimating IBD probabilities. However, with the advent of high-throughput single nucleotide polymorphism genotyping assays, this is no longer a reasonable assumption. Indeed, the imminent arrival of individual sequencing will enable high-density single nucleotide polymorphism genotyping on a scale for which current algorithms are not equipped. In this paper, we present a simple new model in which founder haplotypes are modelled as a Markov chain. Another important innovation is that genotyping errors are explicitly incorporated into the model. We compare results obtained using the new model to those obtained using the popular genetic linkage analysis package Merlin, with and without using the cluster model of linkage disequilibrium that is incorporated into that program. We find that the new model results in accuracy approaching that of Merlin with haplotype blocks, but achieves this with orders of magnitude faster run times. Moreover, the new algorithm scales linearly with number of markers, irrespective of density, whereas Merlin scales supralinearly. We also confirm a previous finding that ignoring linkage disequilibrium in founder haplotypes can cause errors in the calculation of IBD probabilities.

9.
The composite linkage disequilibrium (LD) measure is often calculated for two-locus genotypic data, especially when coupling and repulsion double heterozygotes cannot be distinguished. This measure was reported to have good statistical properties and was suggested for routine testing of LD, regardless of Hardy-Weinberg equilibrium at either of two loci. However, the bounds for this measure have not yet been reported. These bounds are derived here as functions of one-locus genotype or allele frequencies. They provide standardized measures of composite linkage disequilibrium, defined as the proportion of its maximum attainable value, given observed allele or genotype frequencies.

10.
Despite the numerous and successful applications of genome-wide association studies (GWASs), discovering disease susceptibility loci (DSLs) has proved difficult. This is because the GWAS approach is an indirect mapping technique that often identifies markers rather than the DSLs themselves. For the identification of DSLs, which is required for the understanding of the genetic pathways for complex diseases, sequencing data that examines every genetic locus directly is necessary. Yet, there is currently a lack of methodology targeted at the identification of the DSLs in sequencing data: existing methods localize the causal variant to a region but not to a single variant, and therefore do not allow one to identify unique loci that cause the phenotype association. Here, we have developed such a method to determine if there is evidence that an individual locus affects case/control status with sequencing data. This methodology differs from other rare variant approaches: rather than testing an entire region comprised of many loci for association with the phenotype, we can identify the individual genetic locus that causes the association between the phenotype and the genetic region. For each variant, the test determines if the pattern of linkage disequilibrium (LD) across the other variants coincides with the pattern expected if that variant were a DSL. Power simulations show that the method successfully detects the causal variant, distinguishing it from other nearby variants (in high LD with the causal variant), and outperforms the standard tests. The efficiency of the method is especially apparent with small samples, which are currently realistic for studies due to sequencing data costs. The practical relevance of the approach is illustrated by an application to a sequencing dataset for nonsyndromic cleft lip with or without cleft palate. The proposed method implicated one variant (P = 0.002, 0.062 after Bonferroni correction), which was not found by standard analyses. Code for implementation is available.
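
The two p-values reported above can be related with a small sketch of the Bonferroni adjustment. The abstract does not state how many variants were tested; the value of 31 used below is only inferred from the ratio 0.062 / 0.002 and should be read as an assumption, and `bonferroni` is a hypothetical helper, not the authors' released code.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    return min(1.0, p_value * n_tests)


# Assumed number of tests (inferred, not stated in the abstract).
n_tests_assumed = 31
adjusted = bonferroni(0.002, n_tests_assumed)
print(f"adjusted P = {adjusted:.3f}")
```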

11.
Rapid development in biotechnology has enhanced the opportunity to deal with multipoint gene mapping for complex diseases, and association studies using quantitative traits have recently generated much attention. Unlike the conventional hypothesis-testing approach for fine mapping, we propose a unified multipoint method to localize a gene controlling a quantitative trait. We first calculate the sample size needed to detect linkage and linkage disequilibrium (LD) for a quantitative trait, categorized by decile, under three different modes of inheritance. Our results show that sampling trios of offspring and their parents from either extremely low (EL) or extremely high (EH) probands provides greater statistical power than sampling in the intermediate range. We next propose a unified sampling approach for multipoint LD mapping, where the goal is to estimate the map position (τ) of a trait locus and to calculate a confidence interval along with its sampling uncertainty. Our method builds upon a model for an expected preferential transmission statistic at an arbitrary locus conditional on the sampling scheme, such as sampling from EL and EH probands. This approach is valid regardless of the underlying genetic model. The one major assumption for this model is that no more than one quantitative trait locus (QTL) is linked to the region being mapped. Finally we illustrate the proposed method using family data on total serum IgE levels collected in multiplex asthmatic families from Barbados. An unobserved QTL appears to be located at τ = 41.93 cM with 95% confidence interval of (40.84, 43.02) through the 20-cM region framed by markers D12S1052 and D12S1064 on chromosome 12. The test statistic shows strong evidence of linkage and LD (chi-square statistic = 18.39 with 2 df, P-value = 0.0001).
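
The reported numbers can be cross-checked with a short sketch. For a chi-square statistic with 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed. Backing a standard error out of the interval assumes it is a symmetric, normal-approximation (Wald-type) interval, which the abstract does not state; the helper names are hypothetical.

```python
import math


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.exp(-x / 2.0)


def se_from_ci(lower, upper, z=1.959964):
    """Standard error implied by a symmetric normal-approximation interval.

    z defaults to the 97.5th percentile of the standard normal (95% interval).
    """
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    return (upper - lower) / (2.0 * z)


p_value = chi2_sf_2df(18.39)            # chi-square = 18.39, 2 df
se_tau = se_from_ci(40.84, 43.02)       # 95% CI for the map position, in cM
midpoint = (40.84 + 43.02) / 2.0        # should match the point estimate
print(f"P = {p_value:.6f}, implied SE = {se_tau:.3f} cM, midpoint = {midpoint:.2f} cM")
```

The computed tail probability agrees with the reported P-value of 0.0001 after rounding, and the interval midpoint matches the reported estimate of 41.93 cM.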

12.
Weir BL 《Genetic epidemiology》2001,21(Z1):S415-S420
A range of study designs, using unrelated or family controls, were used to investigate the pattern of association with disease of single nucleotide polymorphisms (SNPs) within candidate gene 1 (simulated data). Strong evidence of disease association at the functional locus was detected using all study designs, and in the "general" but not the "isolated" population the functional polymorphism displayed considerably higher association than surrounding SNPs. There was much variation in the strength of association of SNPs with disease, up to 70% of which was explained by SNP allele frequency and distance from the functional polymorphism. Some common polymorphisms very close to the functional locus, however, showed no association with disease. Analysis of short haplotypes of SNPs reduced but did not totally remove this feature.

13.
Promising findings from genetic association studies are commonly presented with two distinct figures: one gives the association study results and the other indicates linkage disequilibrium (LD) between genetic markers in the region(s) of interest. Fully interpreting the results of such studies requires synthesizing the information in these figures, which is generally done in a subjective and unsystematic manner. Here we present a method to formally combine association results and LD and display them in the same figure; we have developed a freely available web-based application that can be used to generate figures to display the combined data. To demonstrate this approach we apply it to fine mapping data from the prostate cancer 8q24 loci. Combining these two sources of information in a single figure allows one to more clearly assess patterns of association, facilitating the interpretation of genome-wide and fine mapping data and improving our ability to localize causal variants. Genet. Epidemiol. 33:599–603, 2009. © 2009 Wiley-Liss, Inc.

14.
The incorporation of a linkage disequilibrium parameter, delta, into linkage analysis is illustrated for data from Genetic Analysis Workshop II. Points from a joint likelihood surface are calculated and displayed on a recombination fraction-linkage disequilibrium grid using a simple modification of LIPED. The approach is shown to increase the power of linkage analysis and the power of tests for heterogeneity of linkage for the simulated examples.

15.
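Objective: To genotype four polymorphisms of the TLR4 gene (rs10759932, rs11536879, rs11536891, rs1927914) in ischemic stroke patients and healthy controls from the Meizhou area, and to assess their association with ischemic stroke through linkage disequilibrium analysis. Methods: Patients hospitalized for acute ischemic stroke between 1 January 2018 and 31 July 2018 formed the case group; healthy individuals recruited at the physical examination center over the same period formed the control group. The four TLR4 loci were genotyped in both groups with MassARRAY SNP genotyping, and Hardy-Weinberg (H-W) equilibrium was tested. Different genetic models were used to analyze the association between genotypes at these loci and the risk of cerebral infarction, and linkage disequilibrium analysis was used to assess the association with ischemic stroke. Results: The case group comprised 186 patients and the control group 194 healthy individuals; all four SNPs were in H-W equilibrium. The G/G genotype at rs1927914 was far more frequent in the control group than in the case group (χ² = 9.267, P < 0.05). The T/T genotype at rs10759932 was significantly more frequent in female controls than in males [OR = 0.38 (0.18-0.81), P < 0.05]. All four SNPs were in linkage disequilibrium with one another, and the TGCG combination was significantly more frequent in male than in female ischemic stroke patients [OR = 3.54 (1.17-10.69), P < 0.05]. Conclusions: In the Meizhou area, the rs1927914 A>G variant is a protective mutation that may reduce the occurrence of ischemic stroke. The relationship between linkage disequilibrium among the four loci and ischemic stroke differs partly by sex; the TGCG combination is a risk factor in men, in whom the incidence of stroke is significantly higher.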
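
The Hardy-Weinberg check mentioned in the Methods can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. This is a generic textbook version, not necessarily the procedure the authors used; the function name and the genotype counts in the example are hypothetical, since the abstract does not report per-genotype counts.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a diallelic locus.

    n_aa, n_ab, n_bb : observed counts of the two homozygotes and the heterozygote.
    Returns (chi_square, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical genotype counts for a control group of 194 individuals.
chi_square, p_value = hwe_chi_square(90, 84, 20)
print(f"chi-square = {chi_square:.3f}, P = {p_value:.3f}")
```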

16.
Though multiple interacting loci are likely involved in the etiology of complex diseases, early genome-wide association studies (GWAS) have depended on the detection of the marginal effects of each locus. Here, we evaluate the power of GWAS in the presence of two linked and potentially associated causal loci for several models of interaction between them and find that interacting loci may give rise to marginal relative risks that are not generally considered in a one-locus model. To derive power under realistic situations, we use empirical data generated by the HapMap ENCODE project for both allele frequencies and linkage disequilibrium (LD) structure. The power is also evaluated in situations where the causal single nucleotide polymorphisms (SNPs) may not be genotyped, but rather detected by proxy using a SNP in LD. A common simplification for such power computations assumes that the sample size necessary to detect the effect at the tag SNP (tSNP) is the sample size necessary to detect the causal locus directly, divided by the LD measure r² between the two. This assumption, which we call the "proportionality assumption", is a simplification of the many factors that contribute to the strength of association at a marker, and has recently been criticized as unreasonable (Terwilliger and Hiekkalinna [2006] Eur J Hum Genet 14(4):426-437), in particular in the presence of interacting and associated loci. We find that this assumption does not introduce much error in single locus models of disease, but may do so in certain two-locus models.
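
The "proportionality assumption" described above reduces to a one-line calculation, sketched here for concreteness. The function name and the example numbers are hypothetical; as the abstract itself cautions, this scaling is a simplification and may be inaccurate in some two-locus models.

```python
import math


def proxy_sample_size(n_causal, r2):
    """Sample size needed at a tag SNP under the proportionality assumption.

    n_causal : sample size needed if the causal SNP were genotyped directly
    r2       : LD (squared correlation) between the tag SNP and the causal SNP
    """
    if n_causal <= 0:
        raise ValueError("n_causal must be positive")
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must lie in (0, 1]")
    # Round up, since a sample size must be a whole number of subjects.
    return math.ceil(n_causal / r2)


# Hypothetical example: 1000 subjects suffice at the causal SNP, and the
# tag SNP has r2 = 0.5 with it.
print(proxy_sample_size(1000, 0.5))
```

Under this assumption, halving r² doubles the required sample size, and a perfect proxy (r² = 1) needs no more subjects than the causal SNP itself.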

17.
We quantify the degree to which LD differences exist in the human genome and investigate the consequences that variations in patterns of LD between populations can have on the power of case-control or family-trio association studies. Although only a small proportion of SNPs show significant LD differences (0.8-5%), these can introduce artificial signals of associations and reduce the power to detect true associations in case-control designs, even when meta-analytic approaches are used to account for stratification. We show that combining trios from different populations in the presence of significant LD differences can adversely affect power even though the number of trios has increased. Our results have implications on genetic studies conducted in populations with substantial population structure and show that the use of meta-analytic approaches or family-based designs to protect Type 1 error does not prevent loss of power due to differences in LD across populations.

18.
Linkage disequilibrium (LD) of genetic loci is routinely estimated and graphically illustrated in genetic association studies. It has been suggested that the information in LD is also useful for association mapping and genetic association can be detected by comparing LD patterns between cases and controls. Here, we extend this idea to analyze case-parents data by comparing LD patterns between transmitted and nontransmitted genotypes. We provide the condition when contrasting LD is valid for testing gene-gene interactions. A permutation procedure is given to assess statistical significance. One advantage of our proposed methods is that haplotype information is not required. Thus, the implementation of our methods is straightforward and the resulting tests are free from potential bias caused by assumptions made to estimate haplotypes in silico. Since our test statistics use pairwise LD measurements, they are less affected by missing data than many other multilocus methods. With simulated data, we demonstrate that examining LD patterns of case-parents data is a useful multilocus association mapping strategy and it complements existing association mapping methods. The application of our methods to a Crohn's disease data set shows that our methods can detect multilocus association that might be missed by other association methods. Our permutation procedure can also be modified to allow multiple offspring from a family to be analyzed. Genet. Epidemiol. 35: 487-498, 2011. © 2011 Wiley-Liss, Inc.

19.
Most disease association mapping algorithms are based on hypothesis testing procedures that test one variant at a time. Those methods lose power when the disease mutations are jointly tagged by multiple variants, or when gene-gene interactions exist. Nearby variants are also correlated, so procedures that ignore the dependence between variants will inevitably produce redundant results. With a large number of variants genotyped in current genome-wide disease association studies, simultaneous multivariant association mapping algorithms are strongly desired. We present a novel Bayesian method for automatic detection of multivariant joint association in genome-wide case-control studies. Our method has improved power and specificity over existing tools. We fit a joint probabilistic model to the entire data and identify disease variants simultaneously. The method dynamically accounts for the strong linkage disequilibrium (LD) between variants. As a result, only the primary disease variants will be identified, with all secondary associations due to LD effects filtered out. Our method better pinpoints the disease variants with improved resolution. The method is also computationally efficient for genome-wide studies. When applied to a real data set of inflammatory bowel disease (IBD) containing 401,473 variants in 4,720 individuals, our method detected all previously reported IBD loci in the same data, and recovered two missed loci. We further detected two novel interchromosome interactions. The first is between STAT3 and PARD6G, and the second is between DLG5 and an intergenic region at 5p14. We further validated the two interactions in an independent study.
