Similar literature
20 similar articles found.
1.
Genome‐wide association studies (GWAS) have led to the discovery of over 200 single nucleotide polymorphisms (SNPs) associated with type 2 diabetes mellitus (T2DM). Additionally, East Asians develop T2DM at a higher rate, younger age, and lower body mass index than their European ancestry counterparts. The reason behind this occurrence remains elusive. With comprehensive searches through the National Human Genome Research Institute (NHGRI) GWAS catalog literature, we compiled a database of 2,800 ancestry‐specific SNPs associated with T2DM and 70 other related traits. Manual data extraction was necessary because the GWAS catalog reports statistics such as odds ratio and P‐value, but does not consistently include ancestry information. Currently, many statistics are derived by combining initial and replication samples from study populations of mixed ancestry. Analysis of all‐inclusive data can be misleading, as not all SNPs are transferable across diverse populations. We used ancestry data to construct ancestry‐specific human phenotype networks (HPN) centered on T2DM. Quantitative and visual analysis of network models reveals the genetic disparities between ancestry groups. Of the 27 phenotypes in the East Asian HPN, six phenotypes were unique to the network, revealing the underlying ancestry‐specific nature of some SNPs associated with T2DM. We studied the relationship between T2DM and five phenotypes unique to the East Asian HPN to generate new interaction hypotheses in a clinical context. The genetic differences found in our ancestry‐specific HPNs suggest different pathways are involved in the pathogenesis of T2DM among different populations. Our study underlines the importance of ancestry in the development of T2DM and its implications in pharmacogenetics and personalized medicine.

2.
In genome‐wide association studies (GWAS), genetic loci that influence complex traits are localized by inspecting associations between genotypes of genetic markers and the values of the trait of interest. On the other hand, admixture mapping, which is performed in the case of populations consisting of a recent mix of two ancestral groups, relies on the ancestry information at each locus (locus‐specific ancestry). Recently it has been proposed to jointly model genotype and locus‐specific ancestry within the framework of single marker tests. Here, we extend this approach for population‐based GWAS in the direction of multimarker models. A modified version of the Bayesian information criterion is developed for building a multilocus model that accounts for the differential correlation structure due to linkage disequilibrium (LD) and admixture LD. Simulation studies and a real data example illustrate the advantages of this new approach compared to single‐marker analysis or modern model selection strategies based on separately analyzing genotype and ancestry data, as well as to single‐marker analysis combining genotypic and ancestry information. Depending on the signal strength, our procedure automatically chooses whether genotypic or locus‐specific ancestry markers are added to the model. This results in a good compromise between the power to detect causal mutations and the precision of their localization. The proposed method has been implemented in R and is available at http://www.math.uni.wroc.pl/~mbogdan/admixtures/ .

3.
Genetic association studies in admixed populations allow us to gain deeper understanding of the genetic architecture of human diseases and traits. However, population stratification, complicated linkage disequilibrium (LD) patterns, and the complex interplay of allelic and ancestry effects on phenotypic traits pose challenges in such analyses. These issues may lead to detecting spurious associations and/or result in reduced statistical power. Fortunately, if handled appropriately, these same challenges provide unique opportunities for gene mapping. To address these challenges and to take these opportunities, we propose a robust and powerful two‐step testing procedure, Local Ancestry Adjusted Allelic (LAAA) association. In the first step, LAAA robustly captures associations due to allelic effect, ancestry effect, and interaction effect, allowing detection of effect heterogeneity across ancestral populations. In the second step, LAAA identifies the source of association, namely allelic, ancestry, or the combination. By jointly modeling allele, local ancestry, and ancestry‐specific allelic effects, LAAA is highly powerful in capturing the presence of interaction between ancestry and allele effect. We evaluated the validity and statistical power of LAAA through simulations over a broad spectrum of scenarios. We further illustrated its usefulness by application to the Candidate Gene Association Resource (CARe) African American participants for association with hemoglobin levels. We were able to replicate independent groups’ previously identified loci that would have been missed in CARe without joint testing. Moreover, the loci for which LAAA detected potential effect heterogeneity were replicated among African Americans from the Women's Health Initiative study. LAAA is freely available at https://yunliweb.its.unc.edu/LAAA .

4.
Genome‐wide association studies have discovered and confirmed a large number of loci that are implicated with disease susceptibility and severity. Polymorphisms that emerged from these studies are mostly indirectly associated to the phenotype, and the natural progression is to identify the causal variants that are functionally responsible for these association signals. Long stretches of high linkage disequilibrium (LD) benefitted the initial discovery phase in a genome‐wide scan, allowing commercial genotyping products with imperfect coverage to detect genomic regions genuinely associated with the phenotype. However, regions of high LD confound the fine‐mapping phase, as markers that are perfectly correlated to the causal variants display similar evidence of phenotypic association, hampering the process of differentiating the functional polymorphisms from neighboring surrogates. Here, we explore the potential of integrating information across different populations for narrowing the candidate region that a causal variant resides in, and compare the efficacy of this process of trans‐population fine‐mapping with the extent of variation in patterns of LD between the populations. In addition, we explore two different strategies for pooling data across multiple populations for the purpose of prioritizing the rankings of the causal variants. Our results clearly establish the benefits of trans‐population analysis in reducing the number of possible candidates for the causal variants, particularly in genomic regions displaying strong evidence of inter‐population LD variation. Directly integrating the statistical evidence by summing the test statistics outperforms the standard meta‐analytic procedure. These findings have direct relevance to the design and analysis of ongoing fine‐mapping studies. Genet. Epidemiol. 34: 653‐664, 2010. © 2010 Wiley‐Liss, Inc.

5.
Populations of non-European ancestry are substantially underrepresented in genome-wide association studies (GWAS). As genetic effects can differ between ancestries due to possibly different causal variants or linkage disequilibrium patterns, a meta-analysis that includes GWAS of all populations yields biased estimation in each of the populations and the bias disproportionately impacts non-European ancestry populations. This is because meta-analysis combines study-specific estimates with inverse variance as the weights, which causes biases towards studies with the largest sample size, typical of the European ancestry population. In this paper, we propose two empirical Bayes (EB) estimators to borrow the strength of information across populations while accounting for between-population heterogeneity. Extensive simulation studies show that the proposed EB estimators are largely unbiased and improve efficiency compared to the population-specific estimator. In contrast, even though the meta-analysis estimator has a much smaller variance, it yields significant bias when the genetic effect is heterogeneous across populations. We apply the proposed EB estimators to a large-scale trans-ancestry GWAS of stroke and demonstrate that the EB estimators reduce the variance of the population-specific estimator substantially, with the effect estimates close to the population-specific estimates.
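
The inverse-variance weighting mentioned in this abstract can be illustrated with a minimal Python sketch. It shows only the standard fixed-effects pooling that the abstract criticizes, not the proposed EB estimators; the function name and the two example studies are hypothetical and chosen purely for illustration.

```python
import math


def inverse_variance_meta(estimates, std_errors):
    """Fixed-effects pooling: each estimate is weighted by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical effect estimates (e.g. log odds ratios) and standard errors.
# The first study stands in for a large GWAS with a small standard error,
# the second for a smaller study in another population with a different effect.
betas = [0.10, 0.30]
ses = [0.01, 0.05]

pooled_beta, pooled_se = inverse_variance_meta(betas, ses)
print(f"pooled estimate = {pooled_beta:.4f}, pooled SE = {pooled_se:.4f}")
```

With these made-up inputs the large study carries a weight of 10,000 against 400 for the small one, so the pooled estimate sits close to 0.10 and far from the smaller study's 0.30, even though the pooled standard error is smaller than either input.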

6.
7.
Genome‐wide association studies (GWAS) of complex traits have generated many association signals for single nucleotide polymorphisms (SNPs). To understand the underlying causal genetic variant(s), focused DNA resequencing of targeted genomic regions is commonly used, yet the current cost of resequencing limits sample sizes for resequencing studies. Information from the large GWAS can be used to guide choice of samples for resequencing, such as the SNP genotypes in the targeted genomic region. Viewing the GWAS tag‐SNPs as imperfect surrogates for the underlying causal variants, yet expecting that the tag‐SNPs are correlated with the causal variants, a reasonable approach is a two‐phase case‐control design, with the GWAS serving as the first‐phase and the resequencing study serving as the second‐phase. Using stratified sampling based on both tag‐SNP genotypes and case‐control status, we explore the gains in power of a two‐phase design relative to randomly sampling cases and controls for resequencing (i.e., ignoring tag‐SNP genotypes). Simulation results show that stratified sampling based on both tag‐SNP genotypes and case‐control status is not likely to have lower power than stratified sampling based only on case‐control status, and can sometimes have substantially greater power. The gain in power depends on the amount of linkage disequilibrium between the tag‐SNP and causal variant alleles, as well as the effect size of the causal variant. Hence, the two‐phase design provides an efficient approach to follow‐up GWAS signals with DNA resequencing.

8.
Current genome-wide association studies (GWAS) often involve populations that have experienced recent genetic admixture. Genotype data generated from these studies can be used to test for association directly, as in a non-admixed population. As an alternative, these data can be used to infer chromosomal ancestry, and thus allow for admixture mapping. We quantify the contribution of allele-based and ancestry-based association testing under a family-design, and demonstrate that the two tests can provide non-redundant information. We propose a joint testing procedure, which efficiently integrates the two sources of information. The efficiencies of the allele, ancestry and combined tests are compared in the context of a GWAS. We discuss the impact of population history and provide guidelines for future design and analysis of GWAS in admixed populations.

9.
Genome‐wide association studies (GWAS) of common disease have been hugely successful in implicating loci that modify disease risk. The bulk of these associations have proven robust and reproducible, in part due to community adoption of statistical criteria for claiming significant genotype‐phenotype associations. As the cost of sequencing continues to drop, assembling large samples in global populations is becoming increasingly feasible. Sequencing studies interrogate not only common variants, as was true for genotyping‐based GWAS, but variation across the full allele frequency spectrum, yielding many more (independent) statistical tests. We sought to empirically determine genome‐wide significance thresholds for various analysis scenarios. Using whole‐genome sequence data, we simulated sequencing‐based disease studies of varying sample size and ancestry. We determined that future sequencing efforts in >2,000 samples of European, Asian, or admixed ancestry should set genome‐wide significance at approximately P = 5 × 10−9, and studies of African samples should apply a more stringent genome‐wide significance threshold of P = 1 × 10−9. Adoption of a revised multiple test correction will be crucial in avoiding irreproducible claims of association.
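
As a rough aid to reading these thresholds, the following Python sketch back-calculates how many independent tests each one would correspond to under a plain Bonferroni correction. This is an assumption made for illustration: the abstract says the thresholds were determined empirically by simulation, not by this formula. The family-wise error rate of 0.05 and the conventional array-based threshold of 5 × 10−8 are also not taken from the abstract.

```python
def bonferroni_threshold(n_independent_tests, family_wise_alpha=0.05):
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    return family_wise_alpha / n_independent_tests


def implied_independent_tests(p_threshold, family_wise_alpha=0.05):
    """Number of independent tests for which p_threshold is the Bonferroni cutoff."""
    return family_wise_alpha / p_threshold


# The first entry is the conventional array-based GWAS threshold (an added
# reference point, not from the abstract); the other two are the abstract's values.
thresholds = {
    "array-based GWAS (conventional)": 5e-8,
    "sequencing, European/Asian/admixed": 5e-9,
    "sequencing, African": 1e-9,
}

implied = {label: implied_independent_tests(p) for label, p in thresholds.items()}
for label, n_tests in implied.items():
    print(f"{label}: about {n_tests:.1e} independent tests")
```

Read this way, moving from 5 × 10−8 to 5 × 10−9 corresponds to a tenfold increase in the effective number of independent tests, and the African-ancestry threshold to a further fivefold increase.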

10.
Meta‐analysis of genome‐wide association studies (GWAS) has achieved great success in detecting loci underlying human diseases. Incorporating GWAS results from diverse ethnic populations for meta‐analysis, however, remains challenging because of the possible heterogeneity across studies. Conventional fixed‐effects (FE) or random‐effects (RE) methods may not be most suitable to aggregate multiethnic GWAS results because of violation of the homogeneous effect assumption across studies (FE) or low power to detect signals (RE). Three recently proposed methods, modified RE (RE‐HE) model, binary‐effects (BE) model and a Bayesian approach (Meta‐analysis of Transethnic Association [MANTRA]), show increased power over FE and RE methods while incorporating heterogeneity of effects when meta‐analyzing trans‐ethnic GWAS results. We propose a two‐stage approach to account for heterogeneity in trans‐ethnic meta‐analysis in which we clustered studies with cohort‐specific ancestry information prior to meta‐analysis. We compare this to a no‐prior‐clustering (crude) approach, evaluating type I error and power of these two strategies, in an extensive simulation study to investigate whether the two‐stage approach offers any improvements over the crude approach. We find that the two‐stage approach and the crude approach for all five methods (FE, RE, RE‐HE, BE, MANTRA) provide well‐controlled type I error. However, the two‐stage approach shows increased power for BE and RE‐HE, and similar power for MANTRA and FE compared to their corresponding crude approach, especially when there is heterogeneity across the multiethnic GWAS results. These results suggest that prior clustering in the two‐stage approach can be an effective and efficient intermediate step in meta‐analysis to account for the multiethnic heterogeneity.
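
The contrast between conventional FE and RE pooling can be sketched in Python as follows. The RE part uses the DerSimonian-Laird moment estimator of the between-study variance, which is one common choice and an assumption here, since the abstract does not say which RE estimator was used. RE‐HE, BE, MANTRA and the two‐stage clustering are not implemented, and all input numbers are made up.

```python
import math


def fe_and_re_meta(estimates, std_errors):
    """Fixed-effects and DerSimonian-Laird random-effects pooling of study estimates."""
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fe = sum(wi * b for wi, b in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations of the studies from the FE estimate.
    q = sum(wi * (b - fe) ** 2 for wi, b in zip(w, estimates))

    # Moment estimate of the between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # RE weights add tau^2 to every study's variance, which flattens the weights.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_re = sum(w_re)
    re = sum(wi * b for wi, b in zip(w_re, estimates)) / sum_w_re

    return {
        "fe": fe,
        "fe_se": math.sqrt(1.0 / sum_w),
        "re": re,
        "re_se": math.sqrt(1.0 / sum_w_re),
        "q": q,
        "tau2": tau2,
    }


# Hypothetical standard errors shared by both scenarios.
ses = [0.02, 0.03, 0.05]
homogeneous = fe_and_re_meta([0.10, 0.10, 0.10], ses)
heterogeneous = fe_and_re_meta([0.05, 0.20, 0.40], ses)

for name, result in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    print(name, {key: round(value, 4) for key, value in result.items()})
```

When the study effects agree, the estimated between-study variance is zero and the RE result coincides with FE. When they differ, the RE standard error grows, which is the loss of power the abstract attributes to conventional RE.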

11.
Genetic association studies in admixed populations may be biased if individual ancestry varies within the population and the phenotype of interest is associated with ancestry. However, recently admixed populations also offer potential benefits in association studies since markers informative for ancestry may be in linkage disequilibrium across large distances. In particular, the enhanced LD in admixed populations may be used to identify alleles that underlie a genetically determined difference in a phenotype between two ancestral populations. Asthma is known to have different prevalence and severity among ancestrally distinct populations. We investigated several asthma-related phenotypes in two ancestrally admixed populations: Mexican Americans and Puerto Ricans. We used ancestry informative markers to estimate the individual ancestry of 181 Mexican American asthmatics and 181 Puerto Rican asthmatics and tested whether individual ancestry is associated with any of these phenotypes independently of known environmental factors. We found an association between higher European ancestry and more severe asthma as measured by both forced expiratory volume at 1 second (r=-0.21, p=0.005) and by a clinical assessment of severity among Mexican Americans (OR: 1.55; 95% CI 1.25 to 1.93). We found no significant associations between ancestry and severity or drug responsiveness among Puerto Ricans. These results suggest that asthma severity may be influenced by genetic factors differentiating Europeans and Native Americans in Mexican Americans, although differing results for Puerto Ricans require further investigation.

12.
Genome‐wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) associated with complex traits. However, the genetic heritability of most of these traits remains unexplained. To help guide future studies, we address the crucial question of whether future GWAS can detect new SNP associations and explain additional heritability given the new availability of larger GWAS SNP arrays, imputation, and reduced genotyping costs. We first describe the pairwise and imputation coverage of all SNPs in the human genome by commercially available GWAS SNP arrays, using the 1000 Genomes Project as a reference. Next, we describe the findings from 6 years of GWAS of 172 chronic diseases, calculating the power to detect each of them while taking array coverage and sample size into account. We then calculate the power to detect these SNP associations under different conditions using improved coverage and/or sample sizes. Finally, we estimate the percentages of SNP associations and heritability previously detected and detectable by future GWAS under each condition. Overall, we estimated that previous GWAS have detected less than one‐fifth of all GWAS‐detectable SNPs underlying chronic disease. Furthermore, increasing sample size has a much larger impact than increasing coverage on the potential of future GWAS to detect additional SNP‐disease associations and heritability.

13.
Recently, large scale genome‐wide association study (GWAS) meta‐analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one‐at‐a‐time. This complicates the ability of fine‐mapping to identify a small set of SNPs for further functional follow‐up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re‐analysis of published marginal summary statistics under joint multi‐SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi‐region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta‐analysis of glucose and insulin related traits consortium) – a GWAS meta‐analysis of more than 15,000 people. We re‐analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggests that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index.

14.
We quantify the degree to which LD differences exist in the human genome and investigate the consequences that variations in patterns of LD between populations can have on the power of case-control or family-trio association studies. Although only a small proportion of SNPs show significant LD differences (0.8-5%), these can introduce artificial signals of associations and reduce the power to detect true associations in case-control designs, even when meta-analytic approaches are used to account for stratification. We show that combining trios from different populations in the presence of significant LD differences can adversely affect power even though the number of trios has increased. Our results have implications on genetic studies conducted in populations with substantial population structure and show that the use of meta-analytic approaches or family-based designs to protect Type 1 error does not prevent loss of power due to differences in LD across populations.

15.
Polygenic prediction using genome‐wide SNPs can provide high prediction accuracy for complex traits. Here, we investigate the question of how to account for genetic ancestry when conducting polygenic prediction. We show that the accuracy of polygenic prediction in structured populations may be partly due to genetic ancestry. However, we hypothesized that explicitly modeling ancestry could improve polygenic prediction accuracy. We analyzed three GWAS of hair color (HC), tanning ability (TA), and basal cell carcinoma (BCC) in European Americans (sample size from 7,440 to 9,822) and considered two widely used polygenic prediction approaches: polygenic risk scores (PRSs) and best linear unbiased prediction (BLUP). We compared polygenic prediction without correction for ancestry to polygenic prediction with ancestry as a separate component in the model. In 10‐fold cross‐validation using the PRS approach, the R2 for HC increased by 66% (0.0456–0.0755; P < 10−16), the R2 for TA increased by 123% (0.0154 to 0.0344; P < 10−16), and the liability‐scale R2 for BCC increased by 68% (0.0138–0.0232; P < 10−16) when explicitly modeling ancestry, which prevents ancestry effects from entering into each SNP effect and being overweighted. Surprisingly, explicitly modeling ancestry produces a similar improvement when using the BLUP approach, which fits all SNPs simultaneously in a single variance component and causes ancestry to be underweighted. We validate our findings via simulations, which show that the differences in prediction accuracy will increase in magnitude as sample sizes increase. In summary, our results show that explicitly modeling ancestry can be important in both PRS and BLUP prediction.

16.
Founder or isolated populations have advantages for genetic studies due to decreased genetic and environmental heterogeneity. However, whereas longer‐range linkage disequilibrium (LD) in these populations is expected to facilitate gene localization, extensive LD may actually limit the ability for gene discovery. The North American Hutterite population is one of the best characterized young founder populations and members of this isolate have been the subjects of our studies of complex traits, including fertility, asthma and cardiovascular disease, for >20 years. Here, we directly assess the patterns and extent of global LD using single nucleotide polymorphism genotypes with minor allele frequencies (MAFs) ≥5% from the Affymetrix GeneChip® Mapping 500 K array in 60 relatively unrelated Hutterites and 60 unrelated Europeans (HapMap CEU). Although LD among some marker pairs extends further in the Hutterites than in Europeans, the pattern of LD and MAF are surprisingly similar. These results indicate that (1) identifying disease genes should be no more difficult in the Hutterites than in outbred European populations, (2) the same common susceptibility alleles for complex diseases should be present in the Hutterites and outbred European populations, and (3) imputation algorithms based on HapMap CEU should be applicable to the Hutterites. Genet. Epidemiol. 34: 133–139, 2010. © 2009 Wiley‐Liss, Inc.

17.
Genome-wide association studies (GWAS) are a powerful tool for understanding the genetic basis of diseases and traits, but most studies have been conducted in isolation, with a focus on either a single or a set of closely related phenotypes. We describe MetABF, a simple Bayesian framework for performing integrative meta-analysis across multiple GWAS using summary statistics. The approach is applicable across a wide range of study designs and can increase the power by 50% compared with standard frequentist tests when only a subset of studies have a true effect. We demonstrate its utility in a meta-analysis of 20 diverse GWAS which were part of the Wellcome Trust Case Control Consortium 2. The novelty of the approach is its ability to explore, and assess the evidence for a range of possible true patterns of association across studies in a computationally efficient framework.

18.
Population substructure can lead to confounding in tests for genetic association, and failure to adjust properly can result in spurious findings. Here we address this issue of confounding by considering the impact of global ancestry (average ancestry across the genome) and local ancestry (ancestry at a specific chromosomal location) on regression parameters and relative power in ancestry‐adjusted and ‐unadjusted models. We examine theoretical expectations under different scenarios for population substructure; applying different regression models, verifying and generalizing using simulations, and exploring the findings in real‐world admixed populations. We show that admixture does not lead to confounding when the trait locus is tested directly in a single admixed population. However, if there is more complex population structure or a marker locus in linkage disequilibrium (LD) with the trait locus is tested, both global and local ancestry can be confounders. Additionally, we show the genotype parameters of adjusted and unadjusted models all provide tests for LD between the marker and trait locus, but in different contexts. The local ancestry adjusted model tests for LD in the ancestral populations, while tests using the unadjusted and the global ancestry adjusted models depend on LD in the admixed population(s), which may be enriched due to different ancestral allele frequencies. Practically, this implies that global‐ancestry adjustment should be used for screening, but local‐ancestry adjustment may better inform fine mapping and provide better effect estimates at trait loci.

19.
Genome‐wide association studies (GWAS) have been successful in finding numerous new risk variants for complex diseases, but the results almost exclusively rely on single‐marker scans. Methods that can analyze joint effects of many variants in GWAS data are still being developed and trialed. To evaluate the performance of such methods it is essential to have a GWAS data simulator that can rapidly simulate a large number of samples, and capture key features of real GWAS data such as linkage disequilibrium (LD) among single‐nucleotide polymorphisms (SNPs) and joint effects of multiple loci (multilocus epistasis). In the current study, we combine techniques for specifying high‐order epistasis among risk SNPs with an existing program GWAsimulator [Li and Li, 2008] to achieve rapid whole‐genome simulation with accurate modeling of complex interactions. We considered various approaches to specifying interaction models including the following: departure from product of marginal effects for pairwise interactions, product terms in logistic regression models for low‐order interactions, and penetrance tables conforming to marginal effect constraints for high‐order interactions or prescribing known biological interactions. Methods for conversion among different model specifications are developed using penetrance table as the fundamental characterization of disease models. The new program, called simGWA, is capable of efficiently generating large samples of GWAS data with high precision. We show that data simulated by simGWA are faithful to template LD structures, and conform to prespecified disease models with (or without) interactions.

20.
We evaluate two‐phase designs to follow up findings from genome‐wide association studies (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation‐maximization‐based inference under a semiparametric maximum likelihood formulation tailored for post‐GWAS inference. A GWAS‐SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT‐SNP‐dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme‐QT strata yields significant power improvements compared to marginal QT‐ or SNP‐based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure.
