Similar articles
 Found 20 similar articles (search time: 171 ms)
1.
In genome‐wide association studies (GWAS), genetic loci that influence complex traits are localized by inspecting associations between genotypes of genetic markers and the values of the trait of interest. Admixture mapping, by contrast, is performed in populations formed by a recent mix of two ancestral groups and relies on the ancestry information at each locus (locus‐specific ancestry). Recently it has been proposed to jointly model genotype and locus‐specific ancestry within the framework of single‐marker tests. Here, we extend this approach for population‐based GWAS in the direction of multimarker models. A modified version of the Bayesian information criterion is developed for building a multilocus model that accounts for the differential correlation structure due to linkage disequilibrium (LD) and admixture LD. Simulation studies and a real data example illustrate the advantages of this new approach compared to single‐marker analysis or modern model selection strategies based on separately analyzing genotype and ancestry data, as well as to single‐marker analysis combining genotypic and ancestry information. Depending on the signal strength, our procedure automatically chooses whether genotypic or locus‐specific ancestry markers are added to the model. This results in a good compromise between the power to detect causal mutations and the precision of their localization. The proposed method has been implemented in R and is available at http://www.math.uni.wroc.pl/~mbogdan/admixtures/ .

2.
Populations of non-European ancestry are substantially underrepresented in genome-wide association studies (GWAS). As genetic effects can differ between ancestries due to possibly different causal variants or linkage disequilibrium patterns, a meta-analysis that includes GWAS of all populations yields biased estimation in each of the populations, and the bias disproportionately impacts non-European ancestry populations. This is because meta-analysis combines study-specific estimates with inverse variance as the weights, which biases the result towards the studies with the largest sample sizes, typically those of European ancestry. In this paper, we propose two empirical Bayes (EB) estimators to borrow the strength of information across populations while accounting for between-population heterogeneity. Extensive simulation studies show that the proposed EB estimators are largely unbiased and improve efficiency compared to the population-specific estimator. In contrast, even though the meta-analysis estimator has a much smaller variance, it yields significant bias when the genetic effect is heterogeneous across populations. We apply the proposed EB estimators to a large-scale trans-ancestry GWAS of stroke and demonstrate that the EB estimators reduce the variance of the population-specific estimator substantially, with the effect estimates close to the population-specific estimates.
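
As a rough illustration of why inverse-variance weighting pulls a pooled estimate toward the largest study, the sketch below computes a fixed-effect meta-analysis estimate for two hypothetical populations. The effect sizes, standard errors, and the function name `inverse_variance_meta` are made up for illustration and do not come from the abstract. The EB estimators themselves are not reproduced here.

```python
from math import sqrt


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis: weight each study by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical per-population estimates (illustrative numbers only).
# Study 0: large sample, small standard error; study 1: small sample.
betas = [0.10, 0.30]
ses = [0.01, 0.05]

pooled, pooled_se = inverse_variance_meta(betas, ses)
print(f"pooled beta = {pooled:.4f}, SE = {pooled_se:.4f}")
```

With these numbers the pooled estimate sits much closer to the large study's 0.10 than to the small study's 0.30, which is the kind of bias for the smaller population that the abstract describes when effects are truly heterogeneous.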

3.
Population substructure can lead to confounding in tests for genetic association, and failure to adjust properly can result in spurious findings. Here we address this issue of confounding by considering the impact of global ancestry (average ancestry across the genome) and local ancestry (ancestry at a specific chromosomal location) on regression parameters and relative power in ancestry‐adjusted and ‐unadjusted models. We examine theoretical expectations under different scenarios for population substructure, applying different regression models, verifying and generalizing using simulations, and exploring the findings in real‐world admixed populations. We show that admixture does not lead to confounding when the trait locus is tested directly in a single admixed population. However, if there is more complex population structure or a marker locus in linkage disequilibrium (LD) with the trait locus is tested, both global and local ancestry can be confounders. Additionally, we show the genotype parameters of adjusted and unadjusted models all provide tests for LD between the marker and trait locus, but in different contexts. The local ancestry adjusted model tests for LD in the ancestral populations, while tests using the unadjusted and the global ancestry adjusted models depend on LD in the admixed population(s), which may be enriched due to different ancestral allele frequencies. Practically, this implies that global‐ancestry adjustment should be used for screening, but local‐ancestry adjustment may better inform fine mapping and provide better effect estimates at trait loci.

4.
Genome-wide association studies (GWAS) routinely apply principal component analysis (PCA) to infer population structure within a sample to correct for confounding due to ancestry. GWAS implementation of PCA uses tens of thousands of single-nucleotide polymorphisms (SNPs) to infer structure, despite the fact that only a small fraction of such SNPs provides useful information on ancestry. The identification of this reduced set of ancestry-informative markers (AIMs) from a GWAS has practical value; for example, researchers can genotype the AIM set to correct for potential confounding due to ancestry in follow-up studies that utilize custom SNP or sequencing technology. We propose a novel technique to identify AIMs from genome-wide SNP data using sparse PCA. The procedure uses penalized regression methods to identify those SNPs in a genome-wide panel that significantly contribute to the principal components while encouraging SNPs that provide negligible loadings to vanish from the analysis. We found that sparse PCA leads to negligible loss of ancestry information compared to traditional PCA of genome-wide SNP data. We further demonstrate the value of sparse PCA for AIM selection using real data from the International HapMap Project and a genome-wide study of inflammatory bowel disease. We have implemented our approach in open-source R software for public use.

5.
Proper control of confounding due to population stratification is crucial for valid analysis of case-control association studies. Fine matching of cases and controls based on genetic ancestry is an increasingly popular strategy to correct for such confounding, both in genome-wide association studies (GWASs) and in studies that employ next-generation sequencing, where matching can be used when selecting a subset of participants from a GWAS for rare-variant analysis. Existing matching methods match on measures of genetic ancestry that combine multiple components of ancestry into a scalar quantity. However, we show that including nonconfounding ancestry components in a matching criterion can lead to inaccurate matches, and hence to an improper control of confounding. To resolve this issue, we propose a novel method that assigns cases and controls to matched strata based on the stratification score (Epstein et al. [2007] Am J Hum Genet 80:921-930), which is the probability of disease given genomic variables. Matching on the stratification score leads to more accurate matches because case participants are matched to control participants who have a similar risk of disease given ancestry information. We illustrate our matching method using the African-American arm of the GAIN GWAS of schizophrenia. In this study, we observe that confounding due to stratification can be resolved by our matching approach but not by other existing matching procedures. We also use simulated data to show our novel matching approach can provide a more appropriate correction for population stratification than existing matching approaches.

6.
The National Human Genome Research Institute's catalog of published genome‐wide association studies (GWAS) lists over 10,000 genetic variants collectively associated with over 800 human diseases or traits. Most of these GWAS have been conducted in European‐ancestry populations. Findings gleaned from these studies have led to identification of disease‐associated loci and biologic pathways involved in disease etiology. In multiple instances, these genomic findings have led to the development of novel medical therapies or evidence for prescribing a given drug as the appropriate treatment for a given individual beyond phenotypic appearances or socially defined constructs of race or ethnicity. Such findings have implications for populations throughout the globe and GWAS are increasingly being conducted in more diverse populations. A major challenge for investigators seeking to follow up genomic findings between diverse populations is discordant patterns of linkage disequilibrium (LD). We provide an overview of common measures of LD and opportunities for their use in novel methods designed to address challenges associated with following up GWAS conducted in European‐ancestry populations in African‐ancestry populations or, more generally, between populations with discordant LD patterns. We detail the strengths and weaknesses associated with different approaches. We also describe application of these strategies in follow‐up studies of populations with concordant LD patterns (replication) or discordant LD patterns (transferability) as well as fine‐mapping studies. We review application of these methods to a variety of traits and diseases.

7.
Recent studies suggest that rare variants play an important role in the etiology of many traits. Although a number of methods have been developed for genetic association analysis of rare variants, they all assume a relatively homogeneous population under study. Such an assumption may not be valid for samples collected from admixed populations such as African Americans and Hispanic Americans as there is a great extent of local variation in ancestry in these populations. To ensure valid and more powerful rare variant association tests performed in admixed populations, we have developed a local ancestry‐based weighted dosage test, which is able to take into account local ancestry of rare alleles, uncertainties in rare variant imputation when imputed data are included, and the direction of effect that rare variants exert on phenotypic outcome. We used simulated sequence data to show that our proposed test has controlled type I error rates, whereas naïve application of existing rare variants tests and tests that adjust for global ancestry lead to inflated type I error rates. We showed that our test has higher power than tests without proper adjustment of ancestry. We also applied the proposed method to a candidate gene study on low‐density lipoprotein cholesterol. Our results suggest that it is important to appropriately control for potential population stratification induced by local ancestry difference in the analysis of rare variants in admixed populations.

8.
Genome‐wide association studies (GWAS) have led to the discovery of over 200 single nucleotide polymorphisms (SNPs) associated with type 2 diabetes mellitus (T2DM). Additionally, East Asians develop T2DM at a higher rate, younger age, and lower body mass index than their European ancestry counterparts. The reason behind this occurrence remains elusive. With comprehensive searches through the National Human Genome Research Institute (NHGRI) GWAS catalog literature, we compiled a database of 2,800 ancestry‐specific SNPs associated with T2DM and 70 other related traits. Manual data extraction was necessary because the GWAS catalog reports statistics such as odds ratio and P‐value, but does not consistently include ancestry information. Currently, many statistics are derived by combining initial and replication samples from study populations of mixed ancestry. Analysis of all‐inclusive data can be misleading, as not all SNPs are transferable across diverse populations. We used ancestry data to construct ancestry‐specific human phenotype networks (HPN) centered on T2DM. Quantitative and visual analyses of network models reveal the genetic disparities between ancestry groups. Of the 27 phenotypes in the East Asian HPN, six phenotypes were unique to the network, revealing the underlying ancestry‐specific nature of some SNPs associated with T2DM. We studied the relationship between T2DM and five phenotypes unique to the East Asian HPN to generate new interaction hypotheses in a clinical context. The genetic differences found in our ancestry‐specific HPNs suggest different pathways are involved in the pathogenesis of T2DM among different populations. Our study underlines the importance of ancestry in the development of T2DM and its implications in pharmacogenetics and personalized medicine.

9.
Admixture mapping is potentially a powerful method for mapping genes for complex human diseases, when the disease frequency due to a particular disease-susceptible gene is different between founding populations of different ethnicity. The method tests for association of the allele ancestry with the disease. Since the markers used to define ancestral populations are not fully informative for the ancestry status, a direct test of such association is not possible. In this report, we develop a unified hidden Markov model (HMM) framework for estimating the unobserved ancestry haplotypes across a chromosomal region based on marker haplotype or genotype data. The HMM efficiently utilizes all the marker data to infer the latent ancestry states at the putative disease locus. In this HMM modelling framework, we develop a likelihood test for association of allele ancestry and the disease risk based on case-control data. Existence of such association may imply linkage between the candidate locus and the disease locus. We evaluate by simulations how several factors affect the power of admixture mapping, including sample size, ethnicity relative risk, marker density, and the different admixture dynamics. Our simulation results indicate correct type 1 error rates of the proposed likelihood ratio tests and great impact of marker density on the power. The simulation results also indicate that the methods work well for the admixed populations derived from both hybrid-isolation and continuous gene-flow models. Finally, we observed that the genotype-based HMM achieves power very similar to that of the haplotype-based HMM when the haplotypes are known and the set of markers is highly informative.

10.
During the last decade genome-wide association studies have proven to be a powerful approach to identifying disease-causing variants. However, for admixed populations, most current methods for association testing are based on the assumption that the effect of a genetic variant is the same regardless of its ancestry. This is a reasonable assumption for a causal variant but may not hold for the genetic variants that are tested in genome-wide association studies, which are usually not causal. The effects of noncausal genetic variants depend on how strongly their presence correlates with the presence of the causal variant, which may vary between ancestral populations because of different linkage disequilibrium patterns and allele frequencies. Motivated by this, we here introduce a new statistical method for association testing in recently admixed populations, where the effect size is allowed to depend on the ancestry of a given allele. Our method does not rely on accurate inference of local ancestry, yet using simulations we show that in some scenarios it gives a substantial increase in statistical power to detect associations. In addition, the method allows for testing for difference in effect size between ancestral populations, which can be used to help determine if a given genetic variant is causal. We demonstrate the usefulness of the method on data from the Greenlandic population.

11.
Genetic association studies in admixed populations may be biased if individual ancestry varies within the population and the phenotype of interest is associated with ancestry. However, recently admixed populations also offer potential benefits in association studies since markers informative for ancestry may be in linkage disequilibrium across large distances. In particular, the enhanced LD in admixed populations may be used to identify alleles that underlie a genetically determined difference in a phenotype between two ancestral populations. Asthma is known to have different prevalence and severity among ancestrally distinct populations. We investigated several asthma-related phenotypes in two ancestrally admixed populations: Mexican Americans and Puerto Ricans. We used ancestry informative markers to estimate the individual ancestry of 181 Mexican American asthmatics and 181 Puerto Rican asthmatics and tested whether individual ancestry is associated with any of these phenotypes independently of known environmental factors. We found an association between higher European ancestry and more severe asthma as measured by both forced expiratory volume at 1 second (r=-0.21, p=0.005) and by a clinical assessment of severity among Mexican Americans (OR: 1.55; 95% CI 1.25 to 1.93). We found no significant associations between ancestry and severity or drug responsiveness among Puerto Ricans. These results suggest that asthma severity may be influenced by genetic factors differentiating Europeans and Native Americans in Mexican Americans, although differing results for Puerto Ricans require further investigation.

12.
Genome‐wide association studies (GWAS) of common disease have been hugely successful in implicating loci that modify disease risk. The bulk of these associations have proven robust and reproducible, in part due to community adoption of statistical criteria for claiming significant genotype‐phenotype associations. As the cost of sequencing continues to drop, assembling large samples in global populations is becoming increasingly feasible. Sequencing studies interrogate not only common variants, as was true for genotyping‐based GWAS, but variation across the full allele frequency spectrum, yielding many more (independent) statistical tests. We sought to empirically determine genome‐wide significance thresholds for various analysis scenarios. Using whole‐genome sequence data, we simulated sequencing‐based disease studies of varying sample size and ancestry. We determined that future sequencing efforts in >2,000 samples of European, Asian, or admixed ancestry should set genome‐wide significance at approximately P = 5 × 10⁻⁹, and studies of African samples should apply a more stringent genome‐wide significance threshold of P = 1 × 10⁻⁹. Adoption of a revised multiple test correction will be crucial in avoiding irreproducible claims of association.
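
To put the two thresholds in perspective, the sketch below converts each into the number of independent tests it would correspond to under a Bonferroni-style correction. The family-wise error rate of 0.05 and the Bonferroni relation are assumptions of this sketch, not something the abstract states: its thresholds were determined empirically from simulations. The function name is illustrative.

```python
def implied_independent_tests(threshold, alpha=0.05):
    """Number of tests m for which alpha / m equals the given threshold.

    Assumes a Bonferroni-style correction; alpha = 0.05 is an assumption.
    """
    return alpha / threshold


scenarios = [
    ("European, Asian, or admixed ancestry", 5e-9),
    ("African ancestry", 1e-9),
]

implied = {}
for label, threshold in scenarios:
    implied[label] = implied_independent_tests(threshold)
    print(f"{label}: P = {threshold:.0e} implies about "
          f"{implied[label]:,.0f} independent tests")
```

Under these assumptions the stricter African-ancestry threshold corresponds to roughly five times as many independent tests as the threshold for the other groups.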

13.
Admixed populations arise when two or more previously isolated populations interbreed. Admixture mapping (AM) methods are used for tracing the ancestral origin of disease-susceptibility genetic loci in admixed populations such as African Americans and Latinos. AM differs from genome-wide association studies in that ancestry rather than genotype is tracked in the association process. The power and sample size of AM primarily depend on the proportion of admixture and differences in the risk allele frequencies among the ancestral populations. Ensuring sufficient power to detect the effect of ancestry on disease susceptibility is critical for the interpretability and reliability of studies using the AM approach. However, no power and sample size analysis tool exists for AM studies in admixed populations. In this study, we developed power analysis of multiancestry AM (PAMAM) to estimate power and sample size for two-way and three-way population admixtures. PAMAM is the first web-based bioinformatics tool developed to calculate power and sample size in admixed populations under a variety of genetic and disease phenotype models. It is a valuable resource for investigators to design a cost-efficient study and develop grant applications to pursue AM studies. PAMAM is built on a JavaScript back end with an HTML front end. It is accessible through any modern web browser such as Firefox, Internet Explorer, and Google Chrome regardless of operating system. It is a user-friendly tool containing links for support information including a user manual and examples, and is freely available at https://research.cchmc.org/mershalab/PAMAM/login.html .

14.
We provide a general purpose family-based testing strategy for associating disease phenotypes with haplotypes when phase may be ambiguous and parental genotype data may be missing. These tests for linkage and association can be used in candidate gene studies with tightly linked markers. Our proposed weighted conditional approach extends the method described in Rabinowitz and Laird to multiple markers. It is attractive because it provides haplotype tests for family-based studies that are efficient and robust to population admixture, phenotype distribution specification, and ascertainment based on phenotypes. It can handle missing parental genotypes and/or missing phase in both offspring and parents. It yields either haplotype-specific (univariate) tests or multi-haplotype (global) tests. This extension has been implemented in the freely available software haplotype FBAT. We used the haplotype FBAT program to test for associations between asthma phenotypes and single nucleotide polymorphisms (SNPs) in the beta-2 adrenergic receptor gene. Whereas no single SNP showed significant association with asthma diagnosis or bronchodilator responsiveness (quantitative trait), a haplotype-based global test found a highly significant association with asthma diagnosis (P value <0.00005) and the measure of bronchodilator responsiveness (P value =0.016).

15.
Meta‐analysis of genome‐wide association studies (GWAS) has achieved great success in detecting loci underlying human diseases. Incorporating GWAS results from diverse ethnic populations for meta‐analysis, however, remains challenging because of the possible heterogeneity across studies. Conventional fixed‐effects (FE) or random‐effects (RE) methods may not be most suitable to aggregate multiethnic GWAS results because of violation of the homogeneous effect assumption across studies (FE) or low power to detect signals (RE). Three recently proposed methods, modified RE (RE‐HE) model, binary‐effects (BE) model and a Bayesian approach (Meta‐analysis of Transethnic Association [MANTRA]), show increased power over FE and RE methods while incorporating heterogeneity of effects when meta‐analyzing trans‐ethnic GWAS results. We propose a two‐stage approach to account for heterogeneity in trans‐ethnic meta‐analysis in which we clustered studies with cohort‐specific ancestry information prior to meta‐analysis. We compare this to a no‐prior‐clustering (crude) approach, evaluating type I error and power of these two strategies, in an extensive simulation study to investigate whether the two‐stage approach offers any improvements over the crude approach. We find that the two‐stage approach and the crude approach for all five methods (FE, RE, RE‐HE, BE, MANTRA) provide well‐controlled type I error. However, the two‐stage approach shows increased power for BE and RE‐HE, and similar power for MANTRA and FE compared to their corresponding crude approach, especially when there is heterogeneity across the multiethnic GWAS results. These results suggest that prior clustering in the two‐stage approach can be an effective and efficient intermediate step in meta‐analysis to account for the multiethnic heterogeneity.
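
The contrast between the conventional FE and RE models mentioned above can be sketched with Cochran's Q, the I² statistic, and the DerSimonian-Laird estimate of between-study variance. This is a minimal illustration of those standard quantities only. It does not implement RE‐HE, BE, MANTRA, or the two‐stage clustering approach, and the effect sizes and standard errors are hypothetical.

```python
def heterogeneity(betas, ses):
    """Return Cochran's Q, I^2, and the DerSimonian-Laird tau^2."""
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    fe = sum(wi * b for wi, b in zip(w, betas)) / sw  # fixed-effect estimate
    q = sum(wi * (b - fe) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2


def pooled_estimate(betas, ses, tau2=0.0):
    """Pooled effect and SE; tau2 = 0 gives FE, tau2 > 0 gives RE."""
    w = [1.0 / (se ** 2 + tau2) for se in ses]
    sw = sum(w)
    pooled = sum(wi * b for wi, b in zip(w, betas)) / sw
    return pooled, (1.0 / sw) ** 0.5


# Hypothetical effects from three cohorts; the third differs markedly.
betas = [0.10, 0.12, 0.30]
ses = [0.02, 0.02, 0.04]

q, i2, tau2 = heterogeneity(betas, ses)
fe_beta, fe_se = pooled_estimate(betas, ses)
re_beta, re_se = pooled_estimate(betas, ses, tau2)
print(f"Q = {q:.2f}, I^2 = {i2:.2f}, tau^2 = {tau2:.4f}")
print(f"FE: {fe_beta:.4f} (SE {fe_se:.4f}); RE: {re_beta:.4f} (SE {re_se:.4f})")
```

When heterogeneity is present, the RE standard error is wider than the FE one, which is one way to see the loss of power that the abstract attributes to the conventional RE model.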

16.
17.
Genetic association studies in admixed populations allow us to gain deeper understanding of the genetic architecture of human diseases and traits. However, population stratification, complicated linkage disequilibrium (LD) patterns, and the complex interplay of allelic and ancestry effects on phenotypic traits pose challenges in such analyses. These issues may lead to detecting spurious associations and/or result in reduced statistical power. Fortunately, if handled appropriately, these same challenges provide unique opportunities for gene mapping. To address these challenges and to take these opportunities, we propose a robust and powerful two‐step testing procedure, Local Ancestry Adjusted Allelic (LAAA) association. In the first step, LAAA robustly captures associations due to allelic effect, ancestry effect, and interaction effect, allowing detection of effect heterogeneity across ancestral populations. In the second step, LAAA identifies the source of association, namely allelic, ancestry, or the combination. By jointly modeling allele, local ancestry, and ancestry‐specific allelic effects, LAAA is highly powerful in capturing the presence of interaction between ancestry and allele effect. We evaluated the validity and statistical power of LAAA through simulations over a broad spectrum of scenarios. We further illustrated its usefulness by application to the Candidate Gene Association Resource (CARe) African American participants for association with hemoglobin levels. We were able to replicate independent groups’ previously identified loci that would have been missed in CARe without joint testing. Moreover, the loci for which LAAA detected potential effect heterogeneity were replicated among African Americans from the Women's Health Initiative study. LAAA is freely available at https://yunliweb.its.unc.edu/LAAA .

18.
When studying either qualitative or quantitative traits, tests of association in the presence of linkage are necessary for fine-mapping. In a previous report, we suggested a polytomous logistic approach to testing linkage and association between a di-allelic marker and a quantitative trait locus, using genotyped triads, consisting of an individual whose quantitative trait has been measured and his or her two parents. Here we extend that approach to incorporate marker information from entire nuclear families. By computing a weighted score function instead of a maximum likelihood test, we allow for both an unspecified correlation structure between siblings and "informative" family size. Both this approach and our original approach allow for population admixture by conditioning on parental genotypes. The proposed method allows for missing parental genotype data through a multiple imputation procedure. We use simulations based on a population with admixture to compare our method to a popular non-parametric family-based association test (FBAT), testing the null of no association in the presence of linkage.

19.
Multiple testing corrections for imputed SNPs   (total citations: 1; self-citations: 0; citations by others: 1)
Gao X. Genetic Epidemiology, 2011, 35(3): 154-158
Multiple testing corrections are an active research topic in genetic association studies, especially for genome-wide association studies (GWAS), where tests of association with traits are now conducted at millions of imputed SNPs with estimated allelic dosages. Failure to address multiple comparisons appropriately can introduce excess false-positive results and make subsequent studies following up those results inefficient. Permutation tests are considered the gold standard in multiple testing adjustment; however, this procedure is computationally demanding, especially for GWAS. Notably, the permutation thresholds for the huge number of estimated allelic dosages in real data sets have not been reported. Although many researchers have recently developed algorithms to rapidly approximate the permutation thresholds with accuracy similar to the permutation test, these methods have not been verified with estimated allelic dosages. In this study, we compare recently published multiple testing correction methods using 2.5M estimated allelic dosages. We also derive permutation significance levels based on 10,000 GWAS results under the null hypothesis of no association. Our results show that the simpleM method works well with estimated allelic dosages and gives the closest approximation to the permutation threshold while requiring the least computation time.

20.
A major challenge in genome‐wide association studies (GWASs) is to derive the multiple testing threshold when hypothesis tests are conducted using a large number of single nucleotide polymorphisms. Permutation tests are considered the gold standard in multiple testing adjustment in genetic association studies. However, they are computationally intensive, especially for GWASs, and can be impractical if a large number of random shuffles are used to ensure accuracy. Many researchers have developed approximation algorithms to relieve the computing burden imposed by permutation. One particularly attractive alternative to permutation is to calculate the effective number of independent tests, Meff, which has been shown to be promising in genetic association studies. In this study, we compare recently developed Meff methods and validate them by the permutation test with 10,000 random shuffles using two real GWAS data sets: an Illumina 1M BeadChip and an Affymetrix GeneChip® Human Mapping 500K Array Set. Our results show that the simpleM method produces the best approximation of the permutation threshold, and it does so in the shortest amount of time. We also show that Meff is indeed valid on a genome‐wide scale in these data sets based on statistical theory and significance tests. The significance thresholds derived can provide practical guidelines for other studies using similar population samples and genotyping platforms. Genet. Epidemiol. 34:100–105, 2010. © 2009 Wiley‐Liss, Inc.
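
The link between a permutation threshold and an effective number of tests can be sketched as follows. The code takes a table of null p-values (one row per random shuffle), uses the lower 5% quantile of the per-shuffle minimum p-value as the threshold, and converts it to an implied Meff through the Bonferroni relation alpha / Meff. Everything here is an assumption made for illustration: the p-values are simulated as independent uniforms rather than obtained by permuting real genotype data, the function name is hypothetical, and this is not the simpleM algorithm.

```python
import random


def permutation_threshold(null_pvalues, alpha=0.05):
    """Empirical alpha-quantile of the per-permutation minimum p-value."""
    min_ps = sorted(min(row) for row in null_pvalues)
    index = max(0, int(alpha * len(min_ps)) - 1)
    return min_ps[index]


random.seed(0)
n_tests, n_permutations = 50, 2000

# Stand-in for real permutation output: independent uniform null p-values.
# With real SNP data the tests are correlated through LD.
null_pvalues = [[random.random() for _ in range(n_tests)]
                for _ in range(n_permutations)]

threshold = permutation_threshold(null_pvalues)
implied_meff = 0.05 / threshold
print(f"permutation threshold = {threshold:.5f}, implied Meff = {implied_meff:.1f}")
```

Because the simulated tests are independent, the implied Meff should land near the actual number of tests. With correlated markers it would be smaller than the number of SNPs, which is the gap that Meff methods aim to estimate without running the permutations.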
