Similar literature
 20 similar articles found
1.
The family-based admixture mapping test (AMT) identifies disease-related genes using family data from admixed individuals with the disease of interest (cases). The cases' genotypes at a set of markers are used to infer their DNA ancestry as it varies in blocks along the chromosomes. The test compares the cases' inferred ancestries to those expected from their family histories. Deviation between observed and expected ancestries in a region suggests the presence of a disease gene. We use a likelihood-based development of the AMT to compare it with the transmission disequilibrium test (TDT) as applied to admixed populations. The two tests have a common framework but differ significantly when the disease locus is untyped. The TDT infers disease-locus genotypes using the markers with which it is in linkage disequilibrium (LD). In contrast, the AMT infers disease locus ancestries using those of its linked markers. Thus, TDT power depends on LD between disease and marker loci, while AMT power depends on the lengths of the ancestry blocks containing the disease locus. We compare the power of the two tests when applied to cases with descent from two ancestral populations. The AMT outperforms the TDT when case marker ancestries are correctly specified and LD between disease and marker loci is less than one-third its maximal value (Delta' < 1/3). However, the TDT performs better in the presence of uncertain marker ancestries, even for weak LD between disease and marker loci (Delta' = 0.1). These findings have implications for the design of studies using admixed populations.
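As a hedged illustration of the TDT named in the abstract above (this is not code from the cited paper): the classical single-marker TDT counts, among heterozygous parents of affected offspring, how often a given allele is transmitted versus not transmitted, and compares the two counts with a McNemar-type chi-square statistic on 1 degree of freedom. The function names and the example counts are hypothetical.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """McNemar-type TDT chi-square (1 df) from heterozygous-parent counts.

    transmitted   -- times the allele of interest was passed to an affected child
    untransmitted -- times it was not passed
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    # For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration.
    stat = tdt_statistic(transmitted=60, untransmitted=40)
    print(stat, chi2_1df_pvalue(stat))
```

Under the null hypothesis of no linkage or no association, transmission and non-transmission are equally likely, so the statistic grows only when one allele is preferentially transmitted to cases.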

2.
In genome‐wide association studies (GWAS) genetic loci that influence complex traits are localized by inspecting associations between genotypes of genetic markers and the values of the trait of interest. On the other hand, admixture mapping, which is performed in case of populations consisting of a recent mix of two ancestral groups, relies on the ancestry information at each locus (locus‐specific ancestry). Recently it has been proposed to jointly model genotype and locus‐specific ancestry within the framework of single marker tests. Here, we extend this approach for population‐based GWAS in the direction of multimarker models. A modified version of the Bayesian information criterion is developed for building a multilocus model that accounts for the differential correlation structure due to linkage disequilibrium (LD) and admixture LD. Simulation studies and a real data example illustrate the advantages of this new approach compared to single‐marker analysis or modern model selection strategies based on separately analyzing genotype and ancestry data, as well as to single‐marker analysis combining genotypic and ancestry information. Depending on the signal strength, our procedure automatically chooses whether genotypic or locus‐specific ancestry markers are added to the model. This results in a good compromise between the power to detect causal mutations and the precision of their localization. The proposed method has been implemented in R and is available at http://www.math.uni.wroc.pl/~mbogdan/admixtures/ .

3.
Population substructure can lead to confounding in tests for genetic association, and failure to adjust properly can result in spurious findings. Here we address this issue of confounding by considering the impact of global ancestry (average ancestry across the genome) and local ancestry (ancestry at a specific chromosomal location) on regression parameters and relative power in ancestry‐adjusted and ‐unadjusted models. We examine theoretical expectations under different scenarios for population substructure; applying different regression models, verifying and generalizing using simulations, and exploring the findings in real‐world admixed populations. We show that admixture does not lead to confounding when the trait locus is tested directly in a single admixed population. However, if there is more complex population structure or a marker locus in linkage disequilibrium (LD) with the trait locus is tested, both global and local ancestry can be confounders. Additionally, we show the genotype parameters of adjusted and unadjusted models all provide tests for LD between the marker and trait locus, but in different contexts. The local ancestry adjusted model tests for LD in the ancestral populations, while tests using the unadjusted and the global ancestry adjusted models depend on LD in the admixed population(s), which may be enriched due to different ancestral allele frequencies. Practically, this implies that global‐ancestry adjustment should be used for screening, but local‐ancestry adjustment may better inform fine mapping and provide better effect estimates at trait loci.

4.
We describe statistical methods that extend the application of admixture mapping from unrelated individuals to nuclear pedigrees, allowing existing pedigree‐based collections to be fully exploited. Computational challenges have been overcome by developing a fast algorithm that exploits the factorial structure of the underlying model of ancestry transitions. This has been implemented as an extension of the program ADMIXMAP. We demonstrate the application of the method to a study of sarcoidosis in African Americans that has previously been analyzed only as an admixture mapping study restricted to unrelated individuals. Although the ancestry signals detected in this pedigree analysis are generally similar to those detected in the earlier analysis of unrelated cases, we are able to extract more information and this yields a much sharper exclusion map; using the classical criterion of an LOD score of minus 2, the pedigree analysis is able to exclude a risk ratio of 2 or more associated with African ancestry over 96% of the genome, compared with only 83% in the earlier analysis of unrelated individuals only. Although the pedigree extension of ADMIXMAP can use ancestry‐informative markers only at relatively low density, it can use imputed ancestry states from programs such as WINPOP or HAPMIX that use dense SNP marker genotypes for admixture mapping. This extends both the efficiency and the range of application of this powerful gene mapping method.

5.
Admixture mapping is a widely used method for localizing disease genes in African Americans. Most current methods for inferring ancestry at each locus in the genome use a few thousand single nucleotide polymorphisms (SNPs) that are very different in frequency between West Africans and European Americans, and that are required to not be in linkage disequilibrium in the ancestral populations. Modern SNP arrays provide data on hundreds of thousands of SNPs per sample, and to use these to infer ancestry, using many of the standard methods, it is necessary to choose subsets of the SNPs for analysis. Here we present panels of about 4,300 ancestry informative markers (AIMs) that are subsets respectively of SNPs on the Illumina 1 M, Illumina 650, Illumina 610, Affymetrix 6.0 and Affymetrix 5.0 arrays. To validate the usefulness of these panels, we applied them to samples that are different from the ones used to select the SNPs. The panels provide about 80% of the maximum information about African or European ancestry, even with up to 10% missing data.

6.
We investigated the effect of multiple susceptibility alleles at a single disease locus on the statistical power of a likelihood ratio test to detect association between alleles at a marker locus and a disease phenotype in a case-control design. Using simplifying assumptions to obtain the joint frequency distribution of marker and disease locus alleles, we present numerical results that illustrate the impact of historical variation of initial associations between marker alleles and susceptibility alleles on the power of a likelihood ratio test for association. Our results show that an increase in the number of susceptibility alleles produces a decrease in power of the likelihood ratio test. The decrease in power in the presence of multiple susceptibility alleles, however, is less for markers with multiple alleles than for markers with two alleles. We investigate the implications of this observation for tests of association based on haplotypes made up of tightly linked single-nucleotide polymorphisms (SNPs). Our results suggest that an analysis based on haplotypes can be advantageous over an analysis based on individual SNPs in the presence of multiple susceptibility alleles, particularly when linkage disequilibrium between SNPs is weak. The results provide motivation for further development of statistical methods based on haplotypes for assessing the potential for association methods to identify and locate complex disease genes.

7.
For a dense set of genetic markers such as single nucleotide polymorphisms (SNPs) in high linkage disequilibrium within a small candidate region, a haplotype-based approach for testing association between a disease phenotype and the set of markers is attractive in reducing the data complexity and increasing the statistical power. However, due to the unknown status of the underlying disease variant, a comprehensive association test may require consideration of various combinations of the SNPs, which often leads to severe multiple testing problems. In this paper, we propose a latent variable approach to test for association of multiple tightly linked SNPs in case-control studies. First, we introduce a latent variable into the penetrance model to characterize a putative disease susceptible locus (DSL) that may consist of a marker allele, a haplotype from a subset of the markers, or an allele at a putative locus between the markers. Next, by using a retrospective likelihood to adjust for the case-control sampling ascertainment and appropriately handle the Hardy-Weinberg equilibrium constraint, we develop an expectation-maximization (EM)-based algorithm to fit the penetrance model and estimate the joint haplotype frequencies of the DSL and markers simultaneously. With the latent variable to describe a flexible role of the DSL, the likelihood ratio statistic can then provide a joint association test for the set of markers without requiring an adjustment for testing of multiple haplotypes. Our simulation results also reveal that the latent variable approach may have improved power under certain scenarios compared with classical haplotype association methods.

8.
Association analysis using admixed populations imposes challenges and opportunities for disease mapping. By developing some explicit results for the variance of an allele of interest conditional on either local or global ancestry and by simulation of recently admixed genomes we evaluate power and false‐positive rates under a variety of scenarios concerning linkage disequilibrium (LD) and the presence of unmeasured variants. Pairwise LD patterns were compared between admixed and nonadmixed populations using the HapMap phase 3 data. Based on the above, we showed that as follows:

9.
Qin H, Zhu X. Genetic Epidemiology, 2012, 36(3): 235-243
When dense markers are available, one can interrogate almost every common variant across the genome via imputation and single nucleotide polymorphism (SNP) tests, which has become routine in current genome-wide association studies (GWASs). As a complement, admixture mapping exploits the long-range linkage disequilibrium (LD) generated by admixture between genetically distinct ancestral populations. It is then questionable whether admixture mapping analysis is still necessary in detecting the disease associated variants in admixed populations. We argue that admixture mapping is able to reduce the burden of massive comparisons in GWASs; it therefore can be a powerful tool to locate the disease variants with substantial allele frequency differences between ancestral populations. In this report we studied a two-stage approach, where candidate regions are defined by conducting admixture mapping at stage 1, and single SNP association tests follow at stage 2 within the candidate regions defined at stage 1. We first established the genome-wide significance levels corresponding to the criteria to define the candidate regions at stage 1 by simulations. We next compared the power of the two-stage approach with direct association analysis. Our simulations suggest that the two-stage approach can be more powerful than the standard genome-wide association analysis when the allele frequency difference of a causal variant in ancestral populations is larger than 0.4. Our conclusion is consistent with a theoretical prediction by Risch and Tang ([2006] Am J Hum Genet 79:S254). Surprisingly, our study also suggests that power can be improved when we use less strict criteria to define the candidate regions at stage 1.

10.
Current genome-wide association studies (GWAS) often involve populations that have experienced recent genetic admixture. Genotype data generated from these studies can be used to test for association directly, as in a non-admixed population. As an alternative, these data can be used to infer chromosomal ancestry, and thus allow for admixture mapping. We quantify the contribution of allele-based and ancestry-based association testing under a family design, and demonstrate that the two tests can provide non-redundant information. We propose a joint testing procedure, which efficiently integrates the two sources of information. The efficiencies of the allele, ancestry and combined tests are compared in the context of a GWAS. We discuss the impact of population history and provide guidelines for future design and analysis of GWAS in admixed populations.

11.
There have been many single nucleotide polymorphism-based tests suggested for association analysis in a case-control design. The possible evidence for association comprises three types of information: differences between cases and controls in allele frequencies, in parameters for Hardy Weinberg disequilibrium (HWD) and in parameters for linkage disequilibrium (LD). Here, first we find the pairwise covariances between statistics that measure these three types of information and show that the statistics are asymptotically trivariate normally distributed. Then we compare their power analytically to determine the most informative statistics according to the disease model. Our results show that differences in parameters for HWD are informative for dominant and recessive disease models, while differences in allele frequencies and in parameters for LD are generally informative except for rare recessive disease models. There is mutual independence of the statistics that detect these three differences under Hardy Weinberg equilibrium at the marker locus and linkage equilibrium between markers in the population. Knowing the pairwise covariances between the statistics makes it possible to define statistics that are mutually independent. This allows us to perform sequential analyses of the same data without the need to adjust significance levels for all the multiple analyses being performed on the same data set. As a result we can have improved flexible strategies to increase the power of genome-wide association studies without requiring the collection of a new, independent sample.

12.
During the last decade genome-wide association studies have proven to be a powerful approach to identifying disease-causing variants. However, for admixed populations, most current methods for association testing are based on the assumption that the effect of a genetic variant is the same regardless of its ancestry. This is a reasonable assumption for a causal variant but may not hold for the genetic variants that are tested in genome-wide association studies, which are usually not causal. The effects of noncausal genetic variants depend on how strongly their presence correlates with the presence of the causal variant, which may vary between ancestral populations because of different linkage disequilibrium patterns and allele frequencies. Motivated by this, we here introduce a new statistical method for association testing in recently admixed populations, where the effect size is allowed to depend on the ancestry of a given allele. Our method does not rely on accurate inference of local ancestry, yet using simulations we show that in some scenarios it gives a substantial increase in statistical power to detect associations. In addition, the method allows for testing for difference in effect size between ancestral populations, which can be used to help determine if a given genetic variant is causal. We demonstrate the usefulness of the method on data from the Greenlandic population.

13.
The usefulness of association studies for fine mapping loci with common susceptibility alleles for complex genetic diseases in outbred populations is unclear. We investigate this issue for a battery of tightly linked anonymous genetic markers spanning a candidate region centered around a disease locus, and study the joint behavior of chi-square statistics used to discover and to localize the disease locus. We used simulation methods based on a coalescent process with mutation, recombination, and genetic drift to examine the spatial distribution of markers with large noncentrality parameters in a case-control study design. Simulations with a disease allele at intermediate frequency, presumably representing an old mutation, tend to exhibit the largest noncentrality parameter values at markers near the disease locus. In contrast, simulations with a disease allele at low frequency, presumably representing a young mutation, often exhibit the largest noncentrality parameter values at markers scattered over the candidate region. In the former cases, sample sizes or marker densities sufficient to detect association are likely to lead to useful localization, whereas, in the latter case, localization of the disease locus within the candidate region is much less likely, regardless of the sample size or density of the map. The effects of increasing sample size or marker density are also investigated. Based upon a single marker analysis, we find that a simple strategy of choosing the marker with the smallest associated P value to begin a laboratory search for the disease locus performs adequately for a common disease allele. We also investigated a strategy of pooling nearby sites to form multiple allele markers. Using multiple degree of freedom chi-square tests for two or three nearby sites, we found no clear advantage of this form of pooling over a single marker analysis. Genet. Epidemiol. 20:432-457, 2001. Published by Wiley-Liss, 2001.

14.
Characterization of genetic admixture of populations in the Americas and the Caribbean is of interest for anthropological, epidemiological, and historical reasons. Asthma has a higher prevalence and is more severe in populations with a high African component. Association of African ancestry with asthma has been demonstrated. We estimated admixture proportions of samples from six trihybrid populations of African descent and determined the relationship between African ancestry and asthma and total serum IgE levels (tIgE). We genotyped 237 ancestry informative markers in asthmatics and nonasthmatic controls from Barbados (190/277), Jamaica (177/529), Brazil (40/220), Colombia (508/625), African Americans from New York (207/171), and African Americans from Baltimore/Washington, D.C. (625/757). We estimated individual ancestries and evaluated genetic stratification using Structure and principal component analysis. Association of African ancestry and asthma and tIgE was evaluated by regression analysis. Mean ± SD African ancestry ranged from 0.76 ± 0.10 among Barbadians to 0.33 ± 0.13 in Colombians. The European component varied from 0.14 ± 0.05 among Jamaicans and Barbadians to 0.26 ± 0.08 among Colombians. African ancestry was associated with risk for asthma in Colombians (odds ratio (OR) = 4.5, P = 0.001), Brazilians (OR = 136.5, P = 0.003), and African Americans of New York (OR: 4.7; P = 0.040). African ancestry was also associated with higher tIgE levels among Colombians (β = 1.3, P = 0.04), Barbadians (β = 3.8, P = 0.03), and Brazilians (β = 1.6, P = 0.03). Our findings indicate that African ancestry can account for, at least in part, the association between asthma and its associated trait, tIgE levels.

15.
The increasing availability of maps of dense polymorphic markers makes use of haplotype data in family-based association analyses an attractive alternative to single marker association tests. We describe a novel class of statistics designed to test for an association between marker haplotypes and a qualitative trait using the parent-parent-affected-offspring trio design. Our haplotype runs test (HRT) is based on consecutive allele-sharing between pairs of haplotypes. We assign weights according to the relative frequencies of the alleles for which the two haplotypes match. Herein, we compare the HRT to the maximum-identity-length-contrast (MILC) statistic, the single-locus transmission/disequilibrium test (TDT), and the generalized test of transmission disequilibrium for haplotype data, as implemented in the software TRANSMIT, using both simulated data and published haplotype data from the recessive disorder ataxia-telangiectasia. Our simulation results suggest that the HRT outperforms the MILC and that the HRT provides comparable power to the TDT and TRANSMIT when the number of distinct founder haplotypes with a disease susceptibility allele is small but substantially outperforms the TDT and TRANSMIT when the number of distinct founder haplotypes with a disease susceptibility allele is even of modest size.

16.
The National Human Genome Research Institute's catalog of published genome‐wide association studies (GWAS) lists over 10,000 genetic variants collectively associated with over 800 human diseases or traits. Most of these GWAS have been conducted in European‐ancestry populations. Findings gleaned from these studies have led to identification of disease‐associated loci and biologic pathways involved in disease etiology. In multiple instances, these genomic findings have led to the development of novel medical therapies or evidence for prescribing a given drug as the appropriate treatment for a given individual beyond phenotypic appearances or socially defined constructs of race or ethnicity. Such findings have implications for populations throughout the globe and GWAS are increasingly being conducted in more diverse populations. A major challenge for investigators seeking to follow up genomic findings between diverse populations is discordant patterns of linkage disequilibrium (LD). We provide an overview of common measures of LD and opportunities for their use in novel methods designed to address challenges associated with following up GWAS conducted in European‐ancestry populations in African‐ancestry populations or, more generally, between populations with discordant LD patterns. We detail the strengths and weaknesses associated with different approaches. We also describe application of these strategies in follow‐up studies of populations with concordant LD patterns (replication) or discordant LD patterns (transferability) as well as fine‐mapping studies. We review application of these methods to a variety of traits and diseases.
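The abstract above refers to common measures of LD without naming them. As a hedged sketch (an assumption about which measures are meant, not content from the cited review), the textbook pairwise measures for two biallelic loci are the coefficient D, its normalized form D', and the squared correlation r². The function name and the example frequencies are hypothetical.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first locus  (0 < p_a < 1)
    p_b  -- frequency of allele B at the second locus (0 < p_b < 1)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b
    # The largest |D| attainable depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies chosen only for illustration.
    print(ld_measures(p_ab=0.40, p_a=0.50, p_b=0.50))
```

Because both D' and r² depend on the allele frequencies in the population sampled, the same pair of loci can give quite different values in two populations, which is one way to quantify the discordant LD patterns the abstract describes.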

17.
Genetic association studies in admixed populations allow us to gain deeper understanding of the genetic architecture of human diseases and traits. However, population stratification, complicated linkage disequilibrium (LD) patterns, and the complex interplay of allelic and ancestry effects on phenotypic traits pose challenges in such analyses. These issues may lead to detecting spurious associations and/or result in reduced statistical power. Fortunately, if handled appropriately, these same challenges provide unique opportunities for gene mapping. To address these challenges and to take these opportunities, we propose a robust and powerful two‐step testing procedure, Local Ancestry Adjusted Allelic (LAAA) association. In the first step, LAAA robustly captures associations due to allelic effect, ancestry effect, and interaction effect, allowing detection of effect heterogeneity across ancestral populations. In the second step, LAAA identifies the source of association, namely allelic, ancestry, or the combination. By jointly modeling allele, local ancestry, and ancestry‐specific allelic effects, LAAA is highly powerful in capturing the presence of interaction between ancestry and allele effect. We evaluated the validity and statistical power of LAAA through simulations over a broad spectrum of scenarios. We further illustrated its usefulness by application to the Candidate Gene Association Resource (CARe) African American participants for association with hemoglobin levels. We were able to replicate independent groups’ previously identified loci that would have been missed in CARe without joint testing. Moreover, the loci, for which LAAA detected potential effect heterogeneity, were replicated among African Americans from the Women's Health Initiative study. LAAA is freely available at https://yunliweb.its.unc.edu/LAAA .

18.
19.
Admixed populations arise when two or more previously isolated populations interbreed. Admixture mapping (AM) methods are used for tracing the ancestral origin of disease-susceptibility genetic loci in admixed populations such as African Americans and Latinos. AM is different from genome-wide association studies in that ancestry rather than genotype is tracked in the association process. The power and sample size of AM primarily depend on the proportion of admixture and differences in the risk allele frequencies among the ancestral populations. Ensuring sufficient power to detect the effect of ancestry on disease susceptibility is critical for interpretability and reliability of studies using the AM approach. However, no power and sample size analysis tool exists for AM studies in admixed populations. In this study, we developed power analysis of multiancestry AM (PAMAM) to estimate power and sample size for two-way and three-way population admixtures. PAMAM is the first web-based bioinformatics tool developed to calculate power and sample size in admixed populations under a variety of genetic and disease phenotype models. It is a valuable resource for investigators to design a cost-efficient study and develop grant applications to pursue AM studies. PAMAM is built on a JavaScript back-end with an HTML front-end. It is accessible through any modern web browser such as Firefox, Internet Explorer, and Google Chrome regardless of operating system. It is a user-friendly tool containing links for support information including a user manual and examples, and is freely available at https://research.cchmc.org/mershalab/PAMAM/login.html .

20.
We describe a novel method for inferring the local ancestry of admixed individuals from dense genome‐wide single nucleotide polymorphism data. The method, called MULTIMIX, allows multiple source populations, models population linkage disequilibrium between markers and is applicable to datasets in which the sample and source populations are either phased or unphased. The model is based upon a hidden Markov model of switches in ancestry between consecutive windows of loci. We model the observed haplotypes within each window using a multivariate normal distribution with parameters estimated from the ancestral panels. We present three methods to fit the model—Markov chain Monte Carlo sampling, the Expectation Maximization algorithm, and a Classification Expectation Maximization algorithm. The performance of our method on individuals simulated to be admixed with European and West African ancestry shows it to be comparable to HAPMIX, the ancestry calls of the two methods agreeing at 99.26% of loci across the three parameter groups. In addition to it being faster than HAPMIX, it is also found to perform well over a range of extent of admixture in a simulation involving three ancestral populations. In an analysis of real data, we estimate the contribution of European, West African and Native American ancestry to each locus in the Mexican samples of HapMap, giving estimates of ancestral proportions that are consistent with those previously reported.
