Similar literature
1.
The genetic case-control association study of unrelated subjects is a leading method to identify single nucleotide polymorphisms (SNPs) and SNP haplotypes that modulate the risk of complex diseases. Association studies often genotype several SNPs in a number of candidate genes; we propose a two-stage approach to address the inherent statistical multiple comparisons problem. In the first stage, each gene's association with disease is summarized by a single p-value that controls a familywise error rate. In the second stage, summary p-values are adjusted for multiplicity using a false discovery rate (FDR) controlling procedure. For the first stage, we consider marginal and joint tests of SNPs and haplotypes within genes, and we construct an omnibus test that combines SNP and haplotype analysis. Simulation studies show that when disease susceptibility is conferred by a SNP, and all common SNPs in a gene are genotyped, marginal analysis of SNPs using the Simes test has similar or higher power than marginal or joint haplotype analysis. Conversely, haplotype analysis can be more powerful when disease susceptibility is conferred by a haplotype. The omnibus test tracks the more powerful of the two approaches, which is generally unknown. Multiple testing balances the desire for statistical power against the implicit costs of false positive results, which up to now appear to be common in the literature.
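The two-stage idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the Simes test as the first-stage gene summary (one of the options the abstract names) and assumes Benjamini-Hochberg as the second-stage FDR procedure, which the abstract does not specify. The function names and the input layout (a dictionary of per-SNP p-values keyed by gene) are hypothetical.

```python
def simes_p(pvalues):
    """Simes combined p-value for one gene: min over i of m * p_(i) / i."""
    ordered = sorted(pvalues)
    m = len(ordered)
    return min(1.0, min(m * p / rank for rank, p in enumerate(ordered, start=1)))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def two_stage(snp_pvalues_by_gene):
    """Stage 1: one Simes p-value per gene. Stage 2: BH across genes (assumed)."""
    genes = list(snp_pvalues_by_gene)
    stage1 = [simes_p(snp_pvalues_by_gene[g]) for g in genes]
    stage2 = benjamini_hochberg(stage1)
    return dict(zip(genes, stage2))
```

Calling `two_stage({"GENE_A": [0.001, 0.2], "GENE_B": [0.4, 0.6]})` returns one FDR-adjusted p-value per gene; the gene labels here are placeholders.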

2.
Genome‐wide association studies (GWASs) commonly use marginal association tests for each single‐nucleotide polymorphism (SNP). Because these tests treat SNPs as independent, their power will be suboptimal for detecting SNPs hidden by linkage disequilibrium (LD). One way to improve power is to use a multiple regression model. However, the large number of SNPs precludes simultaneous fitting with multiple regression, and subset regression is infeasible because of an exorbitant number of candidate subsets. We therefore propose a new method for detecting hidden SNPs having significant yet weak marginal association in a multiple regression model. Our method begins by constructing a bidirected graph locally around each SNP that demonstrates a moderately sized marginal association signal (the focal SNPs). Vertices correspond to SNPs, and adjacency between vertices is defined by an LD measure. Subsequently, the method collects from each graph all shortest paths to the focal SNP. Finally, for each shortest path the method fits a multiple regression model to all the SNPs lying in the path and tests the significance of the regression coefficient corresponding to the terminal SNP in the path. Simulation studies show that the proposed method can detect susceptibility SNPs hidden by LD that go undetected with marginal association testing or with existing multivariate methods. When applied to real GWAS data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), our method detected two groups of SNPs: one in a region containing the apolipoprotein E (APOE) gene, and another in a region close to the semaphorin 5A (SEMA5A) gene.

3.
We consider detecting associations between a trait and multiple single nucleotide polymorphisms (SNPs) in linkage disequilibrium (LD). To maximize the use of information contained in multiple SNPs while minimizing the cost of large degrees of freedom (DF) in testing multiple parameters, we first theoretically explore the sum test derived under a working assumption of a common association strength between the trait and each SNP, testing on the corresponding parameter with only one DF. Under the scenarios that the association strengths between the trait and the SNPs are close to each other (and in the same direction), as considered by Wang and Elston [Am. J. Hum. Genet. [2007] 80:353–360], we show with simulated data that the sum test was powerful as compared to several existing tests; otherwise, the sum test might have much reduced power. To overcome the limitation of the sum test, based on our theoretical analysis of the sum test, we propose five new tests that are closely related to each other and are shown to consistently perform similarly well across a wide range of scenarios. We point out the close connection of the proposed tests to the Goeman test. Furthermore, we derive the asymptotic distributions of the proposed tests so that P‐values can be easily calculated, in contrast to the use of computationally demanding permutations or simulations for the Goeman test. A distinguishing feature of the five new tests is their use of a diagonal working covariance matrix, rather than a full covariance matrix as used in the usual Wald or score test. We recommend the routine use of two of the new tests, along with several other tests, to detect disease associations with multiple linked SNPs. Genet. Epidemiol. 33:497–507, 2009. © 2009 Wiley‐Liss, Inc.

4.
The role of haplotypes in candidate gene studies   Total citations: 24, self-citations: 0, citations by others: 24
Human geneticists working on systems for which it is possible to make a strong case for a set of candidate genes face the problem of whether it is necessary to consider the variation in those genes as phased haplotypes, or whether the one-SNP-at-a-time approach might perform as well. There are three reasons why the phased haplotype route should be an improvement. First, the protein products of the candidate genes occur in polypeptide chains whose folding and other properties may depend on particular combinations of amino acids. Second, population genetic principles show us that variation in populations is inherently structured into haplotypes. Third, the statistical power of association tests with phased data is likely to be improved because of the reduction in dimension. However, in reality it takes a great deal of extra work to obtain valid haplotype phase information, and inferred phase information may simply compound the errors. In addition, if the causal connection between SNPs and a phenotype is truly driven by just a single SNP, then the haplotype-based approach may perform worse than the one-SNP-at-a-time approach. Here we examine some of the factors that affect haplotype patterns in genes, how haplotypes may be inferred, and how haplotypes have been useful in the context of testing association between candidate genes and complex traits.

5.
Recent studies have shown that quantitative phenotypes may be influenced not only by multiple single nucleotide polymorphisms (SNPs) within a gene but also by the interaction between SNPs at unlinked genes. We propose a new statistical approach that can detect gene‐gene interactions at the allelic level which contribute to the phenotypic variation in a quantitative trait. By testing for the association of allelic combinations at multiple unlinked loci with a quantitative trait, we can detect the SNP allelic interaction whether or not it can be detected as a main effect. Our proposed method assigns a score to unrelated subjects according to their allelic combination inferred from observed genotypes at two or more unlinked SNPs, and then tests for the association of the allelic score with a quantitative trait. To investigate the statistical properties of the proposed method, we performed a simulation study to estimate type I error rates and power and demonstrated that this allelic approach achieves greater power than the more commonly used genotypic approach to test for gene‐gene interaction. As an example, the proposed method was applied to data obtained as part of a candidate gene study of sodium retention by the kidney. We found that this method detects an interaction between the calcium‐sensing receptor gene (CaSR), the chloride channel gene (CLCNKB) and the Na, K, 2Cl cotransporter gene (CLC12A1) that contributes to variation in diastolic blood pressure. Genet. Epidemiol. 2009. © 2008 Wiley‐Liss, Inc.

6.
Using the Genetic Analysis Workshop 12 simulated data, we contrasted results for association tests in nuclear families and extended pedigrees using single‐nucleotide polymorphism (SNP) data, and we compared results for different trait definitions, for outbred and isolate populations, and for SNP and microsatellite data. SNPs in major genes 1 and 6 were analyzed using transmission disequilibrium testing (TDT) [Spielman et al., Am J Hum Genet 52:506–16, 1993], sibship disequilibrium testing (SDT) [Horvath and Laird, Am J Hum Genet 63:1886–97, 1998], family‐based association testing (FBAT) [Horvath et al., Eur J Hum Genet 9:301–6, 2001], and a chi‐square analysis of founders. TDT and SDT were applied in a sample of independent nuclear families, while FBAT was applied in extended pedigrees. SNPs and microsatellites were analyzed with dichotomous and quantitative trait definitions using FBAT in the isolate and outbred populations. The results of the TDT, SDT, and FBAT analyses are comparable using SNP data to identify the disease gene. However, these tests of association were not helpful in discriminating between functional and non‐functional SNPs in disequilibrium. SNP data were able to identify association with affection status in a gene that influences the liability directly (MG6), but did not perform as well when assessing association with affection status in a gene that influences the outcome only through a quantitative trait (MG1). Association with MG1 was observed using the SNP data when the outcome was defined quantitatively. Microsatellite data were relatively unsuccessful in identifying association with the markers in the region of a major gene. The magnitude of the associations between SNPs and the dichotomous or quantitative trait definitions was similar in the outbred and isolated populations. © 2001 Wiley‐Liss, Inc.

7.
Hao K, Xu X, Laird N, Wang X, Xu X. Genetic Epidemiology, 2004, 26(1): 22-30
At the current stage, a large number of single nucleotide polymorphisms (SNPs) have been deployed in searching for genes underlying complex diseases. A powerful method is desirable for efficient analysis of SNP data. Recently, a novel method for multiple SNP association test using a combination of allelic association (AA) and Hardy-Weinberg disequilibrium (HWD) has been proposed. However, the power of this test has not been systematically examined. In this study, we conducted a simulation study to further evaluate the statistical power of the new procedure, as well as the influence of the HWD on its performance. The simulation examined the scenarios of multiple disease SNPs among a candidate pool, assuming different parameters including allele frequencies and risk ratios, dominant, additive, and recessive genetic models, and the existence of gene-gene interactions and linkage disequilibrium (LD). We also evaluated the performance of this test in capturing real disease associated SNPs, when a significant global P value is detected. Our results suggest that this new procedure is more powerful than conventional single-point analyses with correction of multiple testing. However, inclusion of HWD reduces the power under most circumstances. We applied the novel association test procedure to a case-control study of preterm delivery (PTD), examining the effects of 96 candidate gene SNPs concurrently, and detected a global P value of 0.0250 by using Cochran-Armitage chi-square statistics as "starting" statistics in the procedure. In the following single point analysis, SNPs on IL1RN, IL1R2, ESR1, Factor 5, and OPRM1 genes were identified as possible risk factors in PTD.

8.
Genome‐wide association studies allow detection of non‐genotyped disease‐causing variants through testing of nearby genotyped SNPs. This approach may fail when there are no genotyped SNPs in strong LD with the causal variant. Several genotyped SNPs in weak LD with the causal variant may, however, when considered together, provide equivalent information. This observation motivates popular but computationally intensive approaches based on imputation or haplotyping. Here we present a new method and accompanying software designed for this scenario. Our approach proceeds by selecting, for each genotyped “anchor” SNP, a nearby genotyped “partner” SNP, chosen via a specific algorithm we have developed. These two SNPs are used as predictors in linear or logistic regression analysis to generate a final significance test. In simulations, our method captures much of the signal captured by imputation, while taking a fraction of the time and disc space, and generating a smaller number of false‐positives. We apply our method to a case/control study of severe malaria genotyped using the Affymetrix 500K array. Previous analysis showed that fine‐scale sequencing of a Gambian reference panel in the region of the known causal locus, followed by imputation, increased the signal of association to genome‐wide significance levels. Our method also increases the signal of association from to . Our method thus, in some cases, eliminates the need for more complex methods such as sequencing and imputation, and provides a useful additional test that may be used to identify genetic regions of interest.

9.
When many correlated traits are measured, the potential exists to discover the coordinated control of these traits via genotyped polymorphisms. A common statistical approach to this problem involves assessing the relationship between each phenotype and each single nucleotide polymorphism (SNP) individually (PHN) and applying a Bonferroni correction for the effective number of independent tests conducted. Alternatively, one can apply a dimension reduction technique, such as estimation of principal components, and test for an association with the principal components of the phenotypes (PCP) rather than the individual phenotypes. Building on the work of Lange and colleagues we develop an alternative method based on the principal component of heritability (PCH). For each SNP the PCH approach reduces the phenotypes to a single trait that has a higher heritability than any other linear combination of the phenotypes. As a result, the association between a SNP and derived trait is often easier to detect than an association with any of the individual phenotypes or the PCP. When applied to unrelated subjects, PCH has a drawback. For each SNP it is necessary to estimate the vector of loadings that maximize the heritability over all phenotypes. We develop a method of iterated sample splitting that uses one portion of the data for training and the remainder for testing. This cross-validation approach maintains the type I error control and yet utilizes the data efficiently, resulting in a powerful test for association.

10.
Weir BL. Genetic Epidemiology, 2001, 21(Z1): S415-S420
A range of study designs, using unrelated or family controls, were used to investigate the pattern of association with disease of single nucleotide polymorphisms (SNPs) within candidate gene 1 (simulated data). Strong evidence of disease association at the functional locus was detected using all study designs, and in the "general" but not the "isolated" population the functional polymorphism displayed considerably higher association than surrounding SNPs. There was much variation in the strength of association of SNPs with disease, up to 70% of which was explained by SNP allele frequency and distance from the functional polymorphism. Some common polymorphisms very close to the functional locus however showed no association with disease. Analysis of short haplotypes of SNPs reduced but did not totally remove this feature.

11.
A novel method for joint detection of association caused by linkage disequilibrium (LD) and estimation of both recombination fraction and linkage disequilibrium parameters was compared to several existing implementations of the transmission/disequilibrium test (TDT) and modifications of the TDT in the simulated genetic isolate data from Genetic Analysis Workshop 12. The first completely genotyped trio of affected child and parents was selected from each family in each replicate so that the TDT tests are valid tests of linkage and association, rather than being only valid as tests for linkage. In general, power to detect LD using the genome‐wide scan markers was inadequate in the individual replicate samples, but the power was better when analyzing several SNP markers in candidate gene 1. © 2001 Wiley‐Liss, Inc.

12.
We provide a general purpose family-based testing strategy for associating disease phenotypes with haplotypes when phase may be ambiguous and parental genotype data may be missing. These tests for linkage and association can be used in candidate gene studies with tightly linked markers. Our proposed weighted conditional approach extends the method described in Rabinowitz and Laird to multiple markers. It is attractive because it provides haplotype tests for family-based studies that are efficient and robust to population admixture, phenotype distribution specification, and ascertainment based on phenotypes. It can handle missing parental genotypes and/or missing phase in both offspring and parents. It yields either haplotype-specific (univariate) tests or multi-haplotype (global) tests. This extension has been implemented in the freely available software haplotype FBAT. We used the haplotype FBAT program to test for associations between asthma phenotypes and single nucleotide polymorphisms (SNPs) in the beta-2 adrenergic receptor gene. Whereas no single SNP showed significant association with asthma diagnosis or bronchodilator responsiveness (quantitative trait), a haplotype-based global test found a highly significant association with asthma diagnosis (P value <0.00005) and the measure of bronchodilator responsiveness (P value =0.016).

13.
A goal of association analysis is to determine whether variation in a particular candidate region or gene is associated with liability to complex disease. To evaluate such candidates, ubiquitous Single Nucleotide Polymorphisms (SNPs) are useful. It is critical, however, to select a set of SNPs that are in substantial linkage disequilibrium (LD) with all other polymorphisms in the region. Whether there is an ideal statistical framework to test such a set of ‘tag SNPs’ for association is unknown. Recent evidence suggests that tests for association based on linear combinations of the tag SNPs (Hotelling T² test) are more powerful than tests based on frequencies of haplotypes. Following this logical progression, we wondered whether single‐locus tests would prove generally more powerful than the regression‐based tests. We answer this question by investigating four inferential procedures: the maximum of a series of test statistics corrected for multiple testing by the Bonferroni procedure, TB, or by permutation of case‐control status, TP; a procedure that tests the maximum of a smoothed curve fitted to the series of test statistics, TS; and the Hotelling T² procedure, which we call TR. These procedures are evaluated by simulating data like that from human populations, including realistic levels of LD and realistic effects of alleles conferring liability to disease. We find that power depends on the correlation structure of SNPs within a gene, the density of tag SNPs, and the placement of the liability allele. The clearest pattern emerges between power and the number of SNPs selected. When a large fraction of the SNPs within a gene are tested, and multiple SNPs are highly correlated with the liability allele, TS has better power. Using a SNP selection scheme that optimizes power but also requires a substantial number of SNPs to be genotyped (roughly 10–20 SNPs per gene), power of TP is generally superior to that for the other procedures, including TR. Finally, when a SNP selection procedure that targets a minimal number of SNPs per gene is applied, the average performances of TP and TR are indistinguishable. Genet. Epidemiol. © 2005 Wiley‐Liss, Inc.
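The difference between the Bonferroni-corrected maximum (TB) and the permutation-corrected maximum (TP) described in this abstract can be illustrated with a small sketch. The per-SNP statistic used here, the absolute case-control difference in mean allele count, is a stand-in chosen for brevity and is an assumption; the abstract does not say which single-locus statistic the authors used. All function names are hypothetical.

```python
import random


def mean_diff_stats(genotypes, status):
    """Absolute case-control difference in mean allele count, one value per SNP.

    genotypes: list of rows, each a list of allele counts (0, 1, 2) per SNP.
    status: list of 1 (case) or 0 (control), aligned with the rows.
    """
    cases = [g for g, s in zip(genotypes, status) if s == 1]
    controls = [g for g, s in zip(genotypes, status) if s == 0]
    stats = []
    for j in range(len(genotypes[0])):
        case_mean = sum(row[j] for row in cases) / len(cases)
        control_mean = sum(row[j] for row in controls) / len(controls)
        stats.append(abs(case_mean - control_mean))
    return stats


def bonferroni_min_p(pvalues):
    """TB-style correction: smallest per-SNP p-value times the number of SNPs."""
    return min(1.0, len(pvalues) * min(pvalues))


def permutation_max_p(genotypes, status, n_perm=1000, seed=0):
    """TP-style correction: permute case-control labels, compare max statistics."""
    rng = random.Random(seed)
    observed = max(mean_diff_stats(genotypes, status))
    shuffled = list(status)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps the number of cases and controls fixed
        if max(mean_diff_stats(genotypes, shuffled)) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)
```

Because permuting labels leaves the correlation among SNPs intact, the permutation reference distribution reflects LD between SNPs, whereas the Bonferroni bound treats the tests as if they were unrelated.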

14.
15.
Variable selection is growing in importance with the advent of high throughput genotyping methods requiring analysis of hundreds to thousands of single nucleotide polymorphisms (SNPs) and the increased interest in using these genetic studies to better understand common, complex diseases. Up to now, the standard approach has been to analyze the genotypes for each SNP individually to look for an association with a disease. Alternatively, combinations of SNPs or haplotypes are analyzed for association. Another added complication in studying complex diseases or phenotypes is that genetic risk for the disease is often due to multiple SNPs in various locations on the chromosome with small individual effects that may have a collectively large effect on the phenotype. Hence, multi-locus SNP models, as opposed to single SNP models, may better capture the true underlying genotypic-phenotypic relationship. Thus, innovative methods for determining which SNPs to include in the model are needed. The goal of this article is to describe several methods currently available for variable and model selection using Bayesian approaches and to illustrate their application for genetic association studies using both real and simulated candidate gene data for a complex disease. In particular, Bayesian model averaging (BMA), stochastic search variable selection (SSVS), and Bayesian variable selection (BVS) using a reversible jump Markov chain Monte Carlo (MCMC) for candidate gene association studies are illustrated using a study of age-related macular degeneration (AMD) and simulated data.

16.
Recent advances in molecular genetic technology allow for detailed characterization of genetic variation and easy cost-efficient accumulation of such data, even for large human samples. One such advance that presents incredible opportunities for identifying associations between genetic polymorphisms and disease-related phenotypes is the ability to quickly type a large number of single-nucleotide polymorphisms (SNPs). Contributors to Group 10 of Genetic Analysis Workshop 14 explored the potential of SNP genotypes for the association mapping of disease-related genes in family-based studies. Using both real data involving alcoholism susceptibility, made available by the Collaborative Study on the Genetics of Alcoholism (COGA), and simulated data involving personality-disorder susceptibility, group members investigated specific methodological issues involved in association mapping, such as multiple testing, single SNPs vs. combinations and haplotypes, and the effect of linkage disequilibrium on SNP-based linkage; evaluated existing methodologies for association mapping using SNPs, short-tandem repeats (STRs), or a combination of the two; and introduced new or modified association-mapping methods, including a gamma random effects (GRE) model and the quantitative trait linkage disequilibrium (QTLD) test. These papers are unified by the application of association-based methods to analyze SNPs, microsatellite markers, or both, to identify chromosomal regions harboring genes that contribute to quantitative endophenotype variation, and thus to disease risk. Their diversity attests to the breadth and flexibility of association-mapping approaches to the genetics of complex disease.

17.
In a genome‐wide association study (GWAS), investigators typically focus their primary analysis on the direct (marginal) associations of each single nucleotide polymorphism (SNP) with the trait. Some SNPs that are truly associated with the trait may not be identified in this scan if they have a weak marginal effect and thus low power to be detected. However, these SNPs may be quite important in subgroups of the population defined by an environmental or personal factor, and may be detectable if such a factor is carefully considered in a gene–environment (G × E) interaction analysis. We address the question “Using a genome wide interaction scan (GWIS), can we find new genes that were not found in the primary GWAS scan?” We review commonly used approaches for conducting a GWIS in case‐control studies, and propose a new two‐step screening and testing method (EDG×E) that is optimized to find genes with a weak marginal effect. We simulate several scenarios in which our two‐step method provides 70–80% power to detect a disease locus while a marginal scan provides less than 5% power. We also provide simulations demonstrating that the EDG×E method outperforms other GWIS approaches (including case only and previously proposed two‐step methods) for finding genes with a weak marginal effect. Application of this method to a G × Sex scan for childhood asthma reveals two potentially interesting SNPs that were not identified in the marginal‐association scan. We distribute a new software program (G×Escan, available at http://biostats.usc.edu/software ) that implements this new method as well as several other GWIS approaches.

18.
Introduction: Genetic discoveries are validated through the meta‐analysis of genome‐wide association scans in large international consortia. Because environmental variables may interact with genetic factors, investigation of differing genetic effects for distinct levels of an environmental exposure in these large consortia may yield additional susceptibility loci undetected by main effects analysis. We describe a method of joint meta‐analysis (JMA) of SNP and SNP by Environment (SNP × E) regression coefficients for use in gene‐environment interaction studies. Methods: In testing SNP × E interactions, one approach uses a two degree of freedom test to identify genetic variants that influence the trait of interest. This approach detects both main and interaction effects between the trait and the SNP. We propose a method to jointly meta‐analyze the SNP and SNP × E coefficients using multivariate generalized least squares. This approach provides confidence intervals of the two estimates, a joint significance test for SNP and SNP × E terms, and a test of homogeneity across samples. Results: We present a simulation study comparing this method to four other methods of meta‐analysis and demonstrate that the JMA performs better than the others when both main and interaction effects are present. Additionally, we implemented our methods in a meta‐analysis of the association between SNPs from the type 2 diabetes‐associated gene PPARG and log‐transformed fasting insulin levels and interaction by body mass index in a combined sample of 19,466 individuals from five cohorts. Genet. Epidemiol. 35:11–18, 2011. © 2010 Wiley‐Liss, Inc.
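A joint meta-analysis of the SNP and SNP × E coefficients by generalized least squares can be sketched as follows. This is a minimal fixed-effects illustration under stated assumptions, not the authors' software: each cohort is assumed to report its two coefficient estimates and their 2×2 covariance matrix, the pooled estimate is the inverse-covariance-weighted combination, and the joint test is a 2 degree of freedom Wald test. The homogeneity test mentioned in the abstract is omitted. Function names are hypothetical.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def joint_meta_analysis(estimates, covariances):
    """Pool (beta_snp, beta_snp_x_e) across cohorts by fixed-effects GLS.

    estimates: list of (beta_snp, beta_snp_x_e) pairs, one per cohort.
    covariances: list of 2x2 covariance matrices, one per cohort.
    Returns (pooled estimates, pooled covariance, Wald statistic, p-value).
    """
    info = [[0.0, 0.0], [0.0, 0.0]]  # sum of inverse covariance matrices
    score = [0.0, 0.0]               # sum of inverse covariance times estimate
    for b, v in zip(estimates, covariances):
        w = inv2(v)
        for r in range(2):
            for c in range(2):
                info[r][c] += w[r][c]
            score[r] += w[r][0] * b[0] + w[r][1] * b[1]
    pooled_cov = inv2(info)
    pooled = [pooled_cov[r][0] * score[0] + pooled_cov[r][1] * score[1]
              for r in range(2)]
    # Wald statistic beta' * info * beta; info * beta equals score by construction.
    wald = pooled[0] * score[0] + pooled[1] * score[1]
    # Survival function of a chi-square with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-wald / 2.0)
    return pooled, pooled_cov, wald, p_value
```

Keeping the off-diagonal covariance between the two coefficients is the point of the joint analysis; pooling each coefficient separately would ignore their correlation within a cohort.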

19.
Here we summarize the contributions to Group 13 of the Genetic Analysis Workshop 15 held in St. Pete Beach, Florida, on November 12-14, 2006. The focus of this group was to identify candidate genes associated with rheumatoid arthritis or surrogate outcomes. The association methods proposed in this group were diverse, from better known approaches, such as logistic regression for single nucleotide polymorphism (SNP) analysis and haplotype sharing tests, to methods less familiar to genetic epidemiologists, such as machine learning and visualization methods. The majority of papers analyzed Genetic Analysis Workshop 15 Problems 2 (rheumatoid arthritis data) and 3 (simulated data). The highlights of this group's analyses were: (1) haplotype-based statistics can be more powerful than single SNP analysis for risk-locus localization; (2) considering linkage disequilibrium block structure in haplotype analysis may reduce the likelihood of false-positive results; and (3) visual representation of genetic models for continuous covariates may help identify SNPs associated with the underlying quantitative trait loci.

20.
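Objective: To investigate the association between polymorphisms in the tuberous sclerosis complex (TSC)-related genes TSC1 and TSC2 and childhood autism. Methods: Using the SNaPshot genotyping technique, eight tag SNPs in the TSC1 and TSC2 genes (rs3761840, rs2809244, rs1050700, rs739441, rs2074968, rs2074969, rs2072314, and rs8063461) were genotyped in 97 autism nuclear families; family-based haplotype analysis was performed with the FBAT and Haploview software. Results: 1) Family-based association analysis found that alleles of 2 of the 8 SNPs tended to be over-transmitted (rs1050700 A: Z=2.708, P=0.006769; rs2074968 G: Z=3.244, P=0.001180), and after FDR correction both SNPs still showed a significant association with autism (corrected P values of 0.027 and 0.014, respectively). 2) Haplotype A-C of rs3761840-rs2809244 showed significant transmission disequilibrium, being under-transmitted from parents to offspring (Z=-2.297, P=0.021629). Two haplotypes of rs2074968-rs2072314, G-C and C-C, both showed significant transmission disequilibrium: haplotype G-C was over-transmitted from parents to offspring (Z=2.596, P=0.009444), whereas haplotype C-C showed the opposite pattern (Z=-3.657, P=0.000256). Conclusion: The TSC1 and TSC2 genes may be associated with the occurrence of childhood autism.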
