Similar literature
Found 20 similar documents (search time: 15 ms)
1.
The trend test under the additive model is commonly used when a case-control genetic association study is carried out. However, for many complex diseases, the underlying genetic models are unknown and a mis-specification of the genetic model may result in a substantial loss of power. MAX3 has been proposed as an efficiency robust test against genetic model uncertainty which takes the maximum absolute value of the trend test statistics under the recessive, additive, and dominant models. Despite its popularity, little attention has been paid to the adjustment of covariates in this test, and existing approaches all depend on the estimators of parameters of interest, which may be seriously biased if the individuals are divided into a large number of partial tables stratified by covariates. In this article, we propose a modified MAX3 test based on the Mantel-Haenszel test (MHT). This new test avoids estimating the nuisance parameters induced by the covariates; thus, it is valid under both large and small numbers of partial tables while still enjoying the property of efficiency robustness. The asymptotic distribution of the test under the null hypothesis of no association is also derived; thus the corresponding asymptotic P-value of the statistic can be easily calculated. Besides, we prove that this new test can be equally derived through a conditional likelihood. As a result, the original MAX3 based on the trend tests or the matching trend tests can be treated as a special case and generally incorporated into the newly proposed test. Simulation results show that the modified MAX3 can keep the correct size under the null hypothesis and is more efficiency robust than any single MHT optimal for a specified genetic model under the alternative hypothesis. Two real examples corresponding to the large and small number of partial tables scenarios, respectively, are analyzed using the proposed method. A type 2 diabetes mellitus data set is also analyzed to evaluate the performance of the proposed test under the GWAS criteria.
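As a rough illustration of the original, unstratified MAX3 described in this abstract (not the Mantel-Haenszel-based modification the authors propose), the sketch below takes the largest absolute Cochran-Armitage trend statistic over the recessive, additive, and dominant scores. The function names, the genotype ordering (0, 1, 2 copies of the putative risk allele), and the counts are assumptions made for the example.

```python
import math

# Genotype scores (0, x, 1) for 0, 1, 2 copies of the putative risk allele.
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}

def trend_z(cases, controls, scores):
    """Signed Cochran-Armitage trend statistic for one 2x3 genotype table."""
    r, s = sum(cases), sum(controls)
    n = r + s
    totals = [a + b for a, b in zip(cases, controls)]
    # Score-weighted difference between case and control genotype counts.
    u = sum(x * (s * ri - r * si) for x, ri, si in zip(scores, cases, controls))
    sx = sum(x * t for x, t in zip(scores, totals))
    sxx = sum(x * x * t for x, t in zip(scores, totals))
    # Null variance of u, conditioning on the table margins (large-sample form).
    var_u = r * s * (n * sxx - sx * sx) / n
    return u / math.sqrt(var_u)

def max3(cases, controls):
    """Largest absolute trend statistic over the three genetic models."""
    return max(abs(trend_z(cases, controls, sc)) for sc in SCORES.values())

cases = [10, 20, 30]      # hypothetical counts for 0, 1, 2 risk alleles
controls = [30, 20, 10]
print(round(max3(cases, controls), 3))
```

Because the three statistics are correlated, MAX3 does not follow a standard normal distribution under the null; a P-value would have to come from the joint asymptotic distribution or from simulation, which this sketch does not attempt.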

2.
Inferring haplotypes from genotype data is commonly undertaken in population genetic association studies. Within such studies the importance of accounting for uncertainty in the inference of haplotypes is well recognised. We investigate the effectiveness of correcting for uncertainty using simple methods based on the output provided by the PHASE haplotype inference methodology. In case-control analyses investigating non-Hodgkin lymphoma and haplotypes associated with immune regulation we find little effect of making adjustment for uncertainty in inferred haplotypes. Using simulation we introduce a higher degree of haplotype uncertainty than was present in our study data. The simulation represents two genetic loci, physically close on a chromosome, forming haplotypes. Considering a range of allele frequencies, degrees of linkage between the loci, and frequency of missing genotype data, we detail the characteristics of genetic regions which may be susceptible to the influence of haplotype uncertainty. Within our evaluation we find that bias is avoided by considering haplotype probabilities or using multiple imputation, provided that for each of these methods haplotypes are inferred separately for case and control populations; furthermore using multiple imputation provides the facility to incorporate haplotype uncertainty in the estimation of confidence intervals. We discuss the implications of our findings within the context of the complexity of haplotype inference for larger marker rich regions as would typically be encountered in genetic analyses.

3.
Trend tests for genetic association using a matched case-control design are studied, which allows for a variable number of controls per case. However, the tests depend on the scores based on the underlying genetic model, thus it may result in substantial loss of power when the model is misspecified. Since the mode of inheritance may be unknown for complex diseases, robust trend tests in matched case-control studies are developed. Simulation is conducted to compare the trend tests and the robust trend tests under various genetic models. The results are applied to detect candidate-gene association using an example from a case-control aetiologic study of sarcoidosis.

4.
Meta-analysis of population-based genetic association studies is often challenged by obstacles associated with the underlying inheritance model. For a simple genetic variant with two alleles, a recessive, dominant or co-dominant model is typically assumed. In the absence of a strong biological rationale for a particular inheritance model, a recently suggested inheritance-model-free approach can be implemented. To enable a flexible choice among these models, summary results from each of the three genotypes are required. Incompatibility of the data across studies because of different inheritance models is a common problem. For instance, if the underlying model is dominant, studies that have assumed the recessive model and presented the results accordingly have so far been excluded from the meta-analysis. We show how to combine data and make inferences under any inheritance model, irrespective of the models assumed within each study and the way that data are presented. Within a Bayesian framework we describe prospective models for binary and continuous outcomes, and retrospective models for binary outcomes. The methods exploit an assumption of Hardy-Weinberg equilibrium, prior information about genotype prevalence or assumption of a specific inheritance model. On application to meta-analyses of the associations between a polymorphism in the lipoprotein lipase gene and coronary heart disease or high-density lipoprotein cholesterol, we observe substantial gains in precision when there is a large proportion of studies in which different inheritance models have been assumed.

5.
Most findings from genome‐wide association studies (GWAS) are consistent with a simple disease model at a single nucleotide polymorphism, in which each additional copy of the risk allele increases risk by the same multiplicative factor, in contrast to dominance or interaction effects. As others have noted, departures from this multiplicative model are difficult to detect. Here, we seek to quantify this both analytically and empirically. We show that imperfect linkage disequilibrium (LD) between causal and marker loci distorts disease models, with the power to detect such departures dropping off very quickly: decaying as a function of r⁴, where r² is the usual correlation between the causal and marker loci, in contrast to the well‐known result that power to detect a multiplicative effect decays as a function of r². We perform a simulation study with empirical patterns of LD to assess how this disease model distortion is likely to impact GWAS results. Among loci where association is detected, we observe that there is reasonable power to detect substantial deviations from the multiplicative model, such as for dominant and recessive models. Thus, it is worth explicitly testing for such deviations routinely. Genet. Epidemiol. 35: 278‐290, 2011. © 2011 Wiley‐Liss, Inc.
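To make the contrast between the two decay rates concrete, the sketch below assumes (as a simplification, not a result stated in the abstract) that the test's noncentrality parameter is proportional to sample size times r² for a multiplicative effect and to sample size times r⁴ for a departure from it. Under that assumption, the sample size needed at the marker to match the power at the causal locus grows by a factor of 1/r² or 1/r⁴. The function name is hypothetical.

```python
def sample_size_inflation(r2, departure=False):
    """Factor by which sample size must grow at a marker in imperfect LD.

    Assumes noncentrality scales with N * r2 for a multiplicative effect
    and with N * r2**2 (i.e. r^4) for a departure from that model.
    """
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must lie in (0, 1]")
    return 1.0 / (r2 ** 2 if departure else r2)

for r2 in (1.0, 0.8, 0.5):
    print(r2, sample_size_inflation(r2), sample_size_inflation(r2, departure=True))
```

At r² = 0.5, for example, the multiplicative effect needs roughly twice the sample size while a departure from it needs roughly four times as much.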

6.
Meta‐analyses of genetic association studies are usually performed using a single polymorphism at a time, even though in many cases the individual studies report results from partially overlapping sets of polymorphisms. We present here a multipoint (or multilocus) method for multivariate meta‐analysis of published population‐based case‐control association studies. The method is derived by extending the general method for multivariate meta‐analysis and allows for multivariate modelling of log(odds ratios (OR)) derived from several polymorphisms that are in linkage disequilibrium (LD). The method is presented in a genetic model‐free approach, although it can also be used by assuming a genetic model of inheritance beforehand. Furthermore, the method is presented in a unified framework and is easily applied to both discrete outcomes (using the OR), as well as to meta‐analyses of a continuous outcome (using the mean difference). The main innovation of the method is the analytical calculation of the within‐studies covariances between estimates derived from linked polymorphisms. The only requirement is that of an external estimate for the degree of pairwise LD between the polymorphisms under study, which can be obtained from the same published studies, from the literature or from HapMap. Thus, the method is quite simple and fast, it can be extended to an arbitrary set of polymorphisms and can be fitted in nearly all statistical packages (Stata, R/Splus and SAS). Applications in two already published meta‐analyses provide encouraging results concerning the robustness and the usefulness of the method and we expect that it would be widely used in the future. Genet. Epidemiol. 34: 702‐715, 2010. © 2010 Wiley‐Liss, Inc.

7.
Testing for the Hardy–Weinberg equilibrium (HWE) is often used as an initial step for checking the quality of genotyping. When testing the HWE for case‐control data, the impact of a potential genetic association between the marker and the disease must be controlled for; otherwise the results may be biased. Li and Li [2008] proposed a likelihood ratio test (LRT) that accounts for this potential genetic association and is more powerful than the commonly used control‐only χ² test. However, the LRT is not efficient when the marker is independent of the disease, and also requires numerical optimization to calculate the test statistic. In this article, we propose a novel shrinkage test for assessing the HWE. The proposed shrinkage test yields higher statistical power than the LRT when the marker is independent of or weakly associated with the disease, and converges to the LRT when the marker is strongly associated with the disease. In addition, the proposed shrinkage test has a closed form and can be easily used to test the HWE for large datasets that result from genome‐wide association studies. We compare the performance of the shrinkage test with existing methods using simulation studies, and apply the shrinkage test to a genome‐wide association dataset for Alzheimer's disease.
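For orientation, the control-only χ² test mentioned above as the common baseline can be sketched as follows. This is the textbook one-degree-of-freedom goodness-of-fit test applied to control genotype counts, not the shrinkage test or the LRT from the abstract; the function names and the counts are made up for the example.

```python
import math

def hwe_chisq(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit statistic for HWE at a bi-allelic marker."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # estimated frequency of allele A
    q = 1.0 - p
    # Raises ZeroDivisionError for a monomorphic marker (p or q equal to 0).
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chisq1_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2.0))

controls = (30, 40, 30)               # hypothetical AA, AB, BB counts
stat = hwe_chisq(*controls)
print(stat, chisq1_pvalue(stat))
```

Restricting the test to controls sidesteps distortion caused by a true association in cases, at the cost of discarding the case genotypes, which is the inefficiency the abstract's methods aim to reduce.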

8.
Many genetic analyses are done with incomplete information; for example, unknown phase in haplotype-based association studies. Measures of the amount of available information can be used for efficient planning of studies and/or analyses. In particular, the linkage disequilibrium (LD) between two sets of markers can be interpreted as the amount of information one set of markers contains for testing allele frequency differences in the second set, and measuring LD can be viewed as quantifying information in a missing data problem. We introduce a framework for measuring the association between two sets of variables; for example, genotype data for two distinct groups of markers, or haplotype and genotype data for a given set of polymorphisms. The goal is to quantify how much information is in one data set, e.g. genotype data for a set of SNPs, for estimating parameters that are functions of frequencies in the second data set, e.g. haplotype frequencies, relative to the ideal case of actually observing the complete data, e.g. haplotypes. In the case of genotype data on two mutually exclusive sets of markers, the measure determines the amount of multi-locus LD, and is equal to the classical measure r² if each set consists of one bi-allelic marker. In general, the measures are interpreted as the asymptotic ratio of sample sizes necessary to achieve the same power in case-control testing. The focus of this paper is on case-control allele/haplotype tests, but the framework can be extended easily to other settings like regressing quantitative traits on allele/haplotype counts, or tests on genotypes or diplotypes. We highlight applications of the approach, including tools for navigating the HapMap database [The International HapMap Consortium, 2003], and genotyping strategies for positional cloning studies.

9.
Population‐based case‐control design has become one of the most popular approaches for conducting genome‐wide association scans for rare diseases like cancer. In this article, we propose a novel method for improving the power of the widely used single‐SNP (single‐nucleotide polymorphism) two‐degrees‐of‐freedom (2 d.f.) association test for case‐control studies by exploiting the common assumption of Hardy‐Weinberg Equilibrium (HWE) for the underlying population. A key feature of the method is that it can relax the assumed model constraints via a completely data‐adaptive shrinkage estimation approach so that the number of false‐positive results due to departure from HWE is controlled. The method is computationally simple and is easily scalable to association tests involving hundreds of thousands or millions of genetic markers. Simulation studies as well as an application involving data from a real genome‐wide association study illustrate that the proposed method is very robust for large‐scale association studies and can improve the power for detecting susceptibility SNPs with recessive effects, when compared to existing methods. Implications of the general estimation strategy beyond the simple 2 d.f. association test are discussed. Genet. Epidemiol. 33:740–750, 2009. Published 2009 Wiley‐Liss, Inc.

10.
Genome-wide association studies (GWAS) have been successful in their search for common genetic variants associated with complex traits and diseases. With new advances in array technologies together with available genetic reference sets, the next generation of GWAS will extend the search for associations with uncommon SNPs (1% ≤ MAF ≤ 10%). Two possible approaches are genotyping all participants, a prohibitively expensive option for large GWAS, or using a combination of genotyping and imputation. Here, we consider a two-platform method that genotypes all participants on a standard genotyping array, designed to identify common variants, and then supplements that data by genotyping only a small proportion of the participants on a platform that has higher coverage for uncommon SNPs. This subset of the study population is then included as part of the imputation reference set. To demonstrate the use of this two-platform design, we evaluate its potential efficiency using a newly available dataset containing 756 individuals genotyped on both the Illumina Human OmniExpress and Omni2.5 Quad. Although genotyping all individuals on the denser array would be ideal, we find that genotyping only 100 individuals on this array, in combination with imputation, leads to only a modest loss of power for detecting associations. However, the loss of power due to imputation can be more substantial if the relative risks for rare variants are significantly larger than those previously observed for common variants.

11.
With varying, but substantial, proportions of heritability remaining unexplained by summaries of single‐SNP genetic variation, there is a demand for methods that extract maximal information from genetic association studies. One source of variation that is difficult to assess is genetic interactions. A major challenge for naive detection methods is the large number of possible combinations, with a requisite need to correct for multiple testing. Assumptions of large marginal effects, to reduce the search space, may be restrictive and miss higher order interactions with modest marginal effects. In this paper, we propose a new procedure for detecting gene‐by‐gene interactions through heterogeneity in estimated low‐order (e.g., marginal) effect sizes by leveraging population structure, or ancestral differences, among studies in which the same phenotypes were measured. We implement this approach in a meta‐analytic framework, which offers numerous advantages, such as robustness and computational efficiency, and is necessary when data‐sharing limitations restrict joint analysis. We effectively apply a dimension reduction procedure that scales to allow searches for higher order interactions. For comparison to our method, which we term phylogenY‐aware Effect‐size Tests for Interactions (YETI), we adapt an existing method that assumes interacting loci will exhibit strong marginal effects to our meta‐analytic framework. As expected, YETI excels when multiple studies are from highly differentiated populations and maintains its superiority in these conditions even when marginal effects are small. When these conditions are less extreme, the advantage of our method wanes. We assess the Type‐I error and power characteristics of complementary approaches to evaluate their strengths and limitations.

12.
Most previous sample size calculations for case-control studies to detect genetic associations with disease assumed that the disease gene locus is known, whereas, in fact, markers are used. We calculated sample sizes for unmatched case-control and sibling case-control studies to detect an association between a biallelic marker and a disease governed by a putative biallelic disease locus. Required sample sizes increase with increasing discrepancy between the marker and disease allele frequencies, and with less-than-maximal linkage disequilibrium between the marker and disease alleles. Qualitatively similar results were found for studies of parent-offspring triads based on the transmission disequilibrium test (Abel and Müller-Myhsok, 1998, Am. J. Hum. Genet. 63:664-667; Tu and Whittemore, 1999, Am. J. Hum. Genet. 64:641-649). We also studied other factors affecting required sample size, including attributable risk for the disease allele, inheritance mechanism, disease prevalence, and for sibling case-control designs, extragenetic familial aggregation of disease and recombination. The large sample-size requirements represent a formidable challenge to studies of this type.

13.
Population-based case-control studies measuring associations between haplotypes of single nucleotide polymorphisms (SNPs) are increasingly popular, in part because haplotypes of a few "tagging" SNPs may serve as surrogates for variation in relatively large sections of the genome. Due to current technological limitations, haplotypes in cases and controls must be inferred from unphased genotypic data. Using individual-specific inferred haplotypes as covariates in standard epidemiologic analyses (e.g., conditional logistic regression) is an attractive analysis strategy, as it allows adjustment for nongenetic covariates, provides omnibus and haplotype-specific tests of association, and can estimate haplotype and haplotype x environment interaction effects. In principle, some adjustment for the uncertainty in inferred haplotypes should be made. Via simulation, we compare the performance (bias and mean squared error of haplotype and haplotype x environment interaction effect estimates) of several analytic strategies using inferred haplotypes in the context of matched case-control data. These strategies include using only the most likely haplotype assignment, the expectation substitution approach described by Stram et al. ([2003b] Hum. Hered. 55:179-190) and others, and an improper version of multiple imputation. For relatively uncomplicated haplotype structures and moderate haplotype relative risks (/=5). An application to progesterone-receptor haplotypes and endometrial cancer further illustrates that the performance of all these methods depends on how well the observed haplotypes "tag" the unobserved causal variant.

14.
Meta-analysis has become a key component of well-designed genetic association studies due to the boost in statistical power achieved by combining results across multiple samples of individuals and the need to validate observed associations in independent studies. Meta-analyses of genetic association studies based on multiple SNPs and traits are subject to the same multiple testing issues as single-sample studies, but it is often difficult to adjust accurately for the multiple tests. Procedures such as Bonferroni may control the type-I error rate but will generally provide an overly harsh correction if SNPs or traits are correlated. Depending on study design, availability of individual-level data, and computational requirements, permutation testing may not be feasible in a meta-analysis framework. In this article, we present methods for adjusting for multiple correlated tests under several study designs commonly employed in meta-analyses of genetic association tests. Our methods are applicable to both prospective meta-analyses in which several samples of individuals are analyzed with the intent to combine results, and retrospective meta-analyses, in which results from published studies are combined, including situations in which (1) individual-level data are unavailable, and (2) different sets of SNPs are genotyped in different studies due to random missingness or two-stage design. We show through simulation that our methods accurately control the rate of type I error and achieve improved power over multiple testing adjustments that do not account for correlation between SNPs or traits.

15.
Genome‐wide association studies (GWAS) are now routinely imputed for untyped single nucleotide polymorphisms (SNPs) based on various powerful statistical algorithms for imputation trained on reference datasets. The use of predicted allele counts for imputed SNPs as the dosage variable is known to produce a valid score test for genetic association. In this paper, we investigate how to best handle imputed SNPs in various modern complex tests for genetic associations incorporating gene–environment interactions. We focus on case‐control association studies where inference for an underlying logistic regression model can be performed using alternative methods that rely to varying degrees on an assumption of gene–environment independence in the underlying population. As increasingly large‐scale GWAS are being performed through consortium efforts where it is preferable to share only summary‐level information across studies, we also describe simple mechanisms for implementing score tests based on standard meta‐analysis of “one‐step” maximum‐likelihood estimates across studies. Applications of the methods in simulation studies and a dataset from a GWAS of lung cancer illustrate the ability of the proposed methods to maintain type‐I error rates for the underlying testing procedures. For analysis of imputed SNPs, similar to typed SNPs, the retrospective methods can lead to considerable efficiency gain for modeling of gene–environment interactions under the assumption of gene–environment independence. Methods are made available for public use through the CGEN R software package.

16.
The associations between haplotypes and disease phenotypes offer valuable clues about the genetic determinants of complex diseases. It is highly challenging to make statistical inferences about these associations because of the unknown gametic phase in genotype data. We describe a general likelihood-based approach to inferring haplotype-disease associations in studies of unrelated individuals. We consider all possible phenotypes (including disease indicator, quantitative trait, and potentially censored age at onset of disease) and all commonly used study designs (including cross-sectional, case-control, cohort, nested case-control, and case-cohort). The effects of haplotypes on phenotype are characterized by appropriate regression models, which allow various genetic mechanisms and gene-environment interactions. We present the likelihood functions for all study designs and disease phenotypes under Hardy-Weinberg disequilibrium. The corresponding maximum likelihood estimators are approximately unbiased, normally distributed, and statistically efficient. We provide simple and efficient numerical algorithms to calculate the maximum likelihood estimators and their variances, and implement these algorithms in a freely available computer program. Extensive simulation studies demonstrate that the proposed methods perform well in realistic situations. An application to the Carolina Breast Cancer Study reveals significant haplotype effects and haplotype-smoking interactions in the development of breast cancer.

17.
Zheng G. Statistics in Medicine, 2003, 22(16): 2657-2666
In case-control studies, the Cochran-Armitage (CA) trend test is powerful for detection of an association between a risk allele and a marker. To apply this test, a score should be assigned to the genotypes based on the genetic model. When the underlying genetic model is unknown, the trend test statistic is a function of the score. In this paper, simple procedures are given to obtain two scores (max and min), which respectively maximize and minimize the CA trend test statistics for genetic associations. These two scores can be used to examine the effect of the choice of scores on the test of no association. When the CA trend test statistic with the max (or min) score is less (or greater) than a prespecified value, the conclusion is clear: we will accept (or reject) the null hypothesis of no association for any scores used. When this value is less than the CA trend test statistic with the max score but greater than the one with the min score, the decision of whether or not to reject the null hypothesis depends on the choice of scores. In this situation, the CA trend test with a prespecified score cannot be used without careful scientific justification of the choice of scores. The use of max and min scoring schemes is applied to a real data set.
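The dependence of the CA trend statistic on the score can be written out explicitly. The notation below is an assumption chosen for illustration and is not taken from the paper: r_i and s_i are the case and control counts for genotypes carrying i = 0, 1, 2 copies of the risk allele, n_i = r_i + s_i, r and s are the total numbers of cases and controls, n = r + s, and the genotype scores are (0, x, 1).

```latex
Z(x) = \frac{\sqrt{n}\,\bigl[x\,(s r_1 - r s_1) + (s r_2 - r s_2)\bigr]}
            {\sqrt{r s \,\bigl[n\,(x^2 n_1 + n_2) - (x n_1 + n_2)^2\bigr]}},
\qquad 0 \le x \le 1 .
```

In this form x = 0, 1/2, and 1 correspond to the recessive, additive, and dominant models, and the max and min scores described in the abstract are the values of x at which the statistic is largest and smallest.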

18.
Chen HY, Li M. Genetic Epidemiology, 2011, 35(8): 823-830
Extreme-value sampling design that samples subjects with extremely large or small quantitative trait values is commonly used in genetic association studies. Samples in such designs are often treated as "cases" and "controls" and analyzed using logistic regression. Such a case-control analysis ignores the potential dose-response relationship between the quantitative trait and the underlying trait locus and thus may lead to loss of power in detecting genetic association. An alternative approach to analyzing such data is to model the dose-response relationship by a linear regression model. However, parameter estimation from this model can be biased, which may lead to inflated type I errors. We propose a robust and efficient approach that takes into consideration both the biased sampling design and the potential dose-response relationship. Extensive simulations demonstrate that the proposed method is more powerful than the traditional logistic regression analysis and is more robust than the linear regression analysis. We applied our method to the analysis of a candidate gene association study on high-density lipoprotein cholesterol (HDL-C) which includes study subjects with extremely high or low HDL-C levels. Using our method, we identified several SNPs showing stronger evidence of association with HDL-C than the traditional case-control logistic regression analysis. Our results suggest that it is important to appropriately model the quantitative traits and to adjust for the biased sampling when a dose-response relationship exists in extreme-value sampling designs.

19.
Meta‐analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta‐analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case–control ratios. Here, we investigate the power loss problem of the standard meta‐analysis methods for unbalanced studies, and further propose novel meta‐analysis methods performing equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta‐score‐statistics that can accurately approximate the joint‐score‐statistics with combined individual‐level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene‐level association studies under unbalanced settings, our method recovered up to 85% power loss caused by the standard methods. We further showed the power gain of our methods in gene‐level tests with 26 unbalanced studies of age‐related macular degeneration. In addition, we took the meta‐analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta‐analyzing multi‐ethnic samples. In summary, our improved meta‐score‐statistics with corrections for population stratification can be used to construct both single‐variant and gene‐level association studies, providing a useful framework for ensuring well‐powered, convenient, cross‐study analyses.
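As a point of reference for what a standard summary-level meta-analysis does, the sketch below pools per-study effect estimates with inverse-variance (fixed-effect) weights. This is a generic textbook baseline, assumed here for illustration; it is neither the improved meta-score statistic proposed in the abstract nor necessarily the exact standard method the authors compare against. The function name and input values are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooled estimate, standard error, and Z."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / (se * se) for se in ses]   # precision of each study
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se

# Hypothetical per-study log odds ratios and their standard errors.
beta, se, z = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(round(beta, 3), round(se, 4), round(z, 2))
```

Only per-study summaries enter the calculation, which is what makes this style of analysis convenient when individual-level data cannot be shared.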

20.
In genetic association studies it is becoming increasingly imperative to have large sample sizes to identify and replicate genetic effects. To achieve these sample sizes, many research initiatives are encouraging the collaboration and combination of several existing matched and unmatched case–control studies. Thus, it is becoming more common to compare multiple sets of controls with the same case group or multiple case groups to validate or confirm a positive or negative finding. Usually, a naive approach of fitting separate models for each case–control comparison is used to make inference about disease–exposure association. However, this approach does not make use of all the observed data and hence could lead to inconsistent results. The problem is compounded when a common case group is used in each case–control comparison. An alternative to fitting separate models is to use a polytomous logistic model, but this model does not combine matched and unmatched case–control data. Thus, we propose a polytomous logistic regression approach based on a latent group indicator and a conditional likelihood to do a combined analysis of matched and unmatched case–control data. We use simulation studies to evaluate the performance of the proposed method and a case–control study of multiple myeloma and interleukin‐6 as an example. Our results indicate that the proposed method leads to a more efficient homogeneity test and a pooled estimate with smaller standard error. Copyright © 2010 John Wiley & Sons, Ltd.
