Similar articles
 Found 20 similar articles (search time: 109 ms)
1.
With the development of sequencing technologies, the direct testing of rare variant associations has become possible. Many statistical methods for detecting associations between rare variants and complex diseases have recently been developed, most of which are population‐based methods for unrelated individuals. A limitation of population‐based methods is that spurious associations can occur when there is a population structure. For rare variants, this problem can be more serious, because the spectrum of rare variation can differ greatly across populations and because no methods currently exist to control for population stratification in population‐based rare variant association tests. A solution to the problem of population stratification is to use family‐based association tests, which use family members to control for population stratification. In this article, we propose a novel test for Testing the Optimally Weighted combination of variants based on data of Parents and Affected Children (TOW‐PAC). TOW‐PAC is a family‐based association test that tests the combined effect of rare and common variants in a genomic region, and is robust to the directions of the effects of causal variants. Simulation studies confirm that, for rare variant associations, family‐based association tests are robust to population stratification, whereas population‐based association tests can be seriously confounded by it. The results of power comparisons show that the power of TOW‐PAC increases as the number of affected children in each family increases, and that TOW‐PAC based on multiple affected children per family is more powerful than TOW based on unrelated individuals.

2.
Next-generation sequencing technology will soon allow sequencing the whole genome of large groups of individuals, and thus will make directly testing rare variants possible. Currently, most existing methods for rare variant association studies essentially test the effect of a weighted combination of variants, using different weighting schemes. The performance of these methods depends on the weights being used, and no optimal weights are available. By putting large weights on rare variants and small weights on common variants, these methods target rare variants only, although increasing evidence shows that complex diseases are caused by both common and rare variants. In this paper, we analytically derive optimal weights under a certain criterion. Based on the optimal weights, we propose a Variable Weight Test for testing the effect of an Optimally Weighted combination of variants (VW-TOW). VW-TOW aims to test the effects of both rare and common variants. VW-TOW is applicable to both quantitative and qualitative traits, allows covariates, can control for population stratification, and is robust to directions of effects of causal variants. Extensive simulation studies and application to the Genetic Analysis Workshop 17 (GAW17) data show that VW-TOW is more powerful than existing methods either for testing effects of both rare and common variants or for testing effects of rare variants only.
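The "weighted combination of variants" that this abstract refers to can be illustrated with a minimal sketch. The weighting scheme below (inverse standard deviation of the allele frequency, which up-weights rarer variants) is one commonly used choice and is an assumption for illustration only; it is not the optimal weight derived in the paper. The function names `allele_frequency`, `inverse_sd_weights`, and `burden_scores` are hypothetical.

```python
import math

def allele_frequency(column):
    """Allele frequency of one variant; genotypes are allele counts 0, 1 or 2."""
    return sum(column) / (2.0 * len(column))

def inverse_sd_weights(genotypes):
    """One weight per variant: 1 / sqrt(p * (1 - p)), so rarer variants weigh more.

    genotypes is a list of rows (one per individual), each a list of allele counts.
    Monomorphic variants (p == 0 or p == 1) get weight 0 to avoid division by zero.
    """
    n_variants = len(genotypes[0])
    weights = []
    for m in range(n_variants):
        p = allele_frequency([row[m] for row in genotypes])
        if p <= 0.0 or p >= 1.0:
            weights.append(0.0)
        else:
            weights.append(1.0 / math.sqrt(p * (1.0 - p)))
    return weights

def burden_scores(genotypes, weights):
    """Collapse each individual's genotypes into a single weighted score."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

# Toy data: 4 individuals, 2 variants (variant 0 is rarer than variant 1).
toy_genotypes = [[0, 1], [0, 1], [1, 0], [0, 0]]
toy_weights = inverse_sd_weights(toy_genotypes)
toy_scores = burden_scores(toy_genotypes, toy_weights)
```

The per-individual scores would then be tested for association with the trait (for example by regression); that step is omitted here.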

3.
Population stratification has long been recognized as an issue in genetic association studies because unrecognized population stratification can lead to both false‐positive and false‐negative findings and can obscure true association signals if not appropriately corrected. This issue can be even worse in rare variant association analyses because rare variants often demonstrate stronger and potentially different patterns of stratification than common variants. To correct for population stratification in genetic association studies, we proposed a novel method to Test the effect of an Optimally Weighted combination of variants in Admixed populations (TOWA) in which the analytically derived optimal weights can be calculated from existing phenotype and genotype data. TOWA up‐weights rare variants and those variants that have strong associations with the phenotype. Additionally, it can adjust for the direction of the association, and allows for local ancestry differences among study subjects. Extensive simulations show that the type I error rate of TOWA is under control in the presence of population stratification and that it is more powerful than existing methods. We have also applied TOWA to a real sequencing data set. Our simulation studies as well as real data analysis results indicate that TOWA is a useful tool for rare variant association analyses in admixed populations.

4.
Next generation sequencing technology has enabled the paradigm shift in genetic association studies from the common disease/common variant to the common disease/rare‐variant hypothesis. Analyzing individual rare variants is known to be underpowered; therefore association methods have been developed that aggregate variants across a genetic region, which for exome sequencing is usually a gene. The foreseeable widespread use of whole genome sequencing poses new challenges in statistical analysis. It calls for new rare‐variant association methods that are statistically powerful, robust against high levels of noise due to inclusion of noncausal variants, and yet computationally efficient. We propose a simple and powerful statistic that combines the disease‐associated P‐values of individual variants using a weight that is the inverse of the expected standard deviation of the allele frequencies under the null. This approach, dubbed the Sigma‐P method, is extremely robust to the inclusion of a high proportion of noncausal variants and is also powerful when both detrimental and protective variants are present within a genetic region. The performance of the Sigma‐P method was tested using simulated data based on realistic population demographic and disease models and its power was compared to several previously published methods. The results demonstrate that this method generally outperforms other rare‐variant association methods over a wide range of models. Additionally, sequence data on the ANGPTL family of genes from the Dallas Heart Study were tested for associations with nine metabolic traits and both known and novel putative associations were uncovered using the Sigma‐P method.

5.
Advances in exome sequencing and the development of exome genotyping arrays are enabling explorations of association between rare coding variants and complex traits. To ensure power for these rare variant analyses, a variety of association tests that group variants by gene or functional unit have been proposed. Here, we extend these tests to family‐based studies. We develop family‐based burden tests, variable frequency threshold tests and sequence kernel association tests. Through simulations, we compare the performance of different tests. We describe situations where family‐based studies provide greater power than studies of unrelated individuals to detect rare variants associated with moderate to large changes in trait values. Broadly speaking, we find that when sample sizes are limited and only a modest fraction of all trait‐associated variants can be identified, family samples are more powerful. Finally, we illustrate our approach by analyzing the relationship between coding variants and levels of high‐density lipoprotein (HDL) cholesterol in 11,556 individuals from the HUNT and SardiNIA studies, demonstrating association for coding variants in the APOC3, CETP, LIPC, LIPG, and LPL genes and illustrating the value of family samples, meta‐analysis, and gene‐level tests. Our methods are implemented in freely available C++ code.

6.
Many association tests have been proposed for rare variants, but the choice of a powerful test is uncertain when there is limited information on the underlying genetic model. Proposed methods use either linear statistics, which are powerful when most variants are causal and have the same direction of effect, or quadratic statistics, which are more powerful in other scenarios. To achieve robustness, it is natural to combine the evidence of association from two or more complementary tests. To this end, we consider the minimum‐p and Fisher's methods of combining P‐values from linear and quadratic statistics. Extensive simulation studies show that both methods are robust across models with varying proportions of causal, deleterious, and protective rare variants, allele frequencies, and effect sizes. When the majority (>75%) of the causal effects are in the same direction (deleterious or protective), Fisher's method consistently outperforms the minimum‐p and the individual linear and quadratic tests, as well as the optimal sequence kernel association test, SKAT‐O. When the individual test has moderate power, Fisher's test has improved power for 90% of the ~5000 models considered, with >20% relative efficiency gain for 40% of the models. The maximum absolute power loss is 8% for the remaining 10% of the models. An application to the GAW17 quantitative trait Q2 data based on sequence data of the 1000 Genomes Project shows that, compared with linear and quadratic tests, Fisher's test has comparable power for all 13 functional genes and provides the best power for more than half of them.
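The two combination rules named in this abstract can be sketched in a few lines. This is a textbook version that assumes the combined P-values are independent; the linear and quadratic statistics in the paper are computed on the same data and may be correlated, so the authors' null calibration may well differ (that detail is not given in the abstract). The function names `fisher_combine` and `min_p_combine` are hypothetical.

```python
import math

def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(ln p_i), chi-square with 2k df if independent.

    For even degrees of freedom 2k the chi-square upper tail has the closed form
    exp(-x/2) * sum_{j<k} (x/2)^j / j!, so no external library is needed.
    Returns (statistic, combined P-value).
    """
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    half = stat / 2.0
    tail = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))
    return stat, tail

def min_p_combine(pvalues):
    """Minimum-p method: under independence, P(min p_i <= m) = 1 - (1 - m)^k."""
    k = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** k

# Example: P-values from a (hypothetical) linear test and quadratic test.
p_linear, p_quadratic = 0.05, 0.05
fisher_stat, fisher_p = fisher_combine([p_linear, p_quadratic])
minp_p = min_p_combine([p_linear, p_quadratic])
```

With two moderately small P-values pointing the same way, Fisher's rule rewards the agreement, whereas the minimum-p rule only looks at the smaller value and pays a multiplicity penalty for having taken the minimum.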

7.
Most rare‐variant association tests for complex traits are applicable only to population‐based or case‐control resequencing studies. There are fewer rare‐variant association tests for family‐based resequencing studies, which is unfortunate because pedigrees possess many attractive characteristics for such analyses. Family‐based studies can be more powerful than their population‐based counterparts due to increased genetic load and further enable the implementation of rare‐variant association tests that, by design, are robust to confounding due to population stratification. With this in mind, we propose a rare‐variant association test for quantitative traits in families; this test integrates the QTDT approach of Abecasis et al. [Abecasis et al., 2000a] into the kernel‐based SNP association test KMFAM of Schifano et al. [Schifano et al., 2012]. The resulting within‐family test enjoys the many benefits of the kernel framework for rare‐variant association testing, including rapid evaluation of P‐values and preservation of power when a region harbors rare causal variation that acts in different directions on phenotype. Additionally, by design, this within‐family test is robust to confounding due to population stratification. Although within‐family association tests are generally less powerful than their counterparts that use all genetic information, we show that we can recover much of this power (although still ensuring robustness to population stratification) using a straightforward screening procedure. Our method accommodates covariates and allows for missing parental genotype data, and we have written software implementing the approach in R for public use.

8.
Recent advances in next-generation sequencing technologies facilitate the detection of rare variants, making it possible to uncover the roles of rare variants in complex diseases. As any single rare variant contains little variation, association analysis of rare variants requires statistical methods that can effectively combine the information across variants and estimate their overall effect. In this study, we propose a novel Bayesian generalized linear model for analyzing multiple rare variants within a gene or genomic region in genetic association studies. Our model can deal with complicated situations that have not been fully addressed by existing methods, including issues of disparate effects and nonfunctional variants. Our method jointly models the overall effect and the weights of multiple rare variants and estimates them from the data. This approach assigns different weights to different variants based on their contributions to the phenotype, yielding an effective summary of the information across variants. We evaluate the proposed method and compare its performance to existing methods on extensive simulated data. The results show that the proposed method performs well under all situations and is more powerful than existing approaches.

9.
Several methods have been proposed to increase power in rare variant association testing by aggregating information from individual rare variants (MAF < 0.005). However, how to best combine rare variants across multiple ethnicities and the relative performance of designs using different ethnic sampling fractions remain unknown. In this study, we compare the performance of several statistical approaches for assessing rare variant associations across multiple ethnicities. We also explore how different ethnic sampling fractions perform, including single‐ethnicity studies and studies that sample up to four ethnicities. We conducted simulations based on targeted sequencing data from 4,611 women in four ethnicities (African, European, Japanese American, and Latina). As with single‐ethnicity studies, burden tests had greater power when all causal rare variants were deleterious, and variance component‐based tests had greater power when some causal rare variants were deleterious and some were protective. Multiethnic studies had greater power than single‐ethnicity studies at many loci, with inclusion of African Americans providing the largest impact. On average, studies including African Americans had as much as 20% greater power than equivalently sized studies without African Americans. This suggests that association studies between rare variants and complex disease should consider including subjects from multiple ethnicities, with preference given to genetically diverse groups.

10.
Family‐based designs have been repeatedly shown to be powerful in detecting the significant rare variants associated with human diseases. Furthermore, human diseases are often defined by the outcomes of multiple phenotypes, and thus we expect multivariate family‐based analyses may be very efficient in detecting associations with rare variants. However, few statistical methods implementing this strategy have been developed for family‐based designs. In this report, we describe one such implementation: the multivariate family‐based rare variant association tool (mFARVAT). mFARVAT is a quasi‐likelihood‐based score test for rare variant association analysis with multiple phenotypes, and tests both homogeneous and heterogeneous effects of each variant on multiple phenotypes. Simulation results show that the proposed method is generally robust and efficient for various disease models, and we identify some promising candidate genes associated with chronic obstructive pulmonary disease. The software of mFARVAT is freely available at http://healthstat.snu.ac.kr/software/mfarvat/, implemented in C++ and supported on Linux and MS Windows.

11.
Both genome-wide association study and next-generation sequencing data analyses are widely employed to identify common and/or rare genetic variants that confer disease susceptibility. Rare variants generally have large effects, though they are hard to detect due to their low frequencies. Currently, many existing statistical methods for rare variant association studies employ a weighted combination scheme, which usually uses subjective or suboptimal weights based on ad hoc assumptions (e.g., ignoring dependence between rare variants). In this study, we analytically derived optimal weights for both common and rare variants and proposed a general and novel approach to test the association between an optimally weighted combination of variants (G-TOW) in a gene or pathway and a continuous or dichotomous trait while easily adjusting for covariates. Results of the simulation studies show that G-TOW has properly controlled type I error rates and that it is the most powerful test among the methods we compared when testing effects of either both rare and common variants or rare variants only. We also illustrate the effectiveness of G-TOW using the Genetic Analysis Workshop 17 (GAW17) data. Additionally, we applied G-TOW and other competitive methods to test disease-associated genes in real data of schizophrenia. G-TOW successfully verified the genes FYN and VPS39, which have been reported in existing publications to be associated with schizophrenia. Both of these genes are missed by the weighted sum statistic and the sequence kernel association test. The simulation study and real data analysis indicate that G-TOW is a powerful test.

12.
Confounding due to population substructure is always a concern in genetic association studies. Although methods have been proposed to adjust for population stratification in the context of common variation, it is unclear how well these approaches will work when interrogating rare variation. Family‐based association tests can be constructed that are robust to population stratification. For example, when considering a quantitative trait, a linear model can be used that decomposes genetic effects into between‐ and within‐family components and a test of the within‐family component is robust to population stratification. However, this within‐family test ignores between‐family information potentially leading to a loss of power. Here, we propose a family‐based two‐stage rare‐variant test for quantitative traits. We first construct a weight for each variant within a gene, or other genetic unit, based on score tests of between‐family effect parameters. These weights are then used to combine variants using score tests of within‐family effect parameters. Because the between‐family and within‐family tests are orthogonal under the null hypothesis, this two‐stage approach can increase power while still maintaining validity. Using simulation, we show that this two‐stage test can significantly improve power while correctly maintaining type I error. We further show that the two‐stage approach maintains the robustness to population stratification of the within‐family test and we illustrate this using simulations reflecting samples composed of continental and closely related subpopulations.
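The between‐/within‐family decomposition mentioned in this abstract can be illustrated with a small sketch. It assumes sibship data and takes the sibship mean genotype as the between‐family component, which is one common choice; the abstract does not specify the exact definition used by the authors, and other definitions exist (for example, ones based on parental genotypes). The function name `decompose_genotypes` is hypothetical.

```python
def decompose_genotypes(families):
    """Split each genotype into a between-family and a within-family part.

    families maps a family id to the list of its members' allele counts (0/1/2)
    at one variant. For each family the between component b is the family mean,
    and each member's within component is genotype - b, so g = b + w.
    Returns {family_id: (b, [w_1, w_2, ...])}.
    """
    result = {}
    for family_id, genotypes in families.items():
        between = sum(genotypes) / len(genotypes)
        within = [g - between for g in genotypes]
        result[family_id] = (between, within)
    return result

# Two sibships from (hypothetically) different subpopulations: family B carries
# one more copy of the allele on average, but the within-family deviations match.
toy_families = {"A": [0, 1], "B": [1, 2]}
toy_components = decompose_genotypes(toy_families)
```

Because the within‐family deviations sum to zero in every family, differences in allele frequency between subpopulations are absorbed by the between component, which is why a test on the within component is not confounded by stratification.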

13.
Family‐based designs enriched with affected subjects and disease associated variants can increase statistical power for identifying functional rare variants. However, few rare variant analysis approaches are available for time‐to‐event traits in family designs, and none of them is applicable to the X chromosome. We developed novel pedigree‐based burden and kernel association tests for time‐to‐event outcomes with right censoring for pedigree data, referred to as FamRATS (family‐based rare variant association tests for survival traits). Cox proportional hazard models were employed to relate a time‐to‐event trait with rare variants with flexibility to encompass all ranges and collapsing of multiple variants. In addition, robustness to violations of the proportional hazards assumption was investigated for the proposed tests and four existing tests, including the conventional population‐based Cox proportional model and the burden, kernel, and sum of squares statistic (SSQ) tests for family data. The proposed tests can be applied to large‐scale whole‐genome sequencing data. They are appropriate for practical use under a wide range of misspecified Cox models, as well as for population‐based, pedigree‐based, or hybrid designs. In our extensive simulation study and data example, we showed that the proposed kernel test is the most powerful and robust choice among the proposed burden test and the four existing rare variant survival association tests. When applied to the Diabetes Heart Study, the proposed tests found that exome variants of the JAK1 gene on chromosome 1 showed the most significant association with age at onset of type 2 diabetes in the exome‐wide analysis.

14.
Rare variant studies are now being used to characterize the genetic diversity between individuals and may help to identify substantial amounts of the genetic variation of complex diseases and quantitative phenotypes. Family data have been shown to be powerful to interrogate rare variants. Consequently, several rare variants association tests have been recently developed for family‐based designs, but typically, these assume the normality of the quantitative phenotypes. In this paper, we present a family‐based test for rare‐variants association in the presence of non‐normal quantitative phenotypes. The proposed model relaxes the normality assumption and does not specify any parametric distribution for the marginal distribution of the phenotype. The dependence between relatives is modeled via a Gaussian copula. A score‐type test is derived, and several strategies to approximate its distribution under the null hypothesis are derived and investigated. The performance of the proposed test is assessed and compared with existing methods by simulations. The methodology is illustrated with an association study involving the adiponectin trait from the UK10K project. Copyright © 2015 John Wiley & Sons, Ltd.

15.
It is generally known that risk variants segregate together with a disease within families, but this information has not been used in the existing statistical methods for detecting rare variants. Here we introduce two weighted sum statistics that can apply to either genome-wide association data or resequencing data for identifying rare disease variants: weights calculated based on sibpairs and odds ratios, respectively. We evaluated the two methods via extensive simulations under different disease models. We compared the proposed methods with the weighted sum statistic (WSS) proposed by Madsen and Browning, keeping the same genotyping or resequencing cost. Our methods clearly demonstrate more statistical power than the WSS. In addition, we found that using sibpair information can increase power over using only unrelated samples by more than 40%. We applied our methods to the Framingham Heart Study (FHS) and Wellcome Trust Case Control Consortium (WTCCC) hypertension datasets. Although we did not identify any genes as reaching a genome-wide significance level, we found variants in the candidate gene angiotensinogen significantly associated with hypertension at P = 6.9 × 10^-4, whereas the most significant single SNP association evidence is P = 0.063. We further applied the odds ratio weighted method to the IFIH1 gene for type-1 diabetes in the WTCCC data. Our method yielded a P-value of 4.82 × 10^-4, much more significant than that obtained by haplotype-based methods. We demonstrated that family data are extremely informative in searching for rare variants underlying complex traits, and the odds ratio weighted sum statistic is more efficient than currently existing methods.

16.
In the last two decades, complex traits have become the main focus of genetic studies. The hypothesis that both rare and common variants are associated with complex traits is increasingly being discussed. Family‐based association studies using relatively large pedigrees are suitable for both rare and common variant identification. Because of the high cost of sequencing technologies, imputation methods are important for increasing the amount of information at low cost. A recent family‐based imputation method, Genotype Imputation Given Inheritance (GIGI), is able to handle large pedigrees and accurately impute rare variants, but does less well for common variants where population‐based methods perform better. Here, we propose a flexible approach to combine imputation data from both family‐ and population‐based methods. We also extend the Sequence Kernel Association Test for Rare and Common variants (SKAT‐RC), originally proposed for data from unrelated subjects, to family data in order to make use of such imputed data. We call this extension “famSKAT‐RC.” We compare the performance of famSKAT‐RC and several other existing burden and kernel association tests. In simulated pedigree sequence data, our results show an increase of imputation accuracy from use of our combining approach. Also, they show an increase of power of the association tests with this approach over the use of either family‐ or population‐based imputation methods alone, in the context of rare and common variants. Moreover, our results show better performance of famSKAT‐RC compared to the other considered tests, in most scenarios investigated here.

17.
There is an emerging interest in sequencing‐based association studies of multiple rare variants. Most association tests suggested in the literature involve collapsing rare variants with or without weighting. Recently, a variance‐component score test [sequence kernel association test (SKAT)] was proposed to address the limitations of collapsing methods. Although SKAT was shown to outperform most of the alternative tests, its applications and power might be restricted and influenced by missing genotypes. In this paper, we suggest a new method based on testing whether the fraction of causal variants in a region is zero. The new association test, T_REM, is derived from a random‐effects model and allows for missing genotypes, and the choice of weighting function is not required when common and rare variants are analyzed simultaneously. We performed simulations to study the type I error rates and power of four competing tests under various conditions on the sample size, genotype missing rate, variant frequency, effect directionality, and the number of non‐causal rare variants and/or causal common variants. The simulation results showed that T_REM was a valid test and less sensitive to the inclusion of non‐causal rare variants and/or low effect common variants or to the presence of missing genotypes. When the effects were more consistent in the same direction, T_REM also had better power performance. Finally, an application to the Shanghai Breast Cancer Study showed that rare causal variants at the FGFR2 gene were detected by T_REM and SKAT, but T_REM produced more consistent results for different sets of rare and common variants. Copyright © 2013 John Wiley & Sons, Ltd.

18.
In anticipation of the availability of next‐generation sequencing data, there is increasing interest in investigating association between complex traits and rare variants (RVs). In contrast to association studies for common variants (CVs), due to the low frequencies of RVs, common wisdom suggests that existing statistical tests for CVs might not work, motivating the recent development of several new tests for analyzing RVs, most of which are based on the idea of pooling/collapsing RVs. However, there is a lack of evaluations of, and thus guidance on the use of, existing tests. Here we provide a comprehensive comparison of various statistical tests using simulated data. We consider both independent and correlated rare mutations, and representative tests for both CVs and RVs. As expected, if there are no or few non‐causal (i.e. neutral or non‐associated) RVs in a locus of interest while the effects of causal RVs on the trait are all (or mostly) in the same direction (i.e. either protective or deleterious, but not both), then the simple pooled association tests (without selecting RVs and their association directions) and a new test called kernel‐based adaptive clustering (KBAC) perform similarly and are most powerful; KBAC is more robust than simple pooled association tests in the presence of non‐causal RVs; however, as the number of non‐causal CVs increases and/or in the presence of opposite association directions, the winners are two methods originally proposed for CVs and a new test called C‐alpha test proposed for RVs, each of which can be regarded as testing on a variance component in a random‐effects model. Interestingly, several methods based on sequential model selection (i.e. selecting causal RVs and their association directions), including two new methods proposed here, perform robustly and often have statistical power between those of the above two classes. Genet. Epidemiol. 35:606‐619, 2011. © 2011 Wiley Periodicals, Inc.

19.
Recent studies suggest that rare variants play an important role in the etiology of many traits. Although a number of methods have been developed for genetic association analysis of rare variants, they all assume a relatively homogeneous population under study. Such an assumption may not be valid for samples collected from admixed populations such as African Americans and Hispanic Americans as there is a great extent of local variation in ancestry in these populations. To ensure valid and more powerful rare variant association tests performed in admixed populations, we have developed a local ancestry‐based weighted dosage test, which is able to take into account local ancestry of rare alleles, uncertainties in rare variant imputation when imputed data are included, and the direction of effect that rare variants exert on phenotypic outcome. We used simulated sequence data to show that our proposed test has controlled type I error rates, whereas naïve application of existing rare variants tests and tests that adjust for global ancestry lead to inflated type I error rates. We showed that our test has higher power than tests without proper adjustment of ancestry. We also applied the proposed method to a candidate gene study on low‐density lipoprotein cholesterol. Our results suggest that it is important to appropriately control for potential population stratification induced by local ancestry difference in the analysis of rare variants in admixed populations.

20.
Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X‐linked variants have been reported for complex traits. For instance, dosage compensation of X‐linked genes is often achieved via the inactivation of one allele in each X‐linked variant in females; however, some X‐linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X‐linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X‐linked variant genetic association analysis of dichotomous phenotypes with family‐based samples. The proposed methods are computationally efficient and can complete X‐linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare‐variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X‐linked genes were identified, illustrating the practical importance of the proposed methods.
