Similar articles
Found 20 similar articles (search time: 31 ms)
1.
2.
Genome-wide association studies (GWAS) have successfully identified thousands of genetic variants contributing to disease and other phenotypes. However, significant obstacles hamper our ability to elucidate causal variants, identify genes affected by causal variants, and characterize the mechanisms by which genotypes influence phenotypes. The increasing availability of genome-wide functional annotation data is providing unique opportunities to incorporate prior information into the analysis of GWAS to better understand the impact of variants on disease etiology. Although there have been many advances in incorporating prior information into prioritization of trait-associated variants in GWAS, functional annotation data have played a secondary role in the joint analysis of GWAS and molecular (i.e., expression) quantitative trait loci (eQTL) data in assessing evidence for association. To address this, we develop a novel mediation framework, iFunMed, to integrate GWAS and eQTL data with the utilization of publicly available functional annotation data. iFunMed extends the scope of standard mediation analysis by incorporating information from multiple genetic variants at a time and leveraging variant-level summary statistics. Data-driven computational experiments convey how informative annotations improve single-nucleotide polymorphism (SNP) selection performance while emphasizing robustness of iFunMed to noninformative annotations. Application to Framingham Heart Study data indicates that iFunMed is able to boost detection of SNPs with mediation effects that can be attributed to regulatory mechanisms.

3.
We study the problem of testing for single marker‐multiple phenotype associations based on genome‐wide association study (GWAS) summary statistics without access to individual‐level genotype and phenotype data. Because summary data for most published GWASs are substantially easier to obtain than individual‐level phenotype and genotype data, and multiple correlated traits have often been collected, this problem has become increasingly important. We propose a powerful adaptive test and compare its performance with some existing tests. We illustrate its applications to analyses of a meta‐analyzed GWAS dataset with three blood lipid traits and another with sex‐stratified anthropometric traits, and further demonstrate its potential power gain over some existing methods through realistic simulation studies. We start from the situation with only one set of (possibly meta‐analyzed) genome‐wide summary statistics, then extend the method to meta‐analysis of multiple sets of genome‐wide summary statistics, each from one GWAS. We expect the proposed test to be useful in practice, being either more powerful than or complementary to existing methods.

4.
We propose a two-stage approach to analyze genome-wide association data in order to identify a set of promising single-nucleotide polymorphisms (SNPs). In stage one, we select a list of top signals from single SNP analyses by controlling false discovery rate. In stage two, we use the least absolute shrinkage and selection operator (LASSO) regression to reduce false positives. The proposed approach was evaluated using simulated quantitative traits based on genome-wide SNP data on 8,861 Caucasian individuals from the Atherosclerosis Risk in Communities (ARIC) Study. Our first stage, targeted at controlling false negatives, yields better power than using a Bonferroni-corrected significance level. The LASSO regression reduces the number of significant SNPs in stage two: it reduces false-positive SNPs and, because of linkage disequilibrium, also reduces true-positive SNPs at simulated causal loci. Interestingly, the LASSO regression preserves the power from stage one, i.e., the number of causal loci detected from the LASSO regression in stage two is almost the same as in stage one, while reducing false positives further. Real data on systolic blood pressure in the ARIC study were analyzed using our two-stage approach, which identified two significant SNPs, one of which was reported to be genome-wide significant in a meta-analysis containing a much larger sample size. On the other hand, a single SNP association scan did not yield any significant results.
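As a rough illustration of the stage-one selection described above, the sketch below applies the Benjamini-Hochberg step-up rule to a list of single-SNP p-values. This is an assumption made for illustration: the abstract says only that the false discovery rate is controlled and does not name the procedure. The function name `benjamini_hochberg` and the p-values are hypothetical, and the stage-two LASSO fit is not shown.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return indices of tests rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    # Sort test indices by ascending p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical single-SNP p-values, for illustration only.
snp_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.060, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(snp_pvalues, alpha=0.05)
print(selected)  # [0, 1]
```

In a two-stage design of this kind, the SNPs returned here would be the candidates passed on to the penalized regression in stage two.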

5.
Genome‐wide association studies (GWAS) have confirmed the ubiquitous existence of genetic heterogeneity for common disease: multiple common genetic variants have been identified to be associated, while many more are yet expected to be uncovered. However, the single SNP (single‐nucleotide polymorphism) based trend test (or its variants) that has been dominantly used in GWAS is based on contrasting the allele frequency difference between the case and control groups, completely ignoring possible genetic heterogeneity. In spite of the widely accepted notion of genetic heterogeneity, we are not aware of any previous attempt to apply genetic heterogeneity motivated methods in GWAS. Here, to explicitly account for unknown genetic heterogeneity, we applied a mixture model based single‐SNP test to the Wellcome Trust Case Control Consortium (WTCCC) GWAS data with traits of Crohn's disease, bipolar disease, coronary artery disease, and type 2 diabetes, identifying much larger numbers of significant SNPs and risk loci for each trait than those of the popular trend test, demonstrating potential power gain of the mixture model based test.

6.
To date, genome-wide association studies (GWASs) have identified thousands of genetic variants associated with numerous human traits and diseases. GWASs typically test the association between a single trait and genetic variants. However, the joint analysis of multiple traits and single nucleotide polymorphisms (SNPs) may better reflect the physiological processes of complex diseases; such a study is called pleiotropy association analysis. Modern GWASs report only summary statistics rather than individual-level phenotype and genotype data, to avoid logistical and privacy issues. Existing methods for combining GWAS summary statistics across multiple phenotypes focus mainly on low-dimensional phenotypes and lose power in high-dimensional cases. To overcome this limitation, we propose two kinds of truncated tests to combine summary statistics from multiple phenotypes. Extensive simulations show that the proposed methods are robust and powerful when the dimension of the phenotypes is high and only some of the phenotypes are associated with the SNPs. We apply the proposed methods to blood cytokine data collected from a Finnish population. The results show that the proposed tests can identify additional genetic markers that are missed by single-trait analysis.

7.
Genome‐wide association studies (GWAS) for complex diseases have focused primarily on single‐trait analyses for disease status and disease‐related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL‐cholesterol, HDL‐cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual‐level data. Here, we develop metaUSAT (where USAT is unified score‐based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual‐level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P‐value for association and is computationally efficient for implementation at a genome‐wide level. Simulation experiments show that metaUSAT maintains proper type‐I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D‐GENES studies, metaUSAT detected genome‐wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of a common disease or traits.

8.
There has been an increasing interest in joint association testing of multiple traits for possible pleiotropic effects. However, even in the presence of pleiotropy, most of the existing methods cannot distinguish direct and indirect effects of a genetic variant, say single‐nucleotide polymorphism (SNP), on multiple traits, and a conditional analysis of a trait adjusting for other traits is perhaps the simplest and most common approach to addressing this question. However, without individual‐level genotypic and phenotypic data but with only genome‐wide association study (GWAS) summary statistics, as typical with most large‐scale GWAS consortium studies, we are not aware of any existing method for such a conditional analysis. We propose such a conditional analysis, offering formulas of necessary calculations to fit a joint linear regression model for multiple quantitative traits. Furthermore, our method can also accommodate conditional analysis on multiple SNPs in addition to on multiple quantitative traits, which is expected to be useful for fine mapping. We provide numerical examples based on both simulated and real GWAS data to demonstrate the effectiveness of our proposed approach, and illustrate possible usefulness of conditional analysis by contrasting its result differences from those of standard marginal analyses.

9.
10.
11.
Genetic variants associated with fasting glucose in European ancestry populations are increasingly well understood. However, the nature of the associations between these single nucleotide polymorphisms (SNPs) and fasting glucose in other racial and ethnic groups is unclear. We sought to examine regions previously identified to be associated with fasting glucose in Caucasian genome-wide association studies (GWAS) across multiple ethnicities in the Multiethnic Study of Atherosclerosis (MESA). Nondiabetic MESA participants with fasting glucose measured at the baseline exam and with GWAS genotyping were included; 2,349 Caucasians, 664 individuals of Chinese descent, 1,366 African Americans, and 1,171 Hispanics. Genotype data were generated from the Affymetrix 6.0 array and imputation in IMPUTE. Fasting glucose was regressed on SNP dosage data in each ethnic group adjusting for age, gender, MESA study center, and ethnic-specific principal components. SNPs from the three gene regions with the strongest associations to fasting glucose in previous Caucasian GWAS (MTNR1B / GCK / G6PC2) were examined in depth. There was limited power to replicate associations in other ethnic groups due to smaller allele frequencies and limited sample size; SNP associations may also have differed across ethnic groups due to differing linkage disequilibrium patterns with causal variants. rs10830963 in MTNR1B and rs4607517 in GCK demonstrated consistent magnitude and direction of association with fasting glucose across ethnic groups, although the associations were often not nominally significant. In conclusion, certain SNPs in MTNR1B and GCK demonstrate consistent effects across four racial and ethnic groups, narrowing the putative region for these causal variants.

12.
13.
Hypertension is a complex disorder caused by genetic and environmental risk factors. Recently, genome-wide association studies (GWASs) identified more than 100 genetic variants for blood pressure traits and hypertension. However, the interactions between these genetic variants and environmental factors have not been systematically investigated. Therefore, we examined the interaction between genetic and environmental risk factors in blood pressure traits using the genetic risk score (GRS). Two Korean community-based cohorts, Cohort I (KARE; N = 8,840) and Cohort II (CAVAS; N = 9,599), were used for this study, and GRSs were calculated from 42 GWAS single-nucleotide polymorphisms (SNPs) that were validated for their association in these cohorts. We calculated GRSs in two ways: by considering the effect size of each SNP (weighted GRS) and by not considering the effect sizes (unweighted GRS). The unweighted GRS was strongly associated with systolic blood pressure, diastolic blood pressure, and hypertension (p = 9.03 × 10^-47, p = 9.41 × 10^-48, and p = 3.22 × 10^-55 by meta-analysis, respectively), and the weighted GRS showed similar results. The environmental factors of body mass index, waist circumference, and drinking status were significantly associated with blood pressure traits, and the interactions between these factors and GRSs were examined. However, no interactions were found with either the GRS or the individual SNPs considered for the GRS. Our findings show that it is challenging to find GRS–environment interactions regarding blood pressure traits.
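The difference between the weighted and unweighted GRS mentioned above can be sketched as follows. This is a minimal illustration under the common convention that an unweighted score is the count of risk alleles and a weighted score multiplies each count by that SNP's effect size; the abstract does not give the exact formula used in the study. The function name `genetic_risk_score` and all numbers are hypothetical.

```python
def genetic_risk_score(allele_counts, effect_sizes=None):
    """Sum risk-allele counts for one person, optionally weighted per SNP."""
    if effect_sizes is None:
        # Unweighted GRS: every risk allele contributes equally.
        return float(sum(allele_counts))
    if len(effect_sizes) != len(allele_counts):
        raise ValueError("need exactly one effect size per SNP")
    # Weighted GRS: each allele count is scaled by that SNP's effect size.
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))


# Hypothetical risk-allele counts (0, 1, or 2) at four SNPs for one person,
# with hypothetical per-allele effect sizes.
counts = [2, 1, 0, 1]
betas = [0.5, 0.2, 0.1, 0.3]

print(genetic_risk_score(counts))         # unweighted score
print(genetic_risk_score(counts, betas))  # weighted score
```

The two scores rank individuals differently only when effect sizes vary across SNPs, which may be one reason the two versions can give similar results.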

14.
While genome-wide association studies (GWASs) have been widely used to uncover associations between diseases and genetic variants, standard SNP-level GWASs often lack the power to identify SNPs that individually have a moderate effect size but jointly contribute to the disease. To overcome this problem, pathway-based GWAS methods have been developed as an alternative strategy that complements SNP-level approaches. We propose a Bayesian method that uses the generalized fused hierarchical structured variable selection prior to identify pathways associated with the disease using SNP-level summary statistics. Our prior has the flexibility to take in pathway structural information so that it can model the gene-level correlation based on prior biological knowledge, an important feature that makes it appealing compared to existing pathway-based methods. Using simulations, we show that our method outperforms competing methods in various scenarios, particularly when we have pathway structural information that involves complex gene-gene interactions. We apply our method to the Wellcome Trust Case Control Consortium Crohn's disease GWAS data, demonstrating its practical application to real data.

15.
Recently, large scale genome‐wide association study (GWAS) meta‐analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one‐at‐a‐time. This complicates the ability of fine‐mapping to identify a small set of SNPs for further functional follow‐up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re‐analysis of published marginal summary statistics under joint multi‐SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi‐region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta‐analysis of glucose and insulin related traits consortium) – a GWAS meta‐analysis of more than 15,000 people. We re‐analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggested that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index.

16.
Obesity is a well-established risk factor for endometrial cancer, the most common gynecologic malignancy. Recent genome-wide association studies (GWAS) have identified multiple genetic markers for obesity. The authors evaluated the association of obesity-related single nucleotide polymorphisms (SNPs) with endometrial cancer using GWAS data from their recently completed study, the Shanghai Endometrial Cancer Genetics Study, which comprised 832 endometrial cancer cases and 2,049 controls (1996-2005). Thirty-five SNPs previously associated with obesity or body mass index (BMI; weight (kg)/height (m)^2) at a minimum significance level of ≤5 × 10^-7 in the US National Human Genome Research Institute's GWAS catalog (http://genome.gov/gwastudies) and representing 26 unique loci were evaluated by either direct genotyping or imputation. The authors found that for 22 of the 26 unique loci tested (84.6%), the BMI-associated risk variants were present at a higher frequency in cases than in population controls (P = 0.0003). Multiple regression analysis showed that 9 of 35 BMI-associated variants, representing 7 loci, were significantly associated (P ≤ 0.05) with the risk of endometrial cancer; for all but 1 SNP, the direction of association was consistent with that found for BMI. For consistent SNPs, the allelic odds ratios ranged from 1.15 to 1.29. These 7 loci are in the SEC16B/RASAL, TMEM18, MSRA, SOX6, MTCH2, FTO, and MC4R genes. The associations persisted after adjustment for BMI, suggesting that genetic markers of obesity provide value in addition to BMI in predicting endometrial cancer risk.
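One way to sanity-check a figure such as "22 of 26 loci, P = 0.0003" is a one-sided sign test, which asks how likely 22 or more of 26 loci would point in the same direction if each direction were equally probable. The abstract does not state which test the authors used, so the choice of test here is an assumption, and the function name `sign_test_upper_tail` is hypothetical.

```python
from math import comb


def sign_test_upper_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    # Count the outcomes with at least `successes` heads among 2**trials.
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


p_value = sign_test_upper_tail(22, 26)
print(round(22 / 26 * 100, 1))  # 84.6
print(round(p_value, 4))        # 0.0003
```

Under this assumed test the tail probability is 17,902 / 2^26, about 0.00027, which rounds to the value quoted in the abstract.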

17.
Polygenic risk scores (PRSs) are a method to summarize the additive trait variance captured by a set of SNPs, and can increase the power of set‐based analyses by leveraging public genome‐wide association study (GWAS) datasets. PRS aims to assess the genetic liability to some phenotype on the basis of polygenic risk for the same or different phenotype estimated from independent data. We propose the application of PRSs as a set‐based method with an additional component of adjustment for linkage disequilibrium (LD), with potential extension of the PRS approach to analyze biologically meaningful SNP sets. We call this method POLARIS: POlygenic Ld‐Adjusted RIsk Score. POLARIS identifies the LD structure of SNPs using spectral decomposition of the SNP correlation matrix and replaces the individuals' SNP allele counts with LD‐adjusted dosages. Using a raw genotype dataset together with SNP effect sizes from a second independent dataset, POLARIS can be used for set‐based analysis. MAGMA is an alternative set‐based approach employing principal component analysis to account for LD between markers in a raw genotype dataset. We used simulations, both with simple constructed and real LD‐structure, to compare the power of these methods. POLARIS shows more power than MAGMA applied to the raw genotype dataset only, but less or comparable power to combined analysis of both datasets. POLARIS has the advantages that it produces a risk score per person per set using all available SNPs, and aims to increase power by leveraging the effect sizes from the discovery set in a self‐contained test of association in the test dataset.

18.
Li M, Ye C, Fu W, Elston RC, Lu Q. Genetic Epidemiology, 2011, 35(6): 457-468
The genetic etiology of complex human diseases has been commonly viewed as a process that involves multiple genetic variants, environmental factors, as well as their interactions. Statistical approaches, such as the multifactor dimensionality reduction (MDR) and generalized MDR (GMDR), have recently been proposed to test the joint association of multiple genetic variants with either dichotomous or continuous traits. In this study, we propose a novel Forward U-Test to evaluate the combined effect of multiple loci on quantitative traits with consideration of gene-gene/gene-environment interactions. In this new approach, a U-Statistic-based forward algorithm is first used to select potential disease-susceptibility loci and then a weighted U-statistic is used to test the joint association of the selected loci with the disease. Through a simulation study, we found the Forward U-Test outperformed GMDR in terms of greater power. Aside from that, our approach is less computationally intensive, making it feasible for high-dimensional gene-gene/gene-environment research. We illustrate our method with a real data application to nicotine dependence (ND), using three independent datasets from the Study of Addiction: Genetics and Environment. Our gene-gene interaction analysis of 155 SNPs in 67 candidate genes identified two SNPs, rs16969968 within gene CHRNA5 and rs1122530 within gene NTRK2, jointly associated with the level of ND (P-value = 5.31e-7). The association, which involves essential interaction, is replicated in two independent datasets with P-values of 1.08e-5 and 0.02, respectively. Our finding suggests that joint action may exist between the two gene products.

19.
A key aim for current genome-wide association studies (GWAS) is to interrogate the full spectrum of genetic variation underlying human traits, including rare variants, across populations. Deep whole-genome sequencing is the gold standard to fully capture genetic variation, but remains prohibitively expensive for large sample sizes. Array genotyping interrogates a sparser set of variants, which can be used as a scaffold for genotype imputation to capture a wider set of variants. However, imputation quality depends crucially on reference panel size and genetic distance from the target population. Here, we consider sequencing a subset of GWAS participants and imputing the rest using a reference panel that includes both sequenced GWAS participants and an external reference panel. We investigate how imputation quality and GWAS power are affected by the number of participants sequenced for admixed populations (African and Latino Americans) and European population isolates (Sardinians and Finns), and identify powerful, cost-effective GWAS designs given current sequencing and array costs. For populations that are well-represented in existing reference panels, we find that array genotyping alone is cost-effective and well-powered to detect common- and rare-variant associations. For poorly represented populations, sequencing a subset of participants is often most cost-effective, and can substantially increase imputation quality and GWAS power.

20.
Genome‐wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) associated with complex traits. However, the genetic heritability of most of these traits remains unexplained. To help guide future studies, we address the crucial question of whether future GWAS can detect new SNP associations and explain additional heritability given the new availability of larger GWAS SNP arrays, imputation, and reduced genotyping costs. We first describe the pairwise and imputation coverage of all SNPs in the human genome by commercially available GWAS SNP arrays, using the 1000 Genomes Project as a reference. Next, we describe the findings from 6 years of GWAS of 172 chronic diseases, calculating the power to detect each of them while taking array coverage and sample size into account. We then calculate the power to detect these SNP associations under different conditions using improved coverage and/or sample sizes. Finally, we estimate the percentages of SNP associations and heritability previously detected and detectable by future GWAS under each condition. Overall, we estimated that previous GWAS have detected less than one‐fifth of all GWAS‐detectable SNPs underlying chronic disease. Furthermore, increasing sample size has a much larger impact than increasing coverage on the potential of future GWAS to detect additional SNP‐disease associations and heritability.
