Similar articles
 10 similar articles found (search time: 125 ms)
1.
Genome‐wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) associated with complex traits. However, the genetic heritability of most of these traits remains unexplained. To help guide future studies, we address the crucial question of whether future GWAS can detect new SNP associations and explain additional heritability given the new availability of larger GWAS SNP arrays, imputation, and reduced genotyping costs. We first describe the pairwise and imputation coverage of all SNPs in the human genome by commercially available GWAS SNP arrays, using the 1000 Genomes Project as a reference. Next, we describe the findings from 6 years of GWAS of 172 chronic diseases, calculating the power to detect each of them while taking array coverage and sample size into account. We then calculate the power to detect these SNP associations under different conditions using improved coverage and/or sample sizes. Finally, we estimate the percentages of SNP associations and heritability previously detected and detectable by future GWAS under each condition. Overall, we estimated that previous GWAS have detected less than one‐fifth of all GWAS‐detectable SNPs underlying chronic disease. Furthermore, increasing sample size has a much larger impact than increasing coverage on the potential of future GWAS to detect additional SNP‐disease associations and heritability.
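How sample size and array coverage jointly drive power can be illustrated with a standard textbook approximation. The sketch below is not the authors' calculation: it assumes a quantitative trait with unit variance, an additive 1-df test, and a tag SNP in LD r² with the causal SNP. The function name `gwas_power` and all parameter values are hypothetical.

```python
from statistics import NormalDist


def gwas_power(n, maf, beta, r2=1.0, alpha=5e-8):
    """Approximate power of a 1-df additive association test.

    Assumptions (illustrative only): quantitative trait with unit variance,
    per-allele effect `beta`, minor allele frequency `maf`, and a genotyped
    tag SNP in LD `r2` with the causal SNP (a stand-in for array coverage).
    """
    norm = NormalDist()
    # Non-centrality parameter: N * r^2 * variance explained by the causal SNP.
    ncp = n * r2 * 2.0 * maf * (1.0 - maf) * beta ** 2
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    root = ncp ** 0.5
    # Two-sided power under a normal approximation to the test statistic.
    return norm.cdf(root - z_crit) + norm.cdf(-root - z_crit)
```

In this approximation, sample size and r² enter the non-centrality parameter as a product, so raising r² from 0.8 to 1.0 increases it by at most 25%, whereas sample size can grow several-fold.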

2.
Genome‐wide association studies (GWAS) have been a standard practice in identifying single nucleotide polymorphisms (SNPs) for disease susceptibility. We propose a new approach, termed integrative GWAS (iGWAS), that exploits the information of gene expressions to investigate the mechanisms of the association of SNPs with a disease phenotype, and to incorporate the family‐based design for genetic association studies. Specifically, the relations among SNPs, gene expression, and disease are modeled within the mediation analysis framework, which allows us to disentangle the genetic effect on a disease phenotype into two parts: an effect mediated through a gene expression (mediation effect, ME) and an effect through other biological mechanisms or environment‐mediated mechanisms (alternative effect, AE). We develop omnibus tests for the ME and AE that are robust to underlying true disease models. Numerical studies show that the iGWAS approach is able to facilitate discovering genetic association mechanisms, and outperforms the SNP‐only method for testing genetic associations. We conduct a family‐based iGWAS of childhood asthma that integrates genetic and genomic data. The iGWAS approach identifies six novel susceptibility genes (MANEA, MRPL53, LYCAT, ST8SIA4, NDFIP1, and PTCH1) using the omnibus test with false discovery rate less than 1%, whereas no gene using SNP‐only analyses survives with the same cut‐off. The iGWAS analyses further characterize that genetic effects of these genes are mostly mediated through their gene expressions. In summary, the iGWAS approach provides a new analytic framework to investigate the mechanism of genetic etiology, and identifies novel susceptibility genes of childhood asthma that are biologically meaningful.

3.
The primary circulating form of vitamin D is 25‐hydroxy vitamin D (25(OH)D), a modifiable trait linked with a growing number of chronic diseases. In addition to environmental determinants of 25(OH)D, including dietary sources and skin ultraviolet B (UVB) exposure, twin‐ and family‐based studies suggest that genetics contribute substantially to vitamin D variability with heritability estimates ranging from 43% to 80%. Genome‐wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) located in four gene regions associated with 25(OH)D. These SNPs collectively explain only a fraction of the heritability in 25(OH)D estimated by twin‐ and family‐based studies. Using 25(OH)D concentrations and GWAS data on 5,575 subjects drawn from five cohorts, we hypothesized that genome‐wide data, in the form of (1) a polygenic score comprised of hundreds or thousands of SNPs that do not individually reach GWAS significance, or (2) a linear mixed model for genome‐wide complex trait analysis, would explain variance in measured circulating 25(OH)D beyond that explained by known genome‐wide significant 25(OH)D‐associated SNPs. GWAS‐identified SNPs explained 5.2% of the variation in circulating 25(OH)D in these samples, and there was little evidence that additional markers significantly improved predictive ability. On average, a polygenic score comprised of GWAS‐identified SNPs explained a larger proportion of variation in circulating 25(OH)D than scores comprised of thousands of SNPs that were, on average, nonsignificant. Employing a linear mixed model for genome‐wide complex trait analysis explained little additional variability (range 0–22%). The absence of a significant polygenic effect in this relatively large sample suggests an oligogenetic architecture for 25(OH)D.

4.
5.
Although type 2 diabetes (T2D) results from metabolic defects in insulin secretion and insulin sensitivity, most of the genetic risk loci identified to date relate to insulin secretion. We reported that T2D loci influencing insulin sensitivity may be identified through interactions with insulin secretion loci, thereby leading to T2D. Here, we hypothesize that joint testing of variant main effects and interaction effects with an insulin secretion locus increases power to identify genetic interactions leading to T2D. We tested this hypothesis with an intronic MTNR1B SNP, rs10830963, which is associated with acute insulin response to glucose, a dynamic measure of insulin secretion. rs10830963 was tested for interaction and joint (main + interaction) effects with genome‐wide data in African Americans (2,452 cases and 3,772 controls) from five cohorts. Genome‐wide genotype data (Affymetrix Human Genome 6.0 array) were imputed to a 1000 Genomes Project reference panel. T2D risk was modeled using logistic regression with rs10830963 dosage, age, sex, and principal component as predictors. Joint effects were captured using the Kraft two degrees of freedom test. Genome‐wide significant (P < 5 × 10⁻⁸) interaction with MTNR1B and joint effects were detected for CMIP intronic SNP rs17197883 (P_interaction = 1.43 × 10⁻⁸; P_joint = 4.70 × 10⁻⁸). CMIP variants have been nominally associated with T2D, fasting glucose, and adiponectin in individuals of East Asian ancestry, with high‐density lipoprotein, and with waist‐to‐hip ratio adjusted for body mass index in Europeans. These data support the hypothesis that additional genetic factors contributing to T2D risk, including insulin sensitivity loci, can be identified through interactions with insulin secretion loci.
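The idea of a two degrees of freedom joint test can be sketched with a Wald statistic. This is a swapped-in, asymptotically comparable formulation chosen for brevity, not necessarily the form of the Kraft test used in the study. The function name `joint_2df_wald` and its inputs (two log-odds estimates and their covariance entries, as a fitted logistic regression would report them) are hypothetical.

```python
import math


def joint_2df_wald(b_main, b_int, var_main, var_int, cov):
    """Wald version of a 2-df joint test of main and interaction effects.

    Inputs are the two coefficient estimates and the entries of their 2x2
    covariance matrix. Returns (chi-square statistic, p-value).
    """
    det = var_main * var_int - cov ** 2
    if var_main <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form b' V^{-1} b written out for the 2x2 case.
    stat = (var_int * b_main ** 2
            - 2.0 * cov * b_main * b_int
            + var_main * b_int ** 2) / det
    # The survival function of a chi-square with 2 df is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)
```

A variant can reach significance on the joint test even when neither the main effect nor the interaction term is significant alone, which is the power argument made in the abstract.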

6.
Genome‐wide association studies (GWAS) that draw samples from multiple studies with a mixture of relationship structures are becoming more common. Analytical methods exist for using mixed‐sample data, but few methods have been proposed for the analysis of genotype‐by‐environment (G×E) interactions. Using GWAS data from a study of sarcoidosis susceptibility genes in related and unrelated African Americans, we explored the current analytic options for genotype association testing in studies using both unrelated and family‐based designs. We propose a novel method, generalized least squares (GLX), to estimate both SNP and G×E interaction effects for categorical environmental covariates and compared this method to generalized estimating equations (GEE), logistic regression, the Cochran–Armitage trend test, and the WQLS and MQLS methods. We used simulation to demonstrate that the GLX method reduces type I error under a variety of pedigree structures. We also demonstrate its superior power to detect SNP effects while offering computational advantages and comparable power to detect G×E interactions versus GEE. Using this method, we found two novel SNPs that demonstrate a significant genome‐wide interaction with insecticide exposure: rs10499003 and rs7745248, located in the intronic and 3' UTR regions of the FUT9 gene on chromosome 6q16.1.

7.
Chronic obstructive pulmonary disease (COPD) is a progressive disease with both environmental and genetic risk factors. Genome‐wide association studies (GWAS) have identified multiple genomic regions influencing risk of COPD. To thoroughly investigate the genetic etiology of COPD, however, it is also important to explore the role of copy number variants (CNVs) because the presence of structural variants can alter gene expression and can be causal for some diseases. Here, we investigated effects of polymorphic CNVs on quantitative measures of pulmonary function and chest computed tomography (CT) phenotypes among subjects enrolled in COPDGene, a multisite study. COPDGene subjects consist of roughly one‐third African American (AA) and two‐thirds non‐Hispanic white adult smokers (with or without COPD). We estimated CNVs using PennCNV on 9,076 COPDGene subjects using Illumina's Omni‐Express genome‐wide marker array. We tested for association between polymorphic CNV components (defined as disjoint intervals of copy number regions) for several quantitative phenotypes associated with COPD within each racial group. Among the AAs, we identified a polymorphic CNV on chromosome 5q35.2 located between two genes (FAM153B and SIMK1, but also harboring several pseudo‐genes) giving genome‐wide significance in tests of association with total lung capacity (TLC_CT) as measured by chest CT scans. This is the first study of genome‐wide association tests of polymorphic CNVs and TLC_CT. Although the ARIC cohort did not have the phenotype of TLC_CT, we found similar counts of CNV deletions and amplifications among AA and European subjects in this second cohort.

8.
Genome‐wide association studies (GWASs) commonly use marginal association tests for each single‐nucleotide polymorphism (SNP). Because these tests treat SNPs as independent, their power will be suboptimal for detecting SNPs hidden by linkage disequilibrium (LD). One way to improve power is to use a multiple regression model. However, the large number of SNPs precludes simultaneous fitting with multiple regression, and subset regression is infeasible because of an exorbitant number of candidate subsets. We therefore propose a new method for detecting hidden SNPs having significant yet weak marginal association in a multiple regression model. Our method begins by constructing a bidirected graph locally around each SNP that demonstrates a moderately sized marginal association signal, the focal SNPs. Vertexes correspond to SNPs, and adjacency between vertexes is defined by an LD measure. Subsequently, the method collects from each graph all shortest paths to the focal SNP. Finally, for each shortest path the method fits a multiple regression model to all the SNPs lying in the path and tests the significance of the regression coefficient corresponding to the terminal SNP in the path. Simulation studies show that the proposed method can detect susceptibility SNPs hidden by LD that go undetected with marginal association testing or with existing multivariate methods. When applied to real GWAS data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), our method detected two groups of SNPs: one in a region containing the apolipoprotein E (APOE) gene, and another in a region close to the semaphorin 5A (SEMA5A) gene.
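The graph step described above can be sketched roughly as follows. This is a simplified illustration under several assumptions that are not from the abstract: adjacency is defined by thresholding a pairwise LD value, edges are unweighted, only one shortest path per SNP is kept (the abstract collects all shortest paths), and the regression step is omitted. The function name `shortest_paths_to_focal`, the input format, and the `threshold` and `max_len` parameters are hypothetical.

```python
from collections import deque


def shortest_paths_to_focal(ld, focal, threshold=0.2, max_len=3):
    """Return one shortest path from each reachable SNP to `focal`.

    `ld` maps frozenset({snp_a, snp_b}) -> LD value (hypothetical format,
    two distinct SNPs per key). Two SNPs are adjacent when their LD is
    >= `threshold`. Each path is a list that starts at the terminal SNP
    and ends at the focal SNP, with at most `max_len` edges.
    """
    # Build an undirected adjacency list from the pairwise LD table.
    adj = {}
    for pair, value in ld.items():
        if value >= threshold:
            a, b = tuple(pair)
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)

    # Breadth-first search outward from the focal SNP.
    parent = {focal: None}
    depth = {focal: 0}
    queue = deque([focal])
    while queue:
        node = queue.popleft()
        if depth[node] == max_len:
            continue
        for nxt in sorted(adj.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    # Walk parent pointers back to the focal SNP to recover each path.
    paths = {}
    for snp in parent:
        if snp == focal:
            continue
        path, cur = [], snp
        while cur is not None:
            path.append(cur)
            cur = parent[cur]
        paths[snp] = path
    return paths
```

Each returned path would then supply the predictor set for one multiple regression, with the coefficient of the first SNP in the list (the terminal SNP) being the one tested.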

9.
Polygenic prediction using genome‐wide SNPs can provide high prediction accuracy for complex traits. Here, we investigate the question of how to account for genetic ancestry when conducting polygenic prediction. We show that the accuracy of polygenic prediction in structured populations may be partly due to genetic ancestry. However, we hypothesized that explicitly modeling ancestry could improve polygenic prediction accuracy. We analyzed three GWAS of hair color (HC), tanning ability (TA), and basal cell carcinoma (BCC) in European Americans (sample size from 7,440 to 9,822) and considered two widely used polygenic prediction approaches: polygenic risk scores (PRSs) and best linear unbiased prediction (BLUP). We compared polygenic prediction without correction for ancestry to polygenic prediction with ancestry as a separate component in the model. In 10‐fold cross‐validation using the PRS approach, the R² for HC increased by 66% (0.0456 to 0.0755; P < 10⁻¹⁶), the R² for TA increased by 123% (0.0154 to 0.0344; P < 10⁻¹⁶), and the liability‐scale R² for BCC increased by 68% (0.0138 to 0.0232; P < 10⁻¹⁶) when explicitly modeling ancestry, which prevents ancestry effects from entering into each SNP effect and being overweighted. Surprisingly, explicitly modeling ancestry produces a similar improvement when using the BLUP approach, which fits all SNPs simultaneously in a single variance component and causes ancestry to be underweighted. We validate our findings via simulations, which show that the differences in prediction accuracy will increase in magnitude as sample sizes increase. In summary, our results show that explicitly modeling ancestry can be important in both PRS and BLUP prediction.
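Treating ancestry as a separate model component can be illustrated with a minimal sketch. It assumes the common weighted-allele-count definition of a PRS and a linear predictor in which ancestry principal components get their own weights. The function names and the weights `w_prs` and `w_pcs` are hypothetical, and this is not the authors' implementation.

```python
def polygenic_score(dosages, betas):
    """Weighted allele count: sum over SNPs of beta_j * dosage_j."""
    if len(dosages) != len(betas):
        raise ValueError("dosages and betas must have the same length")
    return sum(b * g for b, g in zip(betas, dosages))


def predict_with_ancestry(dosages, betas, pcs, w_prs, w_pcs, intercept=0.0):
    """Combine a PRS and ancestry PCs as separate model components.

    `w_prs` and `w_pcs` stand for weights estimated in training data, for
    example by regressing the phenotype on the PRS and the PCs jointly.
    """
    if len(pcs) != len(w_pcs):
        raise ValueError("pcs and w_pcs must have the same length")
    prs = polygenic_score(dosages, betas)
    ancestry = sum(w * pc for w, pc in zip(w_pcs, pcs))
    return intercept + w_prs * prs + ancestry
```

Keeping the two terms separate lets the ancestry weights be fitted directly instead of being absorbed into the individual SNP effects.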

10.
Genome‐wide scans of nucleotide variation in human subjects are providing an increasing number of replicated associations with complex disease traits. Most of the variants detected have small effects and, collectively, they account for a small fraction of the total genetic variance. Very large sample sizes are required to identify and validate findings. In this situation, even small sources of systematic or random error can cause spurious results or obscure real effects. The need for careful attention to data quality has been appreciated for some time in this field, and a number of strategies for quality control and quality assurance (QC/QA) have been developed. Here we extend these methods and describe a system of QC/QA for genotypic data in genome‐wide association studies (GWAS). This system includes some new approaches that (1) combine analysis of allelic probe intensities and called genotypes to distinguish gender misidentification from sex chromosome aberrations, (2) detect autosomal chromosome aberrations that may affect genotype calling accuracy, (3) infer DNA sample quality from relatedness and allelic intensities, (4) use duplicate concordance to infer SNP quality, (5) detect genotyping artifacts from dependence of Hardy‐Weinberg equilibrium test P‐values on allelic frequency, and (6) demonstrate sensitivity of principal components analysis to SNP selection. The methods are illustrated with examples from the “Gene Environment Association Studies” (GENEVA) program. The results suggest several recommendations for QC/QA in the design and execution of GWAS. Genet. Epidemiol. 34: 591–602, 2010. © 2010 Wiley‐Liss, Inc.
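Approach (5) relies on a Hardy-Weinberg equilibrium test for each SNP. A minimal sketch of the standard 1-df chi-square version follows. The abstract does not say which HWE test the authors used (exact tests are also common), so this is an assumption, and the function name `hwe_chisq` is hypothetical.

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """1-df chi-square test of Hardy-Weinberg equilibrium from genotype counts.

    Returns (frequency of allele A, chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        # Monomorphic SNP: the test is not informative.
        return p, 0.0, 1.0
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # The survival function of a chi-square with 1 df is erfc(sqrt(x / 2)).
    return p, stat, math.erfc(math.sqrt(stat / 2.0))
```

Because the function returns the allele frequency together with the P-value, the results for many SNPs can be plotted against each other to look for the frequency dependence that the abstract treats as a sign of genotyping artifacts.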
