Found 20 similar articles (search time: 0 ms)
1.
Schaid DJ Sinnwell JP Jenkins GD McDonnell SK Ingle JN Kubo M Goss PE Costantino JP Wickerham DL Weinshilboum RM 《Genetic epidemiology》2012,36(1):3-16
Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome-wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses.
2.
Beaty TH Ruczinski I Murray JC Marazita ML Munger RG Hetmanski JB Murray T Redett RJ Fallin MD Liang KY Wu T Patel PJ Jin SC Zhang TX Schwender H Wu-Chou YH Chen PK Chong SS Cheah F Yeow V Ye X Wang H Huang S Jabs EW Shi B Wilcox AJ Lie RT Jee SH Christensen K Doheny KF Pugh EW Ling H Scott AF 《Genetic epidemiology》2011,35(6):469-478
Nonsyndromic cleft palate (CP) is a common birth defect with a complex and heterogeneous etiology involving both genetic and environmental risk factors. We conducted a genome-wide association study (GWAS) using 550 case-parent trios, ascertained through a CP case collected in an international consortium. Family-based association tests of single nucleotide polymorphisms (SNPs) and three common maternal exposures (maternal smoking, alcohol consumption, and multivitamin supplementation) were used in a combined 2 df test for gene (G) and gene-environment (G × E) interaction simultaneously, plus a separate 1 df test for G × E interaction alone. Conditional logistic regression models were used to estimate effects on risk to exposed and unexposed children. While no SNP achieved genome-wide significance when considered alone, markers in several genes attained or approached genome-wide significance when G × E interaction was included. Among these, MLLT3 and SMC2 on chromosome 9 showed multiple SNPs resulting in an increased risk if the mother consumed alcohol during the peri-conceptual period (3 months prior to conception through the first trimester). TBK1 on chr. 12 and ZNF236 on chr. 18 showed multiple SNPs associated with higher risk of CP in the presence of maternal smoking. Additional evidence of reduced risk due to G × E interaction in the presence of multivitamin supplementation was observed for SNPs in BAALC on chr. 8. These results emphasize the need to consider G × E interaction when searching for genes influencing risk to complex and heterogeneous disorders, such as nonsyndromic CP.
3.
The large number of markers considered in a genome‐wide association study (GWAS) has resulted in a simplification of analyses conducted. Most studies are analyzed one marker at a time using simple tests like the trend test. Methods that account for the special features of genetic association studies, yet remain computationally feasible for genome‐wide analysis, are desirable as they may lead to increased power to detect associations. Haplotype sharing attempts to translate between population genetics and genetic epidemiology. Near a recent mutation that increases disease risk, haplotypes of case participants should be more similar to each other than haplotypes of control participants; conversely, the opposite pattern may be found near a recent mutation that lowers disease risk. We give computationally simple association tests based on haplotype sharing that can be easily applied to GWASs while allowing use of fast (but not likelihood‐based) haplotyping algorithms and properly accounting for the uncertainty introduced by using inferred haplotypes. We also give haplotype‐sharing analyses that adjust for population stratification. Applying our methods to a GWAS of Parkinson's disease, we find a genome‐wide significant signal in the CAST gene that is not found by single‐SNP methods. Further, a missing‐data artifact that causes a spurious single‐SNP association on chromosome 9 does not impact our test. Genet. Epidemiol. 33:657–667, 2009. Published 2009 Wiley‐Liss, Inc.
4.
Marilyn C. Cornelis Arpana Agrawal John W. Cole Nadia N. Hansel Kathleen C. Barnes Terri H. Beaty Siiri N. Bennett Laura J. Bierut Eric Boerwinkle Kimberly F. Doheny Bjarke Feenstra Eleanor Feingold Myriam Fornage Christopher A. Haiman Emily L. Harris M. Geoffrey Hayes John A. Heit Frank B. Hu Jae H. Kang Cathy C. Laurie Hua Ling Teri A. Manolio Mary L. Marazita Rasika A. Mathias Daniel B. Mirel Justin Paschall Louis R. Pasquale Elizabeth W. Pugh John P. Rice Jenna Udren Rob M. van Dam Xiaojing Wang Janey L. Wiggs Kayleen Williams Kai Yu 《Genetic epidemiology》2010,34(4):364-372
Genome‐wide association studies (GWAS) have emerged as powerful means for identifying genetic loci related to complex diseases. However, the role of environment and its potential to interact with key loci has not been adequately addressed in most GWAS. Networks of collaborative studies involving different study populations and multiple phenotypes provide a powerful approach for addressing the challenges in analysis and interpretation shared across studies. The Gene, Environment Association Studies (GENEVA) consortium was initiated to: identify genetic variants related to complex diseases; identify variations in gene‐trait associations related to environmental exposures; and ensure rapid sharing of data through the database of Genotypes and Phenotypes. GENEVA consists of several academic institutions, including a coordinating center, two genotyping centers and 14 independently designed studies of various phenotypes, as well as several Institutes and Centers of the National Institutes of Health led by the National Human Genome Research Institute. Minimum detectable effect sizes include relative risks ranging from 1.24 to 1.57 and proportions of variance explained ranging from 0.0097 to 0.02. Given the large number of research participants (N>80,000), an important feature of GENEVA is harmonization of common variables, which allow analyses of additional traits. Environmental exposure information available from most studies also enables testing of gene‐environment interactions. Facilitated by its sizeable infrastructure for promoting collaboration, GENEVA has established a unified framework for genotyping, data quality control, analysis and interpretation. By maximizing knowledge obtained through collaborative GWAS incorporating environmental exposure information, GENEVA aims to enhance our understanding of disease etiology, potentially identifying opportunities for intervention. Genet. Epidemiol. 34: 364–372, 2010. © 2010 Wiley‐Liss, Inc.
5.
6.
Despite the success of genome-wide association studies, much of the genetic contribution to complex human traits is still unexplained. One potential source of genetic variation that may contribute to this "missing heritability" is that which differs in magnitude and/or direction between males and females, which could result from sexual dimorphism in gene expression. Such sex-differentiated effects are common in model organisms, and are becoming increasingly evident in human complex traits through large-scale male- and female-specific meta-analyses. In this article, we review the methodology for meta-analysis of sex-specific genome-wide association studies, and propose a sex-differentiated test of association with quantitative or dichotomous traits, which allows for heterogeneity of allelic effects between males and females. We perform detailed simulations to compare the power of the proposed sex-differentiated meta-analysis with the more traditional "sex-combined" approach, which is ambivalent to gender. The results of this study highlight only a small loss in power for the sex-differentiated meta-analysis when the allelic effects of the causal variant are the same in males and females. However, over a range of models of heterogeneity in allelic effects between genders, our sex-differentiated meta-analysis strategy offers substantial gains in power, and thus has the potential to discover novel loci contributing effects to complex human traits with existing genome-wide association data.
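The abstract above does not give its test statistics, so the following is only a sketch of one common way to build such tests, under stated assumptions: each sex contributes an effect estimate and standard error, the sex-differentiated statistic is the sum of the two squared z-scores (2 df), the sex-combined statistic comes from a fixed-effects inverse-variance average (1 df), and their difference serves as a 1 df heterogeneity statistic. The function name and return keys are hypothetical.

```python
import math


def sex_differentiated_test(beta_m, se_m, beta_f, se_f):
    """Sketch of sex-differentiated, sex-combined and heterogeneity tests.

    Assumes independent male and female effect estimates with known
    standard errors; this is an illustration, not the authors' code.
    """
    z_m = beta_m / se_m
    z_f = beta_f / se_f

    # Sex-differentiated statistic: sum of squared z-scores, 2 df.
    x_diff = z_m ** 2 + z_f ** 2

    # Sex-combined statistic: fixed-effects inverse-variance average, 1 df.
    w_m = 1.0 / se_m ** 2
    w_f = 1.0 / se_f ** 2
    beta_c = (w_m * beta_m + w_f * beta_f) / (w_m + w_f)
    x_comb = beta_c ** 2 * (w_m + w_f)

    # Heterogeneity statistic: what the combined test leaves out, 1 df.
    x_het = max(x_diff - x_comb, 0.0)

    return {
        "x_diff": x_diff,
        "p_diff": math.exp(-x_diff / 2.0),              # chi-square tail, 2 df
        "x_comb": x_comb,
        "p_comb": math.erfc(math.sqrt(x_comb / 2.0)),   # chi-square tail, 1 df
        "x_het": x_het,
        "p_het": math.erfc(math.sqrt(x_het / 2.0)),     # chi-square tail, 1 df
    }


# Opposite effects in males and females cancel in the combined test
# but are retained by the sex-differentiated test.
result = sex_differentiated_test(0.2, 0.05, -0.2, 0.05)
```

With effects of equal size and opposite sign, the combined estimate is zero while the 2 df statistic stays large, which is the situation in which the abstract reports gains in power.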
7.
Kengo Nagashima Yasunori Sato Hisashi Noma Chikuma Hamada 《Statistics in medicine》2013,32(27):4838-4858
Powerful array‐based single‐nucleotide polymorphism‐typing platforms have recently heralded a new era in which genome‐wide studies are conducted with increasing frequency. A genetic polymorphism associated with population pharmacokinetics (PK) is typically analyzed using nonlinear mixed‐effect models (NLMM). Applying NLMM to large‐scale data, such as those generated by genome‐wide studies, raises several issues related to the assumption of random effects as follows: (i) computation time: it takes a long time to compute the marginal likelihood; (ii) convergence of iterative calculation: an adaptive Gauss–Hermite quadrature is generally used to estimate NLMM; however, iterative calculations may not converge in complex models; and (iii) random‐effects misspecification leads to slightly inflated type‐I error rates. As an alternative effective approach to resolving these issues, in this article, we propose a generalized estimating equation (GEE) approach for analyzing population PK data. In general, GEE analysis does not account for interindividual variability in PK parameters; therefore, the usual GEE estimators cannot be interpreted straightforwardly, and their validities have not been justified. Here, we propose valid inference methods for using GEE even under conditions of interindividual variability and provide theoretical justifications of the proposed GEE estimators for population PK data. In numerical evaluations by simulations, the proposed GEE approach exhibited high computational speed and stability relative to the NLMM approach. Furthermore, the NLMM analysis was sensitive to the misspecification of the random‐effects distribution, and the proposed GEE inference is valid for any distributional form. We provided an illustration by using data from a genome‐wide pharmacogenomic study of an anticancer drug. Copyright © 2013 John Wiley & Sons, Ltd.
8.
Chen SH Sun J Dimitrov L Turner AR Adams TS Meyers DA Chang BL Zheng SL Grönberg H Xu J Hsu FC 《Genetic epidemiology》2008,32(2):152-167
Although genetic factors play an important role in most human diseases, multiple genes or genes and environmental factors may influence individual risk. In order to understand the underlying biological mechanisms of complex diseases, it is important to understand the complex relationships that control the process. In this paper, we consider different perspectives, namely optimization, complexity analysis, and algorithmic design, which allow us to describe a reasonable and applicable computational framework for detecting gene-gene interactions. Accordingly, support vector machine and combinatorial optimization techniques (local search and genetic algorithm) were tailored to fit within this framework. Although the proposed approach is computationally expensive, our results indicate this is a promising tool for the identification and characterization of high order gene-gene and gene-environment interactions. We have demonstrated several advantages of this method, including the strong power for classification, less concern for overfitting, and the ability to handle unbalanced data and achieve more stable models. We would like to make the support vector machine and combinatorial optimization techniques more accessible to genetic epidemiologists, and to promote the use and extension of these powerful approaches.
9.
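[Objective] To study the relationship between type 2 diabetes and the adiponectin gene polymorphisms SNP+45 and SNP+276. [Methods] A case-control study was conducted using univariate unconditional logistic regression analysis, comparing 590 patients with newly diagnosed diabetes (diabetes group) and 590 healthy community residents (control group), all permanent residents of Pudong New Area, Shanghai. [Results] Between the diabetes and control groups, the differences in the distributions of the three genotypes and of the alleles of adiponectin SNP+45 were not statistically significant (χ²=1.44, P>0.05; χ²=1.35, P>0.05). The differences in the distributions of the three genotypes and of the alleles of adiponectin SNP+276 were statistically significant (χ²=8.45, P<0.05; χ²=8.99, P<0.05), with the T/T genotype more frequent among diabetes patients than in the control group. Univariate analysis showed a statistically significant association between adiponectin SNP+276 and diabetes. [Conclusion] Adiponectin SNP+276 is associated with the onset of type 2 diabetes in the Han population of Pudong New Area, and variation in the adiponectin gene may play an important role in the pathogenesis of type 2 diabetes.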
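The genotype comparison reported in the abstract above can be illustrated with a Pearson chi-square test on a 3 × 2 table (three genotypes by two groups). The sketch below is not the authors' analysis: the genotype counts are hypothetical, chosen only so that each group totals 590, and the 2 df assumption follows from the table shape rather than from the abstract.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_pvalue_2df(stat):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-stat / 2.0)


# Hypothetical counts: rows are genotypes G/G, G/T, T/T;
# columns are (diabetes group, control group), 590 subjects each.
counts = [
    [300, 330],
    [230, 225],
    [60, 35],
]
stat = pearson_chi2(counts)
p_value = chi2_pvalue_2df(stat)
```

Under the same 2 df assumption, `chi2_pvalue_2df(8.45)` gives roughly 0.015, consistent with the P<0.05 reported for the SNP+276 genotype distribution.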
10.
There is a growing recognition that interactions (gene‐gene and gene‐environment) can play an important role in common disease etiology. The development of cost‐effective genotyping technologies has made genome‐wide association studies the preferred tool for searching for loci affecting disease risk. These studies are characterized by a large number of investigated SNPs, and efficient statistical methods are even more important than in classical association studies that are done with a small number of markers. In this article we propose a novel gene‐gene interaction test that is more powerful than classical methods. The increase in power is due to the fact that the proposed method incorporates reasonable constraints in the parameter space. The test for both association and interaction is based on a likelihood ratio statistic that has a χ² distribution asymptotically. We also discuss the definitions used for “no interaction” and argue that tests for pure interaction are useful in genome‐wide studies, especially when using two‐stage strategies where the analyses in the second stage are done on pairs of loci for which at least one is associated with the trait. Genet. Epidemiol. 33:386–393, 2009. © 2008 Wiley‐Liss, Inc.
11.
Identifying gene and environment interaction (G × E) can provide insights into biological networks of complex diseases, identify novel genes that act synergistically with environmental factors, and inform risk prediction. However, despite the fact that hundreds of novel disease‐associated loci have been identified from genome‐wide association studies (GWAS), few G × Es have been discovered. One reason is that most studies are underpowered for detecting these interactions. Several new methods have been proposed to improve power for G × E analysis, but performance varies with scenario. In this article, we present a module‐based approach to integrating various methods that exploits each method's most appealing aspects. There are three modules in our approach: (1) a screening module for prioritizing Single Nucleotide Polymorphisms (SNPs); (2) a multiple comparison module for testing G × E; and (3) a G × E testing module. We combine all three of these modules and develop two novel “cocktail” methods. We demonstrate that the proposed cocktail methods maintain the type I error, and that the power tracks well with the best existing methods, even though the best methods may differ across scenarios and interaction models. For GWAS, where the true interaction models are unknown, methods like our “cocktail” methods that are powerful under a wide range of situations are particularly appealing. Broadly speaking, the modular approach is conceptually straightforward and computationally simple. It builds on common test statistics and is easily implemented without additional computational efforts. It also allows for an easy incorporation of new methods as they are developed. Our work provides a comprehensive and powerful tool for devising effective strategies for genome‐wide detection of gene‐environment interactions.
12.
Statistical interactions between markers of genetic variation, or gene‐gene interactions, are believed to play an important role in the etiology of many multifactorial diseases and other complex phenotypes. Unfortunately, detecting gene‐gene interactions is extremely challenging due to the large number of potential interactions and ambiguity regarding marker coding and interaction scale. For many data sets, there is insufficient statistical power to evaluate all candidate gene‐gene interactions. In these cases, a global test for gene‐gene interactions may be the best option. Global tests have much greater power relative to multiple individual interaction tests and can be used on subsets of the markers as an initial filter prior to testing for specific interactions. In this paper, we describe a novel global test for gene‐gene interactions, the global epistasis test (GET), that is based on results from random matrix theory. As we show via simulation studies based on previously proposed models for common diseases including rheumatoid arthritis, type 2 diabetes, and breast cancer, our proposed GET method has superior performance characteristics relative to existing global gene‐gene interaction tests. A glaucoma GWAS data set is used to demonstrate the practical utility of the GET method.
13.
The nonlinear interaction effect among multiple genetic factors, i.e. epistasis, has been recognized as a key component in understanding the underlying genetic basis of complex human diseases and phenotypic traits. Due to the statistical and computational complexity, most epistasis studies are limited to interactions with an order of two. We developed ViSEN to analyze and visualize both two‐way and three‐way epistatic interactions. ViSEN not only identifies strong interactions among pairs or trios of genetic attributes, but also provides a global interaction map that shows neighborhood and clustering structures. This visualized information could be very helpful to infer the underlying genetic architecture of complex diseases and to generate plausible hypotheses for further biological validations. ViSEN is implemented in Java and freely available at https://sourceforge.net/projects/visen/.
14.
Hugues Aschard Noah Zaitlen Rulla M. Tamimi Sara Lindström Peter Kraft 《Genetic epidemiology》2013,37(4):323-333
Searching for genetic variants involved in gene‐gene and gene‐environment interactions in large‐scale data raises multiple methodological issues. Many existing methods have focused on the problem of dimensionality, trying to explore the largest number of combinations between risk factors while considering simple interaction models. Despite evidence demonstrating the efficacy of these methods in simulated data, their application in real data has been unsuccessful so far. The classical test of a linear marginal genetic effect has been widely used for agnostic genome‐wide association studies, with the underlying idea that most variants involved in interactions might display a marginal effect on the phenotypic mean. Although this approach may allow for the identification of genetic variants involved in interactions in many scenarios, the linear marginal effects of some causal alleles on the phenotypic mean might not always be detectable at the genome‐wide significance level. We introduce in this study a general association test for quantitative trait loci that compares the distributions of phenotypic values by genotypic classes, as opposed to most standard tests, which compare phenotypic means by genotypic classes. Using simulations we show that in the presence of interactions, this approach can be more powerful than the standard test of the linear marginal effect, with a gain of power increasing with increasing interaction effect and decreasing frequencies of the interacting exposures. We demonstrate the potential utility of our method on real data by analyzing mammographic density genome‐wide data from the Nurses’ Health Study.
15.
A Custom Correlation Coefficient (CCC) Approach for Fast Identification of Multi‐SNP Association Patterns in Genome‐Wide SNPs Data
Sharlee Climer Wei Yang Lisa de las Fuentes Victor G. Dávila‐Román C. Charles Gu 《Genetic epidemiology》2014,38(7):610-621
Complex diseases are often associated with sets of multiple interacting genetic factors and possibly with unique sets of the genetic factors in different groups of individuals (genetic heterogeneity). We introduce a novel concept of custom correlation coefficient (CCC) between single nucleotide polymorphisms (SNPs) that addresses genetic heterogeneity by measuring subset correlations autonomously. It is used to develop a 3‐step process to identify candidate multi‐SNP patterns: (1) pairwise (SNP–SNP) correlations are computed using CCC; (2) clusters of so‐correlated SNPs are identified; and (3) frequencies of these clusters in disease cases and controls are compared to identify disease‐associated multi‐SNP patterns. This method identified 42 candidate multi‐SNP associations with hypertensive heart disease (HHD), among which one cluster of 22 SNPs (six genes) included 13 in SLC8A1 (aka NCX1, an essential component of cardiac excitation‐contraction coupling) and another of 32 SNPs had 29 from a different segment of SLC8A1. While allele frequencies show little difference between cases and controls, the cluster of 22 associated alleles was found in 20% of controls but no cases and the other in 3% of controls but 20% of cases. These suggest that both protective and risk effects on HHD could be exerted by combinations of variants in different regions of SLC8A1, modified by variants from other genes. The results demonstrate that this new correlation metric identifies disease‐associated multi‐SNP patterns overlooked by commonly used correlation measures. Furthermore, computation time using CCC is a small fraction of that required by other methods, thereby enabling the analyses of large GWAS datasets.
16.
Julian Little Julian P. T. Higgins John P. A. Ioannidis David Moher France Gagnon Erik von Elm Muin J. Khoury Barbara Cohen George Davey‐Smith Jeremy Grimshaw Paul Scheet Marta Gwinn Robin E. Williamson Guang Yong Zou Kim Hutchings Candice Y. Johnson Valerie Tait Miriam Wiens Jean Golding Cornelia van Duijn John McLaughlin Andrew Paterson George Wells Isabel Fortier Matthew Freedman Maja Zecevic Richard King Claire Infante‐Rivard Alex Stewart Nick Birkett 《Genetic epidemiology》2009,33(7):581-598
17.
Andrew P. Morris Cecilia M. Lindgren Eleftheria Zeggini Nicholas J. Timpson Timothy M. Frayling Andrew T. Hattersley Mark I. McCarthy 《Genetic epidemiology》2010,34(4):335-343
The ultimate goal of genome‐wide association (GWA) studies is to identify genetic variants contributing effects to complex phenotypes in order to improve our understanding of the biological architecture underlying the trait. One approach to allow us to meet this challenge is to consider more refined sub‐phenotypes of disease, defined by pattern of symptoms, for example, which may be physiologically distinct, and thus may have different underlying genetic causes. The disadvantage of sub‐phenotype analysis is that large disease cohorts are sub‐divided into smaller case categories, thus reducing power to detect association. To address this issue, we have developed a novel test of association within a multinomial regression modeling framework, allowing for heterogeneity of genetic effects between sub‐phenotypes. The modeling framework is extremely flexible, and can be generalized to any number of distinct sub‐phenotypes. Simulations demonstrate the power of the multinomial regression‐based analysis over existing methods when genetic effects differ between sub‐phenotypes, with minimal loss of power when these effects are homogeneous for the unified phenotype. Application of the multinomial regression analysis to a genome‐wide association study of type 2 diabetes, with cases categorized according to body mass index, highlights previously recognized differential mechanisms underlying obese and non‐obese forms of the disease, and provides evidence of a potential novel association that warrants follow‐up in independent replication cohorts. Genet. Epidemiol. 34: 335–343, 2010. © 2009 Wiley‐Liss, Inc.
18.
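Type 2 diabetes mellitus (T2DM) is a complex metabolic disease and one of the major diseases threatening human health in the 21st century. It can cause substantial morbidity and mortality and is characterized by elevated blood glucose and dysregulated metabolism of carbohydrates, proteins, and lipids. Insulin resistance is a major factor in the pathogenesis of T2DM, and studies have shown that resistin (RETN) is positively correlated with insulin resistance. RETN is one of the factors secreted by adipocytes and plays a decisive role in T2DM, obesity-mediated inflammation, and insulin resistance. As research has progressed, polymorphisms of the resistin gene have gradually been recognized; a full understanding of resistin gene polymorphisms in T2DM will help apply them more appropriately to the diagnosis, treatment, and prevention of T2DM.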
19.
We recently proposed a bias correction approach to evaluate accurate estimation of the odds ratio (OR) of genetic variants associated with a secondary phenotype, in which the secondary phenotype is associated with the primary disease, based on the original case‐control data collected for the purpose of studying the primary disease. As reported in this communication, we further investigated the type I error probabilities and powers of the proposed approach, and compared the results to those obtained from logistic regression analysis (with or without adjustment for the primary disease status). We performed a simulation study based on a frequency‐matching case‐control study with respect to the secondary phenotype of interest. We examined the empirical distribution of the natural logarithm of the corrected OR obtained from the bias correction approach and found it to be normally distributed under the null hypothesis. On the basis of the simulation study results, we found that the logistic regression approaches that adjust or do not adjust for the primary disease status had low power for detecting secondary phenotype associated variants and highly inflated type I error probabilities, whereas our approach was more powerful for identifying the SNP‐secondary phenotype associations and had better‐controlled type I error probabilities. Genet. Epidemiol. 35:739‐743, 2011. © 2011 Wiley Periodicals, Inc.
20.
Gauderman WJ 《Genetic epidemiology》2003,25(4):327-338
With the increasing availability of genetic data, many studies of quantitative traits focus on hypotheses related to candidate genes, and also gene-environment (G x E) and gene-gene (G x G) interactions. In a population-based sample, estimates and tests of candidate gene effects can be biased by ethnic confounding, also known as population stratification bias. This paper demonstrates that even a modest degree of ethnic confounding can lead to unacceptably high type I error rates for tests of genetic effects. The parent-offspring trio design is reviewed, and several forms of the quantitative transmission disequilibrium test (QTDT) are summarized. A variation of the QTDT (QTDTM) is described that is based on a linear regression model with multiple intercepts, one per parental mating type. This and other models are expanded to allow testing of G x E and G x G interactions. A method for computing required sample sizes using direct computations is described. Sample size requirements for tests of genetic main effects and G x E and G x G interactions are compared across various QTDT approaches to infer their efficiencies relative to one another. The QTDTM is found to meet or exceed the efficiency of other QTDT approaches. For example, the QTDTM is approximately 3% more efficient than the QTDT of Rabinowitz ([1997] Hum. Hered. 47:342-350) for testing a genetic main effect, but can be as much as twice as efficient for testing G x E interaction, and three times more efficient for testing G x G interaction.
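The abstract above describes the QTDTM as a linear regression with one intercept per parental mating type. A minimal sketch of that idea, assuming a single additive genotype code for the offspring and no further covariates: fitting a separate intercept per mating type is equivalent to centering both trait and genotype within each mating type and regressing on the centered values. The function name, the mating-type labels, and the data are hypothetical, and this is not the paper's implementation.

```python
from collections import defaultdict


def mating_type_adjusted_slope(mating_types, genotypes, traits):
    """Least-squares slope of trait on offspring genotype,
    with a separate intercept for each parental mating type."""
    groups = defaultdict(list)
    for mating_type, genotype, trait in zip(mating_types, genotypes, traits):
        groups[mating_type].append((genotype, trait))

    sum_xy = 0.0
    sum_xx = 0.0
    for pairs in groups.values():
        g_mean = sum(g for g, _ in pairs) / len(pairs)
        y_mean = sum(y for _, y in pairs) / len(pairs)
        # Centering within the mating type absorbs that type's intercept.
        for g, y in pairs:
            sum_xy += (g - g_mean) * (y - y_mean)
            sum_xx += (g - g_mean) ** 2

    if sum_xx == 0.0:
        raise ValueError("no within-mating-type genotype variation")
    return sum_xy / sum_xx


# Hypothetical trios: trait = intercept(mating type) + 0.5 * genotype,
# with intercepts of 10 and 20 that differ by mating type.
mating = ["Aa x Aa", "Aa x Aa", "Aa x Aa", "Aa x aa", "Aa x aa"]
geno = [0, 1, 2, 0, 1]
trait = [10.0, 10.5, 11.0, 20.0, 20.5]
slope = mating_type_adjusted_slope(mating, geno, trait)
```

Because only within-mating-type variation enters the slope, differences in trait level between mating types, such as those induced by population stratification, do not bias the genotype effect in this sketch.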