Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Multiple linear regression is commonly used to test for association between genetic variants and continuous traits and to estimate genetic effect sizes. Confounding variables are controlled for by including them as additional covariates. An alternative technique that is increasingly used is to regress out covariates from the raw trait and then perform regression analysis with only the genetic variants included as predictors. In the case of single-variant analysis, this adjusted trait regression (ATR) technique is known to be less powerful than the traditional technique when the genetic variant is correlated with the covariates. We extend previous results for single-variant tests by deriving exact relationships between the single-variant score, Wald, likelihood-ratio, and F test statistics and their ATR analogs. We also derive the asymptotic power of ATR analogs of the multiple-variant score and burden tests. We show that the maximum power loss of the ATR analog of the multiple-variant score test is completely characterized by the canonical correlations between the set of genetic variants and the set of covariates. Further, we show that for both single- and multiple-variant tests, the power loss for ATR analogs increases with increasing stringency of Type 1 error control and increasing correlation (or canonical correlations) between the genetic variant (or multiple variants) and covariates. We recommend using ATR only when the maximum canonical correlation between variants and covariates is low, as is typically true.
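The attenuation described in this abstract can be illustrated with a small numerical sketch. It is not the paper's derivation (which concerns test statistics and power); it only checks a standard least-squares identity for the point estimate with a single covariate: the ATR slope equals the full-model slope multiplied by (1 − R²), where R² comes from regressing the variant on the covariate. The simulated data, effect sizes, and helper names below are all assumptions made for illustration.

```python
import random

def center(values):
    """Subtract the sample mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))

def residualize(values, covariate):
    """Residuals from a simple regression of values on covariate (with intercept)."""
    vc, zc = center(values), center(covariate)
    slope = dot(vc, zc) / dot(zc, zc)
    return [v - slope * z for v, z in zip(vc, zc)]

# Hypothetical data: a predictor x correlated with a covariate z.
rng = random.Random(1)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.6 * zi + rng.gauss(0, 1) for zi in z]
y = [0.3 * xi + 0.5 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

x_c = center(x)
x_res = residualize(x, z)   # x with the covariate regressed out
y_res = residualize(y, z)   # adjusted trait

# Full model (y on x and z): by the Frisch-Waugh-Lovell theorem, the slope on x
# equals the slope from regressing residualized y on residualized x.
beta_full = dot(x_res, y_res) / dot(x_res, x_res)

# ATR: adjusted trait regressed on the raw (unadjusted) predictor.
beta_atr = dot(x_c, y_res) / dot(x_c, x_c)

# R^2 from regressing x on z.
r2_xz = 1.0 - dot(x_res, x_res) / dot(x_c, x_c)

print(beta_full, beta_atr, beta_full * (1.0 - r2_xz))
```

The last two printed values agree up to floating-point error, so the shrinkage of the ATR estimate grows with the correlation between the predictor and the covariate.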

2.
In agreement studies, when objects are rated independently by two raters (or twice by the same rater), an association between their ratings on two categories arises, reflecting the distinguishability of these two categories for these raters. When ratings are performed on an ordinal scale, this association between ratings on two categories increases when the distance between these categories increases on the ordinal scale. Goodman's log-linear models derived for the analysis of agreement between two raters on an ordinal scale assume that distinguishabilities between adjacent categories are either constant or fixed a priori. Log-non-linear models that allow variations of the distinguishabilities between adjacent categories along the scale may lead to difficulties in parameter estimation. This paper describes a new class of log-linear non-uniform association models. These models extend the log-linear uniform association model by allowing variations of distinguishability between adjacent categories (along the scale). These new models are used to analyse ordinal agreement between dermatologists when assessing the severity of different cutaneous signs of ageing on women's faces.

3.
The ultimate goal of genome‐wide association (GWA) studies is to identify genetic variants contributing effects to complex phenotypes in order to improve our understanding of the biological architecture underlying the trait. One approach to allow us to meet this challenge is to consider more refined sub‐phenotypes of disease, defined by pattern of symptoms, for example, which may be physiologically distinct, and thus may have different underlying genetic causes. The disadvantage of sub‐phenotype analysis is that large disease cohorts are sub‐divided into smaller case categories, thus reducing power to detect association. To address this issue, we have developed a novel test of association within a multinomial regression modeling framework, allowing for heterogeneity of genetic effects between sub‐phenotypes. The modeling framework is extremely flexible, and can be generalized to any number of distinct sub‐phenotypes. Simulations demonstrate the power of the multinomial regression‐based analysis over existing methods when genetic effects differ between sub‐phenotypes, with minimal loss of power when these effects are homogeneous for the unified phenotype. Application of the multinomial regression analysis to a genome‐wide association study of type 2 diabetes, with cases categorized according to body mass index, highlights previously recognized differential mechanisms underlying obese and non‐obese forms of the disease, and provides evidence of a potential novel association that warrants follow‐up in independent replication cohorts. Genet. Epidemiol. 34: 335–343, 2010. © 2009 Wiley‐Liss, Inc.

4.
Multivariate phenotypes are frequently encountered in genome‐wide association studies (GWAS). Such phenotypes contain more information than univariate phenotypes, but how best to exploit that information to increase the chance of detecting genetic variants with pleiotropic effects is not always clear. Moreover, when multivariate phenotypes contain a mixture of quantitative and qualitative measures, few methods are applicable. In this paper, we first evaluated the approach originally proposed by O'Brien and by Wei and Johnson that combines the univariate test statistics, and then we proposed two extensions to that approach. The original and proposed approaches are applicable to a multivariate phenotype containing any type of components, including continuous, categorical and survival phenotypes, and to samples consisting of families or unrelated individuals. Simulation results suggested that all methods had valid type I error rates. Our extensions had better power than O'Brien's method with heterogeneous means among univariate test statistics, but were less powerful than O'Brien's with homogeneous means among individual test statistics. In some cases, all approaches showed a considerable increase in power compared to testing each component of a multivariate phenotype individually. We apply all the methods to GWAS of serum uric acid levels and gout with 550,000 single nucleotide polymorphisms in the Framingham Heart Study. Genet. Epidemiol. 34:444–454, 2010. © 2010 Wiley‐Liss, Inc.

5.
Populations of non-European ancestry are substantially underrepresented in genome-wide association studies (GWAS). As genetic effects can differ between ancestries due to possibly different causal variants or linkage disequilibrium patterns, a meta-analysis that includes GWAS of all populations yields biased estimation in each of the populations, and the bias disproportionately impacts non-European ancestry populations. This is because meta-analysis combines study-specific estimates with inverse variance as the weights, which causes bias towards the studies with the largest sample sizes, typically those of European ancestry. In this paper, we propose two empirical Bayes (EB) estimators to borrow the strength of information across populations while accounting for between-population heterogeneity. Extensive simulation studies show that the proposed EB estimators are largely unbiased and improve efficiency compared to the population-specific estimator. In contrast, even though the meta-analysis estimator has a much smaller variance, it yields significant bias when the genetic effect is heterogeneous across populations. We apply the proposed EB estimators to a large-scale trans-ancestry GWAS of stroke and demonstrate that the EB estimators reduce the variance of the population-specific estimator substantially, with the effect estimates close to the population-specific estimates.
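The pull of inverse-variance weighting towards the largest study can be seen in a short sketch of a standard fixed-effect meta-analysis. This is the conventional estimator the abstract criticizes, not the proposed EB estimators, and the two effect estimates and standard errors are hypothetical numbers chosen only for illustration.

```python
import math

def ivw_meta(estimates, std_errors):
    """Fixed-effect inverse-variance-weighted pooled estimate.

    Returns the pooled estimate, its standard error, and each study's
    share of the total weight.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total), [w / total for w in weights]

# Hypothetical inputs: a large study (small standard error) with effect 0.10
# and a smaller study (larger standard error) with effect 0.30.
pooled, pooled_se, shares = ivw_meta([0.10, 0.30], [0.01, 0.05])

print(round(pooled, 4), round(pooled_se, 4), [round(s, 3) for s in shares])
```

With these inputs the large study carries about 96% of the weight, so the pooled estimate (about 0.108) sits close to 0.10 and far from the smaller study's 0.30, which is the bias mechanism described above when true effects differ between populations.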

6.
Researchers often encounter longitudinal health data characterized by three or more ordinal or nominal categories. Random‐effects multinomial logit models are generally applied to account for the potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed to predict outcome probabilities correctly. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates’ regression coefficients on the multinomial logit function, which are not substantively meaningful, to the conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed‐effects approach. The results demonstrate that neglecting to retransform random errors in the random‐effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.

7.
With the rapid development of modern genotyping technology, it is becoming commonplace to genotype densely spaced genetic markers such as single nucleotide polymorphisms (SNPs) along the genome. This development has inspired a strong interest in using multiple markers located in the target region for the detection of association. We introduce a principal components (PCs) regression method for candidate gene association studies where multiple SNPs from the candidate region tend to be correlated. In this approach, the total variance in the original genotype scores is decomposed into parts that correspond to uncorrelated PCs. The PCs with the largest variances are then used as regressors in a multiple regression. Simulation studies suggest that this approach can have higher power than some popular methods. An application to CHI3L2 gene expression data confirms a significant association between CHI3L2 gene expression level and SNPs from this gene that has been previously reported by others.

8.
This paper proposes a risk prediction model using semi‐varying coefficient multinomial logistic regression. We use a penalized local likelihood method to perform model selection and to estimate both functional and constant coefficients in the selected model. The model can be used to improve predictive modelling when non‐linear interactions between predictors are present. We conduct a simulation study to assess our method's performance, and the results show that the model selection procedure works well, with small average numbers of wrong or missed selections. We illustrate the use of our method by applying it to classify patients with early rheumatoid arthritis at baseline into different risk groups for future disease progression. We use a leave‐one‐out cross‐validation method to assess its correct prediction rate and propose a recalibration framework to evaluate how reliable the predicted risks are. Copyright © 2016 John Wiley & Sons, Ltd.

9.
The case‐control study is a common design for assessing the association between genetic exposures and a disease phenotype. Though association with a given (case‐control) phenotype is always of primary interest, there is often considerable interest in assessing relationships between genetic exposures and other (secondary) phenotypes. However, the case‐control sample represents a biased sample from the general population. As a result, if this sampling framework is not correctly taken into account, analyses estimating the effect of exposures on secondary phenotypes can be biased, leading to incorrect inference. In this paper, we address this problem and propose a general approach for estimating and testing the population effect of a genetic variant on a secondary phenotype. Our approach is based on inverse probability weighted estimating equations, where the weights depend on genotype and the secondary phenotype. We show that, though slightly less efficient than a full likelihood‐based analysis when the likelihood is correctly specified, it is substantially more robust to model misspecification, and can out‐perform likelihood‐based analysis, both in terms of validity and power, when the model is misspecified. We illustrate our approach with an application to a case‐control study extracted from the Framingham Heart Study. Copyright © 2016 John Wiley & Sons, Ltd.

10.
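[Summary] This study explores a gene-based kernel logistic regression model and its application in genome-wide association studies. Using simulated genome-wide association study data as an example, it introduces the analytic strategy by which the kernel logistic regression model detects association between genetic variants and complex diseases at the gene level. The simulation results show that, among the test results for all known genes, the gene containing the causal locus had the smallest hypothesis-test P value. The results suggest that the gene-based kernel logistic regression model can fully extract and integrate information from multiple genetic variant sites within a gene and reduce the degrees of freedom of the statistical test, while also controlling for multiple covariates and interactions, and that it has a certain power to detect association between causal genes and disease.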

11.
Lipid levels in blood are widely used to diagnose and monitor chronic diseases. It is essential to identify the genetic traits involved in lipid metabolism for understanding chronic diseases. However, the influence of genetic traits varies depending on race, sex, age, and ethnicity. Therefore, research focusing on populations of individual countries is required, and the results can be used as a basis for comparison of results of other studies at the cross-racial and cross-country levels. In the present study, we selected lipid-related variants and evaluated their effects on lipid-related diseases in more than 14,000 subjects of three cohorts using the Illumina Human Exome Beadchip. A genome-wide association study was conducted using EPACTs after adjusting for age, sex, and recruitment area. A genome-wide significance cutoff was defined as p < 5E−08 in all the three cohorts. Sixteen variants represented the lipid traits and were classified as vulnerable to borderline hypertriglyceridemia, hyper-LDL-cholesterolemia, or hypo-HDL-cholesterolemia. Moreover, we compared the genetic effects of the 16 variants between ethnic groups and identified the missense variants in apolipoprotein A-V, cholesterol ester transfer protein, and apolipoprotein E as Asian-specific. Our study provides candidate genes as markers for chronic diseases through the evaluation of genetic effects.

12.
Many genetic epidemiological studies collect repeated measurements over time. This design not only provides a more accurate assessment of disease condition, but also allows us to explore the genetic influence on disease development and progression. Thus, it is of great interest to study the longitudinal contribution of genes to disease susceptibility. Most association testing methods for longitudinal phenotypes are developed for single variants, and may have limited power to detect association, especially for variants with low minor allele frequency. We propose the Longitudinal SNP‐set/sequence kernel association test (LSKAT), a robust, mixed‐effects method for association testing of rare and common variants with longitudinal quantitative phenotypes. LSKAT uses several random effects to account for the within‐subject correlation in longitudinal data, and allows for adjustment for both static and time‐varying covariates. We also present a longitudinal trait burden test (LBT), where we test association between the trait and the burden score in linear mixed models. In simulation studies, we demonstrate that LBT achieves high power when variants are almost all deleterious or all protective, while LSKAT performs well in a wide range of genetic models. By making full use of trait values from repeated measures, LSKAT is more powerful than several tests applied to a single measurement or average over all time points. Moreover, LSKAT is robust to misspecification of the covariance structure. We apply the LSKAT and LBT methods to detect association with longitudinally measured body mass index in the Framingham Heart Study, where we are able to replicate association with a circadian gene NR1D2.

13.
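This study explores a gene-based principal component logistic regression method and its application in genome-wide association studies. Using simulated genotype data from a genome-wide association study as an example, it introduces the analytic strategy by which principal component-based logistic regression detects association between genetic variants and complex diseases at the gene level. The simulation results show that the gene containing the causal locus had the smallest hypothesis-test P value among the test results for all genes. The findings suggest that, in genome-wide association studies, the gene-based principal component logistic regression model can both reduce the degrees of freedom of the test and handle correlation among single nucleotide polymorphisms, and that it has a certain power to detect association between causal genes and disease.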

14.
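The "Healthy China 2030" Planning Outline emphasizes actively promoting proactive health to meet the challenge of population aging, placing higher demands on anti-aging prevention and control strategies at both the individual and population levels. Aging is jointly influenced by genetic factors and environmental exposures. Over the past 20 years, the development of high-throughput genetic testing technologies and related algorithms, together with high-quality, large-scale population genomic studies, has greatly advanced research on the genetics of aging. This article provides an overview of aging-related genome-wide association studies.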

15.
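In recent years, meta-analysis of genetic association studies has attracted growing attention from researchers. The traditional practice when designing such a meta-analysis is to compute results under every genetic model, which not only increases the probability of false-positive results but also makes the meta-analysis results difficult to analyse further. An important step in designing a meta-analysis of genetic association studies is therefore choosing an appropriate genetic model of inheritance. This article introduces the principles of the Bayesian genetic model-free approach, with the aim of helping readers apply it when designing meta-analyses of genetic association studies.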

16.
Background: Characterizing the experience and impact of the COVID-19 pandemic among various populations remains challenging due to the limitations inherent in common data sources, such as electronic health records (EHRs) or cross-sectional surveys.
Objective: This study aims to describe testing behaviors, symptoms, impact, vaccination status, and case ascertainment during the COVID-19 pandemic using integrated data sources.
Methods: In summer 2020 and 2021, we surveyed participants enrolled in the Biobank at the Colorado Center for Personalized Medicine (CCPM; N=180,599) about their experience with COVID-19. The prevalence of testing, symptoms, and impacts of COVID-19 on employment, family life, and physical and mental health were calculated overall and by demographic categories. Survey respondents who reported receiving a positive COVID-19 test result were considered a “confirmed case” of COVID-19. Using EHRs, we compared COVID-19 case ascertainment and characteristics in EHRs versus the survey. Positive cases were identified in EHRs using the International Statistical Classification of Diseases, 10th revision (ICD-10) diagnosis codes, health care encounter types, and encounter primary diagnoses.
Results: Of the 25,063 (13.9%) survey respondents, 10,661 (42.5%) had been tested for COVID-19, and of those, 1366 (12.8%) tested positive. Nearly half of those tested had symptoms or had been exposed to someone who was infected. Young adults (18-29 years) and Hispanics were more likely to have positive tests compared to older adults and persons of other racial/ethnic groups. Mental health (n=13,688, 54.6%) and family life (n=12,233, 48.8%) were most negatively affected by the pandemic and more so among younger groups and women; negative impacts on employment were more commonly reported among Black respondents. Of the 10,249 individuals who responded to vaccination questions from version 2 of the survey (summer 2021), 9770 (95.3%) had received the vaccine. After integration with EHR data up to the time of the survey completion, 1006 (4%) of the survey respondents had a discordant COVID-19 case status between EHRs and the survey. Using all longitudinal EHR and survey data, we identified 11,472 (6.4%) COVID-19-positive cases among Biobank participants. In comparison to COVID-19 cases identified through the survey, EHR-identified cases were younger and more likely to be Hispanic.
Conclusions: We found that the COVID-19 pandemic has had far-reaching and varying effects among our Biobank participants. Integrated data assets, such as the Biobank at the CCPM, are key resources for population health monitoring in response to public health emergencies, such as the COVID-19 pandemic.

17.
18.
Vitamin D has been intensively studied for its association with human health, but the scope of such association and the causal role of vitamin D remain controversial. We aim to comprehensively investigate the links between vitamin D and human health through both epidemiological and Mendelian randomization (MR) analyses. We examined the epidemiological associations between serum 25‐hydroxyvitamin D (25(OH)D) concentration and 90 diseases/traits in 326,409 UK Biobank (UKBB) Europeans. The causal relations between 25(OH)D and 106 diseases/traits were investigated by performing MR analysis using genome‐wide significant 25(OH)D‐associated variants (N = 143) from the largest UKBB GWAS to date. In the epidemiological analysis, we found that 25(OH)D was associated with 45 diseases/traits across cardiovascular/metabolic diseases, psychiatric/neurological diseases, autoimmune/inflammatory diseases, cancer, musculoskeletal diseases, and quantitative traits. In the MR analysis, we presented evidence suggesting a potential causal role of 25(OH)D in increasing height (β = 0.064, 95% confidence interval [CI] = 0.019–0.11) and reducing the risk of ovarian cancer (odds ratio [OR] = 0.96, 95% CI = 0.93–0.99), multiple sclerosis (OR = 0.96, 95% CI = 0.94–0.98), leg fracture (OR = 0.60, 95% CI = 0.45–0.80) and femur fracture (OR = 0.53, 95% CI = 0.32–0.84). These findings confirmed associations of vitamin D with a broad spectrum of diseases/traits and supported the potential causal role of vitamin D in promoting health.

19.
A longitudinal data set from the Finnish Otitis Media (FinOM) Studies, reporting carriage or non-carriage of Streptococcus pneumoniae at 2, 3, 4, 5, 6, 9, 12, 15 and 18 months of age for 329 children living in Tampere, Finland, is analysed. A logistic regression model on five time-varying explanatory variables is fitted. The temporal association between presence at different ages is measured by dependence ratios, and the structure of these is shown to be well described by a model indicating that roughly 10 per cent of the children are not susceptible to the bacteria, while for those that are susceptible, carriage status at a future observation age is conditionally independent of past observed statuses, given the present status. The dependence ratios between carriage at adjacent observation ages decay exponentially with age. Maximum likelihood estimates are obtained for the parameters of the full model, which is the combination of the marginal logistic regression and the association models. The parameter estimates of the full model, strengthened by non-testable Markov assumptions, are used for assessing the median duration of carriage and the acquisition rate as functions of age.

20.
Hypertension and hypertensive heart disease (HHD) are inter-related phenotypes frequently observed with other comorbidities such as diabetes, obesity, and dyslipidemia, which probably reflect the complex gene-gene and/or gene-environment interactions resulting in HHD. The complexity of HHD led us to examine intermediate phenotypes (e.g., echocardiographically-derived measures) for simpler clues to the genetic underpinnings of the disease. We applied the method of independent component analysis to a prospective study of the metabolic predictors of left ventricular hypertrophy and extracted latent traits of HHD from panels of multi-dimensional anthropometric, hemodynamic, echocardiographic and metabolic data. Based on the latent trait values, classification of subjects into different risk groups for HHD captured meaningful subtypes of the disease as reflected in the distributions of primary clinical indicators. Furthermore, we detected genetic associations of the latent HHD traits with single nucleotide polymorphisms in three candidate genes in the peroxisome proliferator-activated receptors complex, for which no significant association was found with the original clinical indicators of HHD. Consensus analysis of the results from repeated independent component analysis runs showed satisfactory robustness and estimated about 3-4 separate unseen sources for the observed HHD-related outcomes.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号