Similar Articles
Found 20 similar articles (search time: 31 ms)
1.
For genome‐wide association studies with family‐based designs, we propose a Bayesian approach. We show that standard transmission disequilibrium test and family‐based association test statistics can naturally be implemented in a Bayesian framework, allowing flexible specification of the likelihood and prior odds. We construct a Bayes factor conditional on the offspring phenotype and parental genotype data and then use the data we conditioned on to inform the prior odds for each marker. In the construction of the prior odds, the evidence for association for each single marker is obtained at the population‐level by estimating its genetic effect size by fitting the conditional mean model. Since such genetic effect size estimates are statistically independent of the effect size estimation within the families, the actual data set can inform the construction of the prior odds without any statistical penalty. In contrast to Bayesian approaches that have recently been proposed for genome‐wide association studies, our approach does not require assumptions about the genetic effect size; this makes the proposed method entirely data‐driven. The power of the approach was assessed through simulation. We then applied the approach to a genome‐wide association scan to search for associations between single nucleotide polymorphisms and body mass index in the Childhood Asthma Management Program data. Genet. Epidemiol. 34:569–574, 2010. © 2010 Wiley‐Liss, Inc.
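The posterior-odds update described in this abstract can be sketched minimally as follows. The exponential weighting of a baseline prior odds by the population-level z-statistic is a hypothetical stand-in for illustration, not the paper's exact conditional-mean-model construction:

```python
import math

def posterior_odds(bayes_factor, prior_odds):
    """Posterior odds of association = Bayes factor x prior odds."""
    return bayes_factor * prior_odds

def prior_odds_from_population_z(z, base_odds=1e-4):
    """Hypothetical weighting: up-weight a small baseline prior odds by the
    population-level evidence. exp(z^2 / 2) is the likelihood ratio of the
    observed z-statistic against z = 0."""
    return base_odds * math.exp(0.5 * z * z)

# A marker with a within-family Bayes factor of 20 and population-level z = 2.5
po = posterior_odds(20.0, prior_odds_from_population_z(2.5))
```

Because the population-level estimate is independent of the within-family evidence, multiplying the two sources of information in this way incurs no double-counting penalty.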

2.
Two main classes of methodology have been developed for addressing the analytical intractability of generalized linear mixed models: likelihood‐based methods and Bayesian methods. Likelihood‐based methods such as the penalized quasi‐likelihood approach have been shown to produce biased estimates, especially for binary clustered data with small cluster sizes. More recent methods using adaptive Gaussian quadrature perform well but can be overwhelmed by problems with large numbers of random effects, and efficient algorithms to better handle these situations have not yet been integrated in standard statistical packages. Bayesian methods, although they have good frequentist properties when the model is correct, are known to be computationally intensive and also require specialized code, limiting their use in practice. In this article, we introduce a modification of the hybrid approach of Capanu and Begg, 2011, Biometrics 67, 371–380, as a bridge between the likelihood‐based and Bayesian approaches by employing Bayesian estimation for the variance components followed by Laplacian estimation for the regression coefficients. We investigate its performance as well as that of several likelihood‐based methods in the setting of generalized linear mixed models with binary outcomes. We apply the methods to three datasets and conduct simulations to illustrate their properties. Simulation results indicate that for moderate to large numbers of observations per random effect, adaptive Gaussian quadrature and the Laplacian approximation are very accurate, with adaptive Gaussian quadrature preferable as the number of observations per random effect increases. The hybrid approach is overall similar to the Laplace method, and it can be superior for data with very sparse random effects. Copyright © 2013 John Wiley & Sons, Ltd.
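The Laplacian estimation step for regression coefficients can be illustrated on a one-parameter logistic model: under a flat prior the posterior mode equals the MLE, found here by Newton-Raphson, and the curvature at the mode gives the approximate posterior standard deviation. The toy data are invented; this is a sketch of the Laplace idea, not the hybrid method itself:

```python
import math

# Toy data: binary outcome, single covariate, flat prior on beta,
# so the posterior mode equals the MLE (a minimal Laplace sketch)
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [0, 1, 0, 1, 1]

def score_and_info(beta):
    """Score (first derivative) and observed information (negative second
    derivative) of the Bernoulli log-likelihood at beta."""
    score, info = 0.0, 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-beta * xi))
        score += xi * (yi - p)
        info += xi * xi * p * (1.0 - p)
    return score, info

beta = 0.0
for _ in range(25):                      # Newton-Raphson to the posterior mode
    s, i = score_and_info(beta)
    beta += s / i
_, info = score_and_info(beta)
se = 1.0 / math.sqrt(info)               # Laplace (normal) posterior sd
```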

3.
Attempts to predict prognosis in cancer patients using high‐dimensional genomic data such as gene expression in tumor tissue can be made difficult by the large number of features and the potential complexity of the relationship between features and the outcome. Integrating prior biological knowledge into risk prediction with such data by grouping genomic features into pathways and networks reduces the dimensionality of the problem and could improve prediction accuracy. Additionally, such knowledge‐based models may be more biologically grounded and interpretable. Prediction could potentially be further improved by allowing for complex nonlinear pathway effects. The kernel machine framework has been proposed as an effective approach for modeling the nonlinear and interactive effects of genes in pathways for both censored and noncensored outcomes. When multiple pathways are under consideration, one may efficiently select informative pathways and aggregate their signals via multiple kernel learning (MKL), which has been proposed for prediction of noncensored outcomes. In this paper, we propose MKL methods for censored survival outcomes. We derive our approach for a general survival modeling framework with a convex objective function and illustrate its application under the Cox proportional hazards and semiparametric accelerated failure time models. Numerical studies demonstrate that the proposed MKL‐based prediction methods work well in finite sample and can potentially outperform models constructed assuming linear effects or ignoring the group knowledge. The methods are illustrated with an application to 2 cancer data sets.

4.
Likelihood‐based approaches, which naturally incorporate left censoring due to limit of detection, are commonly utilized to analyze censored multivariate normal data. However, the maximum likelihood estimator (MLE) typically underestimates variance parameters. The restricted maximum likelihood estimator (REML), which corrects the underestimation of variance parameters, cannot be easily extended to analyze censored multivariate normal data. In the light of the connection between the REML and a Bayesian approach discovered in 1974 by Dr Harville, this paper describes a Bayesian approach to censored multivariate normal data. This Bayesian approach is justified through its link to the REML via Laplace's approximation and its performance is evaluated through a simulation study. We consider the Bayesian approach as a valuable alternative because it yields less biased variance parameter estimates than the MLE, and because a solid REML is technically difficult when data are left censored. Copyright © 2009 John Wiley & Sons, Ltd.
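The left-censored likelihood contribution that this approach builds on can be sketched for the univariate case: observed values contribute the normal density, while values below the limit of detection contribute the normal CDF evaluated at the limit. A minimal sketch, not the paper's multivariate machinery:

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_loglik(mu, sigma, observed, n_below_lod, lod):
    """Log-likelihood of univariate normal data left-censored at `lod`:
    observed values contribute the log density, each censored value
    contributes log Phi((lod - mu) / sigma)."""
    ll = sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
             - (v - mu) ** 2 / (2 * sigma ** 2) for v in observed)
    ll += n_below_lod * math.log(norm_cdf((lod - mu) / sigma))
    return ll
```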

5.
In this paper, we propose nonlinear distance‐odds models investigating elevated odds around point sources of exposure, under a matched case‐control design where there are subtypes within cases. We consider models analogous to the polychotomous logit models and adjacent‐category logit models for categorical outcomes and extend them to the nonlinear distance‐odds context. We consider multiple point sources as well as covariate adjustments. We evaluate maximum likelihood, profile likelihood, iteratively reweighted least squares, and a hierarchical Bayesian approach using Markov chain Monte Carlo techniques under these distance‐odds models. We compare these methods using an extensive simulation study and show that with multiple parameters and a nonlinear model, Bayesian methods have advantages in terms of estimation stability, precision, and interpretation. We illustrate the methods by analyzing Medicaid claims data corresponding to the pediatric asthma population in Detroit, Michigan, from 2004 to 2006. Copyright © 2012 John Wiley & Sons, Ltd.

6.
To estimate causal effects of vaccine on post‐infection outcomes, Hudgens and Halloran (2006) defined a post‐infection causal vaccine efficacy estimand VEI based on the principal stratification framework. They also derived closed forms for the maximum likelihood estimators of the causal estimand under some assumptions. Extending their research, we propose a Bayesian approach to estimating the causal vaccine effects on binary post‐infection outcomes. The identifiability of the causal vaccine effect VEI is discussed under different assumptions on selection bias. The performance of the proposed Bayesian method is compared with the maximum likelihood method through simulation studies and two case studies: a clinical trial of a rotavirus vaccine candidate and a field study of pertussis vaccination. For both case studies, the Bayesian approach provided similar inference as the frequentist analysis. However, simulation studies with small sample sizes suggest that the Bayesian approach provides smaller bias and shorter confidence interval length. Copyright © 2015 John Wiley & Sons, Ltd.

7.
Many meta‐analyses combine results from only a small number of studies, a situation in which the between‐study variance is imprecisely estimated when standard methods are applied. Bayesian meta‐analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta‐analysis using data augmentation, in which we represent an informative conjugate prior for between‐study variance by pseudo data and use meta‐regression for estimation. To assist in this, we derive predictive inverse‐gamma distributions for the between‐study variance expected in future meta‐analyses. These may serve as priors for heterogeneity in new meta‐analyses. In a simulation study, we compare approximate Bayesian methods using meta‐regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta‐regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta‐analysis is described. The proposed method facilitates Bayesian meta‐analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

8.
Numerous meta‐analyses in healthcare research combine results from only a small number of studies, for which the variance representing between‐study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta‐analysis. We present two methods for implementing Bayesian meta‐analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta‐analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta‐analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log‐normal distributions for the between‐study variance, applicable to meta‐analyses of binary outcomes on the log odds‐ratio scale. The methods are applied to two example meta‐analyses, incorporating the relevant predictive distributions as prior distributions for between‐study heterogeneity. We have provided resources to facilitate Bayesian meta‐analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
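The numerical-integration route can be sketched for a toy normal-normal random-effects meta-analysis: the common effect is integrated out analytically under a flat prior, and the posterior for the between-study variance is computed on a grid, with no MCMC. The log-normal prior parameters and study data below are illustrative, not the paper's empirically derived values:

```python
import math

# Toy study-level effects y_i with within-study variances v_i
y = [0.30, 0.10, 0.45]
v = [0.04, 0.02, 0.05]

def log_prior_tau2(t2, mu=-2.0, sd=1.5):
    """Illustrative log-normal prior density for the between-study variance."""
    return (-math.log(t2 * sd * math.sqrt(2 * math.pi))
            - (math.log(t2) - mu) ** 2 / (2 * sd ** 2))

def log_marglik(t2):
    """Marginal likelihood of tau^2 with the common effect mu integrated
    out analytically under a flat prior (normal-normal conjugacy)."""
    w = [1.0 / (vi + t2) for vi in v]
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    ll = sum(-0.5 * math.log(2 * math.pi / wi) - 0.5 * wi * (yi - mu_hat) ** 2
             for wi, yi in zip(w, y))
    return ll + 0.5 * math.log(2 * math.pi / sum(w))

# Posterior for tau^2 by simple grid integration over (0, 1)
step = 0.005
grid = [i * step + step / 2 for i in range(200)]
unnorm = [math.exp(log_marglik(t2) + log_prior_tau2(t2)) for t2 in grid]
Z = sum(unnorm) * step
post = [u / Z for u in unnorm]
tau2_mean = sum(t2 * p * step for t2, p in zip(grid, post))
```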

9.
Information from historical trials is important for the design, interim monitoring, analysis, and interpretation of clinical trials. Meta‐analytic models can be used to synthesize the evidence from historical data, which are often only available in aggregate form. We consider evidence synthesis methods for trials with recurrent event endpoints, which are common in many therapeutic areas. Such endpoints are typically analyzed by negative binomial regression. However, the individual patient data necessary to fit such a model are usually unavailable for historical trials reported in the medical literature. We describe approaches for back‐calculating model parameter estimates and their standard errors from available summary statistics with various techniques, including approximate Bayesian computation. We propose to use a quadratic approximation to the log‐likelihood for each historical trial based on 2 independent terms for the log mean rate and the log of the dispersion parameter. A Bayesian hierarchical meta‐analysis model then provides the posterior predictive distribution for these parameters. Simulations show this approach with back‐calculated parameter estimates results in very similar inference as using parameter estimates from individual patient data as an input. We illustrate how to design and analyze a new randomized placebo‐controlled exacerbation trial in severe eosinophilic asthma using data from 11 historical trials.
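The per-trial quadratic log-likelihood approximation with two independent terms can be sketched directly; the back-calculated summary values below are invented for illustration:

```python
def quad_loglik(theta, est, se):
    """Quadratic (normal) approximation to one log-likelihood term, built
    from a back-calculated point estimate and its standard error."""
    return -0.5 * ((theta - est) / se) ** 2

def trial_loglik(log_rate, log_disp, trial):
    """Approximate log-likelihood for one historical trial: independent
    quadratic terms for the log mean rate and the log dispersion."""
    return (quad_loglik(log_rate, trial["log_rate"], trial["se_rate"])
            + quad_loglik(log_disp, trial["log_disp"], trial["se_disp"]))

# Hypothetical summaries back-calculated from one published trial
trial = {"log_rate": -0.7, "se_rate": 0.15, "log_disp": 0.4, "se_disp": 0.3}
```

Summing such terms across trials gives an approximate joint likelihood that a hierarchical model can use in place of unavailable individual patient data.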

10.
For bivariate meta‐analysis of diagnostic studies, likelihood approaches are very popular. However, they often run into numerical problems with possible non‐convergence. In addition, the construction of confidence intervals is controversial. Bayesian methods based on Markov chain Monte Carlo (MCMC) sampling could be used, but are often difficult to implement, and require long running times and diagnostic convergence checks. Recently, a new Bayesian deterministic inference approach for latent Gaussian models using integrated nested Laplace approximations (INLA) has been proposed. With this approach MCMC sampling becomes redundant as the posterior marginal distributions are directly and accurately approximated. By means of a real data set we investigate the influence of the prior information provided and compare the results obtained by INLA, MCMC, and the maximum likelihood procedure SAS PROC NLMIXED. Using a simulation study we further extend the comparison of INLA and SAS PROC NLMIXED by assessing their performance in terms of bias, mean‐squared error, coverage probability, and convergence rate. The results indicate that INLA is more stable and gives generally better coverage probabilities for the pooled estimates and less biased estimates of variance parameters. The user‐friendliness of INLA is demonstrated by documented R‐code. Copyright © 2010 John Wiley & Sons, Ltd.

11.
Increasingly multiple outcomes are collected in order to characterize treatment effectiveness or to evaluate the impact of large policy initiatives. Often the multiple outcomes are non‐commensurate, e.g. measured on different scales. The common approach to inference is to model each outcome separately ignoring the potential correlation among the responses. We describe and contrast several full likelihood and quasi‐likelihood multivariate methods for non‐commensurate outcomes. We present a new multivariate model to analyze binary and continuous correlated outcomes using a latent variable. We study the efficiency gains of the multivariate methods relative to the univariate approach. For complete data, all approaches yield consistent parameter estimates. When the mean structure of all outcomes depends on the same set of covariates, efficiency gains by adopting a multivariate approach are negligible. In contrast, when the mean outcomes depend on different covariate sets, large efficiency gains are realized. Three real examples illustrate the different approaches. Copyright © 2009 John Wiley & Sons, Ltd.

12.
Objective:  To give guidance in defining probability distributions for model inputs in probabilistic sensitivity analysis (PSA) from a full Bayesian perspective.
Methods:  A common approach to defining probability distributions for model inputs in PSA on the basis of input-related data is to use the likelihood of the data on an appropriate scale as the foundation for the distribution around the inputs. We will look at this approach from a Bayesian perspective, derive the implicit prior distributions in two examples (proportions and relative risks), and compare these to alternative prior distributions.
Results:  In cases where data are sparse (in which case sensitivity analysis is crucial), commonly used approaches can lead to unexpected results. We show that this is because of the prior distributions that are implicitly assumed, namely that these are not as "uninformative" or "vague" as believed. We propose priors that we believe are more sensible for two examples and which are just as easy to apply.
Conclusions:  Input probability distributions should not be based on the likelihood of the data, but on the Bayesian posterior distribution calculated from this likelihood and an explicitly stated prior distribution.
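The proportion example can be made concrete. Normalizing the binomial likelihood in p gives a Beta(r+1, n-r+1) density, i.e. an implicit uniform Beta(1,1) prior; with sparse data this differs noticeably from an explicitly stated alternative such as the Jeffreys Beta(0.5, 0.5) prior:

```python
# Proportion example: r events in n patients (sparse data, where the
# implicitly assumed prior matters most)
r, n = 0, 10

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Normalizing the binomial likelihood in p yields Beta(r+1, n-r+1),
# i.e. an implicit uniform Beta(1, 1) prior:
likelihood_based = beta_mean(r + 1, n - r + 1)

# An explicitly stated Jeffreys Beta(0.5, 0.5) prior gives instead:
jeffreys_posterior = beta_mean(r + 0.5, n - r + 0.5)
```

With zero events in ten patients, the likelihood-based distribution has mean 1/12 while the Jeffreys posterior has mean 1/22, so the choice of implicit prior nearly doubles the input's expected value.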

13.
In assessing causal mediation effects in randomized studies, a challenge is that the direct and indirect effects can vary across participants due to different measured and unmeasured characteristics. In that case, the population effect estimated from standard approaches implicitly averages over and does not estimate the heterogeneous direct and indirect effects. We propose a Bayesian semiparametric method to estimate heterogeneous direct and indirect effects via clusters, where the clusters are formed by both individual covariate profiles and individual effects due to unmeasured characteristics. These cluster‐specific direct and indirect effects can be estimated through a set of regression models where specific coefficients are clustered by a stick‐breaking prior. To let clustering be appropriately informed by individual direct and indirect effects, we specify a data‐dependent prior. We conduct simulation studies to assess performance of the proposed method compared to other methods. We use this approach to estimate heterogeneous causal direct and indirect effects of an expressive writing intervention for patients with renal cell carcinoma.
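The stick-breaking prior used to cluster coefficients can be sketched via its truncated constructive definition, with w_k = v_k * prod_{j<k}(1 - v_j) and v_k ~ Beta(1, alpha):

```python
import random

def stick_breaking_weights(n_atoms, alpha=1.0, seed=7):
    """Truncated stick-breaking construction of cluster weights:
    break off a Beta(1, alpha) fraction of the remaining stick at each
    step; the leftover mass goes to the last atom so weights sum to 1."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights

w = stick_breaking_weights(10)
```

Smaller alpha concentrates mass on the first few atoms (fewer effective clusters); larger alpha spreads it out.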

14.
The objective of this paper is to illustrate the advantages of the Bayesian approach in quantifying, presenting, and reporting scientific evidence and in assisting decision making. Three basic components in the Bayesian framework are the prior distribution, likelihood function, and posterior distribution. The prior distribution describes analysts' belief a priori, the likelihood function captures how data modify the prior knowledge; and the posterior distribution synthesizes both prior and likelihood information. The Bayesian approach treats the parameters of interest as random variables, uses the entire posterior distribution to quantify the evidence, and reports evidence in a "probabilistic" manner. Two clinical examples are used to demonstrate the value of the Bayesian approach to decision makers. Using either an uninformative or a skeptical prior distribution, these examples show that the Bayesian methods allow calculations of probabilities that are usually of more interest to decision makers, e.g., the probability that treatment A is similar to treatment B, the probability that treatment A is at least 5% better than treatment B, and the probability that treatment A is not within the "similarity region" of treatment B, etc. In addition, the Bayesian approach can deal with multiple endpoints more easily than the classic approach. For example, if decision makers wish to examine mortality and cost jointly, the Bayesian method can report the probability that a treatment achieves at least 2% mortality reduction and less than $20,000 increase in costs. In conclusion, probabilities computed from the Bayesian approach provide more relevant information to decision makers and are easier to interpret.
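Probabilities of the kind described here fall out directly from Monte Carlo draws from the posterior. A minimal sketch with uniform priors on two event proportions; the event counts and the 5% thresholds are hypothetical:

```python
import random

rng = random.Random(0)

# Hypothetical trial: 40/100 events on treatment A, 50/100 on treatment B;
# uniform Beta(1, 1) priors give Beta posteriors for each event proportion
a_events, a_n = 40, 100
b_events, b_n = 50, 100

draws = 20000
pa = [rng.betavariate(1 + a_events, 1 + a_n - a_events) for _ in range(draws)]
pb = [rng.betavariate(1 + b_events, 1 + b_n - b_events) for _ in range(draws)]

# Probabilities a decision maker actually asks about (fewer events = better):
p_a_better      = sum(x < y for x, y in zip(pa, pb)) / draws
p_a_5pct_better = sum(y - x >= 0.05 for x, y in zip(pa, pb)) / draws
p_similar       = sum(abs(x - y) < 0.05 for x, y in zip(pa, pb)) / draws
```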

15.
In systems biology, it is of great interest to identify new genes that were not previously reported to be associated with biological pathways related to various functions and diseases. Identification of these new pathway‐modulating genes not only promotes understanding of pathway regulation mechanisms but also allows identification of novel targets for therapeutics. Recently, biomedical literature has been considered as a valuable resource to investigate pathway‐modulating genes. While the majority of currently available approaches are based on the co‐occurrence of genes within an abstract, it has been reported that these approaches show only sub‐optimal performances because 70% of abstracts contain information only for a single gene. To overcome this limitation, we propose a novel statistical framework based on the concept of ontology fingerprint that uses gene ontology to extract information from large biomedical literature data. The proposed framework simultaneously identifies pathway‐modulating genes and facilitates interpreting functions of these new genes. We also propose a computationally efficient posterior inference procedure based on Metropolis–Hastings within Gibbs sampler for parameter updates and the poor man's reversible jump Markov chain Monte Carlo approach for model selection. We evaluate the proposed statistical framework with simulation studies, experimental validation, and an application to studies of pathway‐modulating genes in yeast. The R implementation of the proposed model is currently available at https://dongjunchung.github.io/bayesGO/. Copyright © 2017 John Wiley & Sons, Ltd.

16.
We consider the inference problem of estimating covariate and genetic effects in a family-based case-control study where families are ascertained on the basis of the number of cases within the family. However, our interest lies not only in estimating the fixed covariate effects but also in estimating the random effects parameters that account for varying correlations among family members. These random effects parameters, though weakly identifiable in a strict theoretical sense, are often hard to estimate due to the small number of observations per family. A hierarchical Bayesian paradigm is a very natural route in this context with multiple advantages compared with a classical mixed effects estimation strategy based on the integrated likelihood. We propose a fully flexible Bayesian approach allowing nonparametric modeling of the random effects distribution using a Dirichlet process prior and provide estimation of both fixed effect and random effects parameters using a Markov chain Monte Carlo numerical integration scheme. The nonparametric Bayesian approach not only provides inference that is less sensitive to parametric specification of the random effects distribution but also allows possible uncertainty around a specific genetic correlation structure. The Bayesian approach has certain computational advantages over its mixed-model counterparts. Data from the Prostate Cancer Genetics Project, a family-based study at the University of Michigan Comprehensive Cancer Center including families having one or more members with prostate cancer, are used to illustrate the proposed methods. A small-scale simulation study is carried out to compare the proposed nonparametric Bayes methodology with a parametric Bayesian alternative.

17.
For analyzing complex trait association with sequencing data, most current studies test aggregated effects of variants in a gene or genomic region. Although gene‐based tests have insufficient power even for moderately sized samples, pathway‐based analyses combine information across multiple genes in biological pathways and may offer additional insight. However, most existing pathway association methods are originally designed for genome‐wide association studies, and are not comprehensively evaluated for sequencing data. Moreover, region‐based rare variant association methods, although potentially applicable to pathway‐based analysis by extending their region definition to gene sets, have never been rigorously tested. In the context of exome‐based studies, we use simulated and real datasets to evaluate pathway‐based association tests. Our simulation strategy adopts a genome‐wide genetic model that distributes total genetic effects hierarchically into pathways, genes, and individual variants, allowing the evaluation of pathway‐based methods with realistic quantifiable assumptions on the underlying genetic architectures. The results show that, although no single pathway‐based association method offers superior performance in all simulated scenarios, a modification of Gene Set Enrichment Analysis approach using statistics from single‐marker tests without gene‐level collapsing (weighted Kolmogorov‐Smirnov [WKS]‐Variant method) is consistently powerful. Interestingly, directly applying rare variant association tests (e.g., sequence kernel association test) to pathway analysis offers a similar power, but its results are sensitive to assumptions of genetic architecture. We applied pathway association analysis to exome‐sequencing data from a chronic obstructive pulmonary disease study and found that the WKS‐Variant method confirms previously published associated genes.

18.
O'Rourke K, Altman DG. Statistics in Medicine 2005; 24(17): 2733–2742; author reply 2743.
In a recent Statistics in Medicine paper, Warn, Thompson and Spiegelhalter (WTS) made a comparison between the Bayesian approach to the meta-analysis of binary outcomes and a popular Classical approach that uses summary (two-stage) techniques. They included approximate summary (two-stage) Bayesian techniques in their comparisons, undoubtedly in an attempt to make the comparison less unfair. But, as this letter will argue, there are techniques from the Classical approach that are closer (those based directly on the likelihood), and they failed to make comparisons with these. Here the differences between Bayesian and Classical approaches in meta-analysis applications reside solely in how the likelihood functions are converted into either credibility intervals or confidence intervals. Both summarize, contrast and combine data using likelihood functions. Conflating what Bayes actually offers to meta-analysts (a means of converting likelihood functions to credibility intervals) with the use of likelihood functions themselves to summarize, contrast and combine studies is at best misleading.

19.
It is increasingly recognized that pathway analyses—a joint test of association between the outcome and a group of single nucleotide polymorphisms (SNPs) within a biological pathway—could potentially complement single‐SNP analysis and provide additional insights for the genetic architecture of complex diseases. Building upon existing P‐value combining methods, we propose a class of highly flexible pathway analysis approaches based on an adaptive rank truncated product statistic that can effectively combine evidence of associations over different SNPs and genes within a pathway. The statistical significance of the pathway‐level test statistics is evaluated using a highly efficient permutation algorithm that remains computationally feasible irrespective of the size of the pathway and complexity of the underlying test statistics for summarizing SNP‐ and gene‐level associations. We demonstrate through simulation studies that a gene‐based analysis that treats the underlying genes, as opposed to the underlying SNPs, as the basic units for hypothesis testing, is a very robust and powerful approach to pathway‐based association testing. We also illustrate the advantage of the proposed methods using a study of the association between the nicotinic receptor pathway and cigarette smoking behaviors. Genet. Epidemiol. 33:700–709, 2009. Published 2009 Wiley‐Liss, Inc.
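A fixed-truncation version of the rank truncated product statistic with a permutation p-value can be sketched as follows. The full adaptive procedure additionally minimizes over truncation points, and the paper permutes phenotypes (preserving linkage disequilibrium) rather than drawing fresh uniform p-values as done here for simplicity:

```python
import math
import random

def rtp_stat(pvals, k):
    """Rank truncated product statistic on the log scale:
    the sum of the logs of the k smallest p-values."""
    return sum(math.log(p) for p in sorted(pvals)[:k])

def rtp_perm_pvalue(pvals, k=5, n_perm=5000, seed=1):
    """Permutation p-value for the RTP statistic, calibrating against
    independent uniform null p-values (a simplifying assumption)."""
    rng = random.Random(seed)
    obs = rtp_stat(pvals, k)
    n = len(pvals)
    hits = sum(rtp_stat([rng.random() for _ in range(n)], k) <= obs
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Ten SNP-level p-values, a few of them small
p = rtp_perm_pvalue([0.001, 0.004, 0.02, 0.3, 0.5,
                     0.6, 0.7, 0.8, 0.9, 0.95])
```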

20.
There has been extensive literature on modeling gene‐gene interaction (GGI) and gene‐environment interaction (GEI) in case‐control studies with limited literature on statistical methods for GGI and GEI in longitudinal cohort studies. We borrow ideas from the classical two‐way analysis of variance literature to address the issue of robust modeling of interactions in repeated‐measures studies. While classical interaction models proposed by Tukey and Mandel have interaction structures as a function of main effects, a newer class of models, additive main effects and multiplicative interaction (AMMI) models, do not have similar restrictive assumptions on the interaction structure. AMMI entails a singular value decomposition of the cell residual matrix after fitting the additive main effects and has been shown to perform well across various interaction structures. We consider these models for testing GGI and GEI from two perspectives: likelihood ratio test based on cell means and a regression‐based approach using individual observations. Simulation results indicate that both approaches for AMMI models lead to valid tests in terms of maintaining the type I error rate, with the regression approach having better power properties. The performance of these models was evaluated across different interaction structures and 12 common epistasis patterns. In summary, AMMI model is robust with respect to misspecified interaction structure and is a useful screening tool for interaction even in the absence of main effects. We use the proposed methods to examine the interplay between the hemochromatosis gene and cumulative lead exposure on pulse pressure in the Normative Aging Study.
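The core AMMI step, an SVD of the cell residual matrix after the additive main-effects fit, can be sketched with NumPy; the 3x4 cell-mean table is invented for illustration:

```python
import numpy as np

# Toy 3x4 cell-mean table (e.g. genotype x exposure categories)
cell_means = np.array([[1.0, 1.2, 1.1, 1.3],
                       [2.0, 2.1, 2.4, 2.2],
                       [3.0, 3.5, 3.1, 3.6]])

# Additive fit: grand mean plus row and column main effects
grand = cell_means.mean()
row_eff = cell_means.mean(axis=1, keepdims=True) - grand
col_eff = cell_means.mean(axis=0, keepdims=True) - grand
residual = cell_means - grand - row_eff - col_eff   # double-centered

# AMMI: SVD of the cell residual matrix; the leading singular value
# carries the dominant multiplicative interaction component
u, s, vt = np.linalg.svd(residual)
interaction_share = s[0] ** 2 / np.sum(s ** 2)
```

Double-centering guarantees the residual's rows and columns each sum to zero, so all remaining structure is interaction, which the SVD then ranks by magnitude.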

