Similar Articles
1.
A variable is ‘systematically missing’ if it is missing for all individuals within particular studies in an individual participant data meta‐analysis. When a systematically missing variable is a potential confounder in observational epidemiology, standard methods either fail to adjust the exposure–disease association for the potential confounder or exclude studies where it is missing. We propose a new approach to adjust for systematically missing confounders based on multiple imputation by chained equations. Systematically missing data are imputed via multilevel regression models that allow for heterogeneity between studies. A simulation study compares various choices of imputation model. An illustration is given using data from eight studies estimating the association between carotid intima media thickness and subsequent risk of cardiovascular events. Results are compared with standard methods and also with an extension of a published method that exploits the relationship between fully adjusted and partially adjusted estimated effects through a multivariate random effects meta‐analysis model. We conclude that multiple imputation provides a practicable approach that can handle arbitrary patterns of systematic missingness. Bias is reduced by including sufficient between‐study random effects in the imputation model. Copyright © 2013 John Wiley & Sons, Ltd.
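A minimal sketch of the imputation step described above, in Python with simulated data: a confounder that is wholly missing in some studies is imputed from a random-intercept model fitted to the studies that observed it, with a fresh random intercept drawn for studies that contribute no information on it. The variable names (x, z, y) and the use of statsmodels' MixedLM are illustrative assumptions; a full chained-equations implementation would also draw the imputation-model parameters, cycle over all incomplete variables and repeat to produce several imputed data sets.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Simulated IPD from 6 studies; the confounder z is systematically missing in studies 4-5.
frames = []
for s in range(6):
    n, a_s = 200, rng.normal(0, 0.5)                # study size and study-level shift in z
    x = rng.normal(size=n)                          # exposure
    z = 0.5 * x + a_s + rng.normal(size=n)          # confounder
    y = 0.3 * x + 0.4 * z + rng.normal(size=n)      # outcome
    frames.append(pd.DataFrame({"study": s, "x": x, "z": z, "y": y}))
ipd = pd.concat(frames, ignore_index=True)
ipd.loc[ipd["study"] >= 4, "z"] = np.nan            # systematically missing

# Imputation model for z: random study intercept, fitted on the studies that observed z.
obs = ipd.dropna(subset=["z"])
fit = smf.mixedlm("z ~ x + y", obs, groups=obs["study"]).fit()
tau2 = fit.cov_re.iloc[0, 0]                        # between-study variance of the intercept
sigma2 = fit.scale                                  # residual variance

# One imputation draw: fixed-effect prediction, plus a fresh random intercept for each
# study with z missing (these studies contribute no data on z, so draw from N(0, tau2)),
# plus residual noise.
miss = ipd["z"].isna()
pred = fit.predict(ipd.loc[miss])                   # fixed-effects part only
new_re = {s: rng.normal(0, np.sqrt(tau2)) for s in ipd.loc[miss, "study"].unique()}
ipd.loc[miss, "z"] = (pred
                      + ipd.loc[miss, "study"].map(new_re)
                      + rng.normal(0, np.sqrt(sigma2), miss.sum()))
```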

2.
Individual participant data meta‐analyses (IPD‐MA) are increasingly used for developing and validating multivariable (diagnostic or prognostic) risk prediction models. Unfortunately, some predictors or even outcomes may not have been measured in each study and are thus systematically missing in some individual studies of the IPD‐MA. As a consequence, it is no longer possible to evaluate between‐study heterogeneity and to estimate study‐specific predictor effects, or to include all individual studies, which severely hampers the development and validation of prediction models. Here, we describe a novel approach for imputing systematically missing data and adopt a generalized linear mixed model to allow for between‐study heterogeneity. This approach can be viewed as an extension of Resche‐Rigon's method (Stat Med 2013), relaxing their assumptions regarding variance components and allowing imputation of linear and nonlinear predictors. We illustrate our approach using a case study with IPD‐MA of 13 studies to develop and validate a diagnostic prediction model for the presence of deep venous thrombosis. We compare the results after applying four methods for dealing with systematically missing predictors in one or more individual studies: complete case analysis where studies with systematically missing predictors are removed, traditional multiple imputation ignoring heterogeneity across studies, stratified multiple imputation accounting for heterogeneity in predictor prevalence, and multilevel multiple imputation (MLMI) fully accounting for between‐study heterogeneity. We conclude that MLMI may substantially improve the estimation of between‐study heterogeneity parameters and allow for imputation of systematically missing predictors in IPD‐MA aimed at the development and validation of prediction models. Copyright © 2015 John Wiley & Sons, Ltd.

3.
Multiple imputation is a strategy for the analysis of incomplete data such that the impact of the missingness on the power and bias of estimates is mitigated. When data from multiple studies are collated, we can propose both within‐study and multilevel imputation models to impute missing data on covariates. It is not clear how to choose between imputation models or how to combine imputation and inverse‐variance weighted meta‐analysis methods. This is especially important as often different studies measure data on different variables, meaning that we may need to impute data on a variable which is systematically missing in a particular study. In this paper, we consider a simulation analysis of sporadically missing data in a single covariate with a linear analysis model and discuss how the results would be applicable to the case of systematically missing data. We find in this context that ensuring the congeniality of the imputation and analysis models is important to give correct standard errors and confidence intervals. For example, if the analysis model allows between‐study heterogeneity of a parameter, then we should incorporate this heterogeneity into the imputation model to maintain the congeniality of the two models. In an inverse‐variance weighted meta‐analysis, we should impute missing data and apply Rubin's rules at the study level prior to meta‐analysis, rather than meta‐analyzing each of the multiple imputations and then combining the meta‐analysis estimates using Rubin's rules. We illustrate the results using data from the Emerging Risk Factors Collaboration. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
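The recommended ordering, impute and pool within each study first and meta-analyse the pooled study results afterwards, can be written down compactly. The sketch below (Python/NumPy, toy numbers) applies Rubin's rules per study and then a DerSimonian-Laird random-effects meta-analysis; it omits refinements such as the degrees-of-freedom correction for Rubin's-rules confidence intervals.

```python
import numpy as np

def rubin(estimates, variances):
    """Combine the results of M imputed analyses within one study (Rubin's rules)."""
    estimates, variances = np.asarray(estimates), np.asarray(variances)
    m = len(estimates)
    qbar = estimates.mean()                     # pooled point estimate
    ubar = variances.mean()                     # average within-imputation variance
    b = estimates.var(ddof=1)                   # between-imputation variance
    return qbar, ubar + (1 + 1 / m) * b         # estimate and total variance

def dersimonian_laird(betas, variances):
    """Inverse-variance random-effects meta-analysis of study-level estimates."""
    betas, variances = np.asarray(betas), np.asarray(variances)
    w = 1 / variances
    beta_fe = np.sum(w * betas) / np.sum(w)
    q = np.sum(w * (betas - beta_fe) ** 2)                   # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(betas) - 1)) / c)              # between-study variance
    w_re = 1 / (variances + tau2)
    return np.sum(w_re * betas) / np.sum(w_re), 1 / np.sum(w_re), tau2

# Toy input: for each of 3 studies, estimates and variances from M = 5 imputed analyses.
rng = np.random.default_rng(0)
per_study_fits = [(rng.normal(0.3, 0.05, 5), np.full(5, 0.05 ** 2)) for _ in range(3)]

# Impute and pool within each study first, then meta-analyse the pooled study estimates.
betas, variances = zip(*(rubin(est, var) for est, var in per_study_fits))
pooled, pooled_var, tau2 = dersimonian_laird(betas, variances)
print(pooled, np.sqrt(pooled_var), tau2)
```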

4.
Multiple imputation is a popular method for addressing missing data, but its implementation is difficult when data have a multilevel structure and one or more variables are systematically missing. This systematic missing data pattern commonly occurs in meta‐analysis of individual participant data, where some variables are never observed in some studies, but may also arise in other hierarchical data settings. In these cases, valid imputation must account for both relationships between variables and correlation within studies. Proposed methods for multilevel imputation include specifying a full joint model and multiple imputation with chained equations (MICE). While MICE is attractive for its ease of implementation, there is little existing work describing conditions under which this is a valid alternative to specifying the full joint model. We present results showing that for multilevel normal models, MICE is rarely exactly equivalent to joint model imputation. Through a simulation study and an example using data from a traumatic brain injury study, we found that in spite of theoretical differences, MICE imputations often produce results similar to those obtained using the joint model. We also assess the influence of prior distributions in MICE imputation methods and find that when missingness is high, prior choices in MICE models tend to affect estimation of across‐study variability more than compatibility of conditional likelihoods. Copyright © 2017 John Wiley & Sons, Ltd.

5.
There are many advantages to individual participant data meta‐analysis for combining data from multiple studies. These advantages include greater power to detect effects, increased sample heterogeneity, and the ability to perform more sophisticated analyses than meta‐analyses that rely on published results. However, a fundamental challenge is that it is unlikely that variables of interest are measured the same way in all of the studies to be combined. We propose that this situation can be viewed as a missing data problem in which some outcomes are entirely missing within some trials and use multiple imputation to fill in missing measurements. We apply our method to five longitudinal adolescent depression trials where four studies used one depression measure and the fifth study used a different depression measure. None of the five studies contained both depression measures. We describe a multiple imputation approach for filling in missing depression measures that makes use of external calibration studies in which both depression measures were used. We discuss some practical issues in developing the imputation model including taking into account treatment group and study. We present diagnostics for checking the fit of the imputation model and investigate whether external information is appropriately incorporated into the imputed values. Copyright © 2015 John Wiley & Sons, Ltd.
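A minimal sketch of the calibration idea, assuming simulated data and placeholder scale names (a_score, b_score): the bridging regression is fitted in an external calibration sample where both scales were measured, and the unmeasured scale is then drawn, with parameter and residual uncertainty, in a trial that used only the other scale. The paper's imputation model additionally conditions on treatment group and study, which is omitted here for brevity.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)

# External calibration sample in which both depression scales were administered
# (a_score / b_score are placeholder names, not the trials' actual instruments).
n_cal = 500
calib = pd.DataFrame({"a_score": rng.normal(20, 6, n_cal)})
calib["b_score"] = 5 + 1.2 * calib["a_score"] + rng.normal(0, 4, n_cal)

# A trial that measured only scale A: scale B is entirely missing and will be imputed.
trial = pd.DataFrame({"a_score": rng.normal(22, 6, 300),
                      "treat": rng.integers(0, 2, 300)})

# Bridging (imputation) model fitted on the calibration data.
fit = smf.ols("b_score ~ a_score", calib).fit()

def impute_b(df, fit, rng):
    """One stochastic imputation of scale B: draw the coefficients from their
    approximate posterior, then add residual noise (a simple 'proper' MI draw)."""
    beta = rng.multivariate_normal(fit.params, fit.cov_params())
    X = np.column_stack([np.ones(len(df)), df["a_score"]])
    return X @ beta + rng.normal(0, np.sqrt(fit.scale), len(df))

imputed_trials = [trial.assign(b_score=impute_b(trial, fit, rng)) for _ in range(20)]
```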

6.
Meta‐analytic methods for combining data from multiple intervention trials are commonly used to estimate the effectiveness of an intervention. They can also be extended to study comparative effectiveness, testing which of several alternative interventions is expected to have the strongest effect. This often requires network meta‐analysis (NMA), which combines trials involving direct comparison of two interventions within the same trial and indirect comparisons across trials. In this paper, we extend existing network methods for main effects to examining moderator effects, allowing for tests of whether intervention effects vary for different populations or when employed in different contexts. In addition, we study how the use of individual participant data may increase the sensitivity of NMA for detecting moderator effects, as compared with aggregate data NMA that employs study‐level effect sizes in a meta‐regression framework. A new NMA diagram is proposed. We also develop a generalized multilevel model for NMA that takes into account within‐trial and between‐trial heterogeneity and can include participant‐level covariates. Within this framework, we present definitions of homogeneity and consistency across trials. A simulation study based on this model is used to assess effects on power to detect both main and moderator effects. Results show that power to detect moderation is substantially greater when applied to individual participant data as compared with study‐level effects. We illustrate the use of this method by applying it to data from a classroom‐based randomized study that involved two sub‐trials, each comparing interventions that were contrasted with separate control groups. Copyright © 2016 John Wiley & Sons, Ltd.

7.
One difficulty in performing meta‐analyses of observational cohort studies is that the availability of confounders may vary between cohorts, so that some cohorts provide fully adjusted analyses while others only provide partially adjusted analyses. Commonly, analyses of the association between an exposure and disease either are restricted to cohorts with full confounder information, or use all cohorts but do not fully adjust for confounding. We propose using a bivariate random‐effects meta‐analysis model to use information from all available cohorts while still adjusting for all the potential confounders. Our method uses both the fully adjusted and the partially adjusted estimated effects in the cohorts with full confounder information, together with an estimate of their within‐cohort correlation. The method is applied to estimate the association between fibrinogen level and coronary heart disease incidence using data from 154 012 participants in 31 cohorts. (One hundred and ninety‐nine participants from the original 154 211 withdrew their consent and have been removed from this analysis.) Copyright © 2009 John Wiley & Sons, Ltd.

8.
Recently, multiple imputation has been proposed as a tool for individual patient data meta‐analysis with sporadically missing observations, and it has been suggested that within‐study imputation is usually preferable. However, such within‐study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may be appropriate to share information across studies when imputing. In this paper, we develop and evaluate a joint modelling approach to multiple imputation of individual patient data in meta‐analysis, with an across‐study probability distribution for the study‐specific covariance matrices. This retains the flexibility to allow for between‐study heterogeneity when imputing while allowing (i) sharing information on the covariance matrix across studies when this is appropriate, and (ii) imputing variables that are wholly missing from studies. Simulation results show both equivalent performance to the within‐study imputation approach where this is valid, and good results in more general, practically relevant, scenarios with studies of very different sizes, non‐negligible between‐study heterogeneity and wholly missing variables. We illustrate our approach using data from an individual patient data meta‐analysis of hypertension trials. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

9.
There has been a recent growth in developments of multivariate meta‐analysis. We extend the methodology of Bayesian multivariate meta‐analysis to the situation when there are more than two outcomes of interest, which is underexplored in the current literature. Our objective is to meta‐analyse summary data from multiple outcomes simultaneously, accounting for potential dependencies among the data. One common issue is that studies do not all report all of the outcomes of interest, and we take an approach relying on marginal modelling of only the reported data. We employ a separation prior for the between‐study variance–covariance matrix, which offers an improvement on the conventional inverse‐Wishart prior, showing robustness in estimation and flexibility in incorporating prior information. Particular challenges arise when the number of outcomes is large relative to the number of studies because the number of parameters in the variance–covariance matrix can become substantial and there can be very little information with which to estimate between‐study correlation coefficients. We explore assumptions that reduce the number of parameters in this matrix, including assumptions of homogeneous variances, homogeneous correlations for certain outcomes and positive correlation coefficients. We illustrate the methods with an example data set from the Cochrane Database of Systematic Reviews. Copyright © 2013 John Wiley & Sons, Ltd.

10.
New quasi-imputation and expansion strategies for correlated binary responses are proposed by borrowing ideas from random number generation. The core idea is to convert correlated binary outcomes to multivariate normal outcomes in a sensible way so that re-conversion to the binary scale, after performing multiple imputation, yields the original specified marginal expectations and correlations. This conversion process ensures that the correlations are transformed reasonably, which in turn allows us to take advantage of well-developed imputation techniques for Gaussian outcomes. We use the phrase 'quasi' because the original observations are not guaranteed to be preserved. We argue that if the inferential goals are well-defined, it is not necessary to strictly adhere to the established definition of multiple imputation. Our expansion scheme employs a similar strategy where imputation is used as an intermediate step. It leads to proportionally inflated observed patterns, forcing the data set to a complete rectangular format. The plausibility of the proposed methodology is examined by applying it to a wide range of simulated data sets that reflect alternative assumptions on complete data populations and missing-data mechanisms. We also present an application using a data set from obesity research. We conclude that the proposed method is a promising tool for handling incomplete longitudinal or clustered binary outcomes under ignorable non-response mechanisms.
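The conversion idea can be illustrated with the standard thresholded-multivariate-normal construction (in the spirit of Emrich and Piedmonte); this is only a sketch of the binary-to-normal-and-back step, not the authors' full quasi-imputation or expansion algorithm. It assumes SciPy for the bivariate normal CDF and root finding.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.optimize import brentq

def latent_correlation(p1, p2, r_binary):
    """Find the normal (tetrachoric-style) correlation that, after thresholding two
    standard normals at norm.ppf(1 - p), reproduces the target binary correlation."""
    t1, t2 = norm.ppf(1 - p1), norm.ppf(1 - p2)

    def binary_corr(rho):
        # P(Z1 > t1, Z2 > t2) via the bivariate normal CDF, then convert to a correlation.
        p11 = multivariate_normal.cdf([-t1, -t2], mean=[0, 0], cov=[[1, rho], [rho, 1]])
        return (p11 - p1 * p2) / np.sqrt(p1 * (1 - p1) * p2 * (1 - p2))

    return brentq(lambda rho: binary_corr(rho) - r_binary, -0.99, 0.99)

# Target marginal probabilities and binary correlation (illustrative values).
p1, p2, r = 0.3, 0.6, 0.25
rho = latent_correlation(p1, p2, r)

# Generate: draw correlated normals (where Gaussian imputation machinery applies),
# then re-convert to the binary scale by thresholding.
rng = np.random.default_rng(42)
z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=10_000)
y = (z > norm.ppf(1 - np.array([p1, p2]))).astype(int)
print(y.mean(axis=0), np.corrcoef(y.T)[0, 1])   # approximately (0.3, 0.6) and 0.25
```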

11.
Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments.
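A bare-bones chained-equations cycle for continuous variables, written out to show the principle; this is a toy re-implementation, not the Stata commands the paper illustrates, and real software additionally handles categorical variables, predictive mean matching and proper draws of the imputation-model parameters.

```python
import numpy as np

def mice_linear(X, n_iter=10, rng=None):
    """A bare-bones chained-equations cycle for a numeric matrix X (NaN = missing):
    each incomplete variable is regressed on all the others, and its missing entries
    are refilled with prediction-plus-noise draws, cycling for n_iter iterations."""
    rng = np.random.default_rng() if rng is None else rng
    X = X.copy()
    miss = np.isnan(X)
    X[miss] = np.take(np.nanmean(X, axis=0), np.nonzero(miss)[1])   # start from mean fill
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            obs = ~miss[:, j]
            A = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            sigma = (X[obs, j] - A[obs] @ beta).std(ddof=A.shape[1])
            X[miss[:, j], j] = A[miss[:, j]] @ beta + rng.normal(0, sigma, miss[:, j].sum())
    return X

# Toy example: three correlated variables with sporadic missingness, five imputations.
rng = np.random.default_rng(0)
full = rng.multivariate_normal([0, 0, 0], [[1, .5, .3], [.5, 1, .4], [.3, .4, 1]], 400)
data = full.copy()
data[rng.random(full.shape) < 0.2] = np.nan
imputations = [mice_linear(data, rng=np.random.default_rng(i)) for i in range(5)]
```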

12.
Missing outcome data are a common threat to the validity of the results from randomised controlled trials (RCTs), which, if not analysed appropriately, can lead to misleading treatment effect estimates. Studies with missing outcome data also threaten the validity of any meta‐analysis that includes them. A conceptually simple Bayesian framework is proposed, to account for uncertainty due to missing binary outcome data in meta‐analysis. A pattern‐mixture model is fitted, which allows the incorporation of prior information on a parameter describing the missingness mechanism. We describe several alternative parameterisations, with the simplest being a prior on the probability of an event in the missing individuals. We describe a series of structural assumptions that can be made concerning the missingness parameters. We use some artificial data scenarios to demonstrate the ability of the model to produce a bias‐adjusted estimate of treatment effect that accounts for uncertainty. A meta‐analysis of haloperidol versus placebo for schizophrenia is used to illustrate the model. We end with a discussion of elicitation of priors, issues with poor reporting and potential extensions of the framework. Our framework allows one to make the best use of evidence produced from RCTs with missing outcome data in a meta‐analysis, accounts for any uncertainty induced by missing data and fits easily into a wider evidence synthesis framework for medical decision making. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
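A toy Monte Carlo version of the simplest parameterisation, a prior placed directly on the event probability among the missing individuals; the trial counts and the Beta(2, 2) prior are illustrative assumptions, not the haloperidol example.

```python
import numpy as np

rng = np.random.default_rng(3)

def log_odds(p):
    return np.log(p / (1 - p))

# Illustrative two-arm trial with missing binary outcomes: (events, observed, missing).
arms = {"treatment": (30, 90, 10), "control": (45, 90, 10)}

# Pattern-mixture draw: the overall risk in each arm is a mixture of the risk among
# responders (informed by the data) and the risk among the missing (set by a prior).
draws = []
for _ in range(20_000):
    p = {}
    for arm, (r, n_obs, n_mis) in arms.items():
        p_obs = rng.beta(r + 1, n_obs - r + 1)     # posterior for the risk among responders
        p_mis = rng.beta(2, 2)                     # analyst's prior for the risk among the missing
        p[arm] = (n_obs * p_obs + n_mis * p_mis) / (n_obs + n_mis)
    draws.append(log_odds(p["treatment"]) - log_odds(p["control"]))

draws = np.array(draws)
print("bias-adjusted OR:", np.exp(draws.mean()),
      "95% interval:", np.exp(np.percentile(draws, [2.5, 97.5])))
```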

13.
Cost‐effectiveness analyses (CEA) conducted alongside randomised trials provide key evidence for informing healthcare decision making, but missing data pose substantive challenges. Recently, there have been a number of developments in methods and guidelines addressing missing data in trials. However, it is unclear whether these developments have permeated CEA practice. This paper critically reviews the extent of and methods used to address missing data in recently published trial‐based CEA. Issues of the Health Technology Assessment journal from 2013 to 2015 were searched. Fifty‐two eligible studies were identified. Missing data were very common; the median proportion of trial participants with complete cost‐effectiveness data was 63% (interquartile range: 47%–81%). The most common approach for the primary analysis was to restrict analysis to those with complete data (43%), followed by multiple imputation (30%). Half of the studies conducted some sort of sensitivity analyses, but only 2 (4%) considered possible departures from the missing‐at‐random assumption. Further improvements are needed to address missing data in cost‐effectiveness analyses conducted alongside randomised trials. These should focus on limiting the extent of missing data, choosing an appropriate method for the primary analysis that is valid under contextually plausible assumptions, and conducting sensitivity analyses to departures from the missing‐at‐random assumption.

14.
Meta‐analysis using individual participant data (IPD) obtains and synthesises the raw, participant‐level data from a set of relevant studies. The IPD approach is becoming an increasingly popular tool as an alternative to traditional aggregate data meta‐analysis, especially as it avoids reliance on published results and provides an opportunity to investigate individual‐level interactions, such as treatment‐effect modifiers. There are two statistical approaches for conducting an IPD meta‐analysis: one‐stage and two‐stage. The one‐stage approach analyses the IPD from all studies simultaneously, for example, in a hierarchical regression model with random effects. The two‐stage approach derives aggregate data (such as effect estimates) in each study separately and then combines these in a traditional meta‐analysis model. There have been numerous comparisons of the one‐stage and two‐stage approaches via theoretical consideration, simulation and empirical examples, yet there remains confusion regarding when each approach should be adopted, and indeed why they may differ. In this tutorial paper, we outline the key statistical methods for one‐stage and two‐stage IPD meta‐analyses, and provide 10 key reasons why they may produce different summary results. We explain that most differences arise because of different modelling assumptions, rather than the choice of one‐stage or two‐stage itself. We illustrate the concepts with recently published IPD meta‐analyses, summarise key statistical software and provide recommendations for future IPD meta‐analyses. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
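A small simulated contrast of the two approaches (illustrative only, not the tutorial's own example); with closely matched modelling assumptions the two estimates are typically similar, which is the paper's central point. The one-stage fit assumes statsmodels' mixed-model routine and a random treatment effect by trial.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)

# Simulated IPD: 8 trials, continuous outcome, true treatment effect 0.5 with
# between-trial heterogeneity (SD 0.2); numbers are purely illustrative.
frames = []
for s in range(8):
    n = int(rng.integers(80, 300))
    treat = rng.integers(0, 2, n)
    effect = 0.5 + rng.normal(0, 0.2)                               # trial-specific effect
    y = rng.normal(0, 1, n) + effect * treat + rng.normal(0, 0.3)   # plus a trial baseline shift
    frames.append(pd.DataFrame({"study": s, "treat": treat, "y": y}))
ipd = pd.concat(frames, ignore_index=True)

# Two-stage: estimate the treatment effect within each trial, then pool the aggregate
# estimates with inverse-variance weights (random-effects weights would add tau^2).
est, var = [], []
for _, d in ipd.groupby("study"):
    fit = smf.ols("y ~ treat", d).fit()
    est.append(fit.params["treat"])
    var.append(fit.bse["treat"] ** 2)
w = 1 / np.array(var)
two_stage = np.sum(w * np.array(est)) / w.sum()

# One-stage: analyse all IPD at once in a hierarchical model with a random treatment
# effect by trial (stratified trial intercepts are another common modelling choice).
one_stage = smf.mixedlm("y ~ treat", ipd, groups=ipd["study"], re_formula="~treat").fit()
print(two_stage, one_stage.params["treat"])
```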

15.
The use of individual participant data (IPD) from multiple studies is an increasingly popular approach when developing a multivariable risk prediction model. Corresponding datasets, however, typically differ in important aspects, such as baseline risk. This has driven the adoption of meta‐analytical approaches for appropriately dealing with heterogeneity between study populations. Although these approaches provide an averaged prediction model across all studies, little guidance exists about how to apply or validate this model to new individuals or study populations outside the derivation data. We consider several approaches to develop a multivariable logistic regression model from an IPD meta‐analysis (IPD‐MA) with potential between‐study heterogeneity. We also propose strategies for choosing a valid model intercept for when the model is to be validated or applied to new individuals or study populations. These strategies can be implemented by the IPD‐MA developers or future model validators. Finally, we show how model generalizability can be evaluated when external validation data are lacking using internal–external cross‐validation and extend our framework to count and time‐to‐event data. In an empirical evaluation, our results show how stratified estimation allows study‐specific model intercepts, which can then inform the intercept to be used when applying the model in practice, even to a population not represented by included studies. In summary, our framework allows the development (through stratified estimation), implementation in new individuals (through focused intercept choice), and evaluation (through internal–external validation) of a single, integrated prediction model from an IPD‐MA in order to achieve improved model performance and generalizability. Copyright © 2013 John Wiley & Sons, Ltd.
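A sketch of internal-external cross-validation on simulated data: each study is held out in turn, the model is developed on the remaining studies with stratified (study-specific) intercepts, and the held-out study serves as external validation. Averaging the stratified intercepts for the new population is just one of the intercept-choice strategies the paper discusses; variable names and study structure are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def c_statistic(y, p):
    """Concordance (AUC) via the Mann-Whitney rank-sum formulation."""
    ranks = np.empty(len(p))
    ranks[np.argsort(p)] = np.arange(1, len(p) + 1)
    n1, n0 = y.sum(), len(y) - y.sum()
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

rng = np.random.default_rng(5)

# Simulated IPD-MA: 6 studies with different baseline risks (intercepts) but a common
# predictor effect; stand-ins for a real diagnostic or prognostic data set.
frames = []
for s in range(6):
    n, alpha_s = 400, rng.normal(-1.0, 0.6)
    x = rng.normal(size=n)
    risk = 1 / (1 + np.exp(-(alpha_s + 0.8 * x)))
    frames.append(pd.DataFrame({"study": s, "x": x, "y": rng.binomial(1, risk)}))
ipd = pd.concat(frames, ignore_index=True)

# Internal-external cross-validation: each study in turn is held out for 'external'
# validation while the model is developed on all remaining studies.
for held_out, test in ipd.groupby("study"):
    train = ipd[ipd["study"] != held_out]
    fit = smf.logit("y ~ C(study) + x", train).fit(disp=0)     # stratified study intercepts
    # Intercept for the new population: here simply the average of the study-specific
    # intercepts; the paper discusses more focused intercept-choice strategies.
    ref = fit.params["Intercept"]
    offsets = [v for k, v in fit.params.items() if k.startswith("C(study)")]
    intercept = np.mean([ref] + [ref + o for o in offsets])
    lp = intercept + fit.params["x"] * test["x"]
    auc = c_statistic(test["y"].to_numpy(), (1 / (1 + np.exp(-lp))).to_numpy())
    print(f"held-out study {held_out}: c-statistic = {auc:.3f}")
```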

16.
Multivariate random effects meta‐analysis (MRMA) is an appropriate way for synthesizing data from studies reporting multiple correlated outcomes. In a Bayesian framework, it has great potential for integrating evidence from a variety of sources. In this paper, we propose a Bayesian model for MRMA of mixed outcomes, which extends previously developed bivariate models to the trivariate case and also allows for combination of multiple outcomes that are both continuous and binary. We have constructed informative prior distributions for the correlations by using external evidence. Prior distributions for the within‐study correlations were constructed by employing external individual patient data and using a double bootstrap method to obtain the correlations between mixed outcomes. The between‐study model of MRMA was parameterized in the form of a product of a series of univariate conditional normal distributions. This allowed us to place explicit prior distributions on the between‐study correlations, which were constructed using external summary data. Traditionally, independent ‘vague’ prior distributions are placed on all parameters of the model. In contrast to this approach, we constructed prior distributions for the between‐study model parameters in a way that takes into account the inter‐relationship between them. This is a flexible method that can be extended to incorporate mixed outcomes other than continuous and binary and beyond the trivariate case. We have applied this model to a motivating example in rheumatoid arthritis with the aim of incorporating all available evidence in the synthesis and potentially reducing uncertainty around the estimate of interest. © 2013 The Authors. Statistics in Medicine Published by John Wiley & Sons, Ltd.

17.
It is common in applied research to have large numbers of variables measured on a modest number of cases. Even with low rates of missingness of individual variables, such data sets can have a large number of incomplete cases with a mix of data types. Here, we propose a new joint modeling approach to address the high‐dimensional incomplete data with a mix of continuous and binary data. Specifically, we propose a multivariate normal model encompassing both continuous variables and latent variables corresponding to binary variables. We apply a parameter‐extended Metropolis–Hastings algorithm to generate the covariance matrix of a mixture of continuous and binary variables. We also introduce prior distribution families for unstructured covariance matrices to reduce the dimension of the parameter space. In several simulation settings, the method is compared with available‐case analysis, a rounding method, and a sequential regression method. Copyright © 2014 John Wiley & Sons, Ltd.

18.
Although recent guidelines for dealing with missing data emphasize the need for sensitivity analyses, and such analyses have a long history in statistics, universal recommendations for conducting and displaying these analyses are scarce. We propose graphical displays that help formalize and visualize the results of sensitivity analyses, building upon the idea of ‘tipping‐point’ analysis for randomized experiments with a binary outcome and a dichotomous treatment. The resulting ‘enhanced tipping‐point displays’ are convenient summaries of conclusions obtained from making different modeling assumptions about missingness mechanisms. The primary goal of the displays is to make formal sensitivity analyses more comprehensible to practitioners, thereby helping them assess the robustness of the experiment's conclusions to plausible missingness mechanisms. We also present a recent example of these enhanced displays in a medical device clinical trial that helped lead to FDA approval. Copyright © 2014 John Wiley & Sons, Ltd.
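A minimal tipping-point grid for a two-arm binary-outcome trial, assuming illustrative counts (not the medical-device example): every combination of assumed events among the missing participants is imputed and re-analysed, and the display summarises which assumptions overturn the conclusion.

```python
import numpy as np
from scipy.stats import fisher_exact

# Illustrative two-arm trial: events / observed / missing per arm.
obs = {"treatment": (40, 100, 12), "control": (55, 100, 15)}

def p_value(extra_t, extra_c):
    """P-value after assuming extra_t / extra_c of the missing participants had events."""
    rt, nt, mt = obs["treatment"]
    rc, nc, mc = obs["control"]
    table = [[rt + extra_t, (nt + mt) - (rt + extra_t)],
             [rc + extra_c, (nc + mc) - (rc + extra_c)]]
    _, p = fisher_exact(table)
    return p

# The tipping-point grid: one cell per assumption about how many of the missing
# participants in each arm had an event; an enhanced display would colour these
# cells by significance and effect direction.
mt, mc = obs["treatment"][2], obs["control"][2]
grid = np.array([[p_value(i, j) for j in range(mc + 1)] for i in range(mt + 1)])

sig = grid < 0.05
print(f"{sig.mean():.0%} of the missing-data assumptions keep p < 0.05; "
      f"the conclusion 'tips' over the remaining {(~sig).mean():.0%}")
```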

19.
20.
We propose a propensity score-based multiple imputation (MI) method to tackle missing data resulting from drop-outs and/or intermittent skipped visits in longitudinal clinical trials with binary responses. The estimation and inferential properties of the proposed method are contrasted via simulation with those of the commonly used complete-case (CC) and generalized estimating equations (GEE) methods. Three key results are noted. First, if data are missing completely at random, MI can be notably more efficient than the CC and GEE methods. Second, with small samples, GEE often fails due to 'convergence problems', but MI is free of that problem. Finally, if the data are missing at random, while the CC and GEE methods yield results with moderate to large bias, MI generally yields results with negligible bias. A numerical example with real data is provided for illustration.
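A sketch of one common propensity-score imputation recipe, stratify on the estimated probability of being missing and impute within strata by an approximate Bayesian bootstrap, applied to simulated data; the paper's exact algorithm and simulation settings may differ in detail.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)

# Toy longitudinal-style data: a baseline covariate and the previous response predict
# whether the final binary response is missing (an MAR-type drop-out mechanism).
n = 1_000
df = pd.DataFrame({"x": rng.normal(size=n), "y_prev": rng.binomial(1, 0.5, n)})
p_resp = 1 / (1 + np.exp(-(-0.5 + 0.8 * df["x"] + 0.7 * df["y_prev"])))
df["y"] = rng.binomial(1, p_resp).astype(float)
p_miss = 1 / (1 + np.exp(-(-1.5 + 0.9 * df["x"] - 0.8 * df["y_prev"])))
df.loc[rng.random(n) < p_miss, "y"] = np.nan

# Step 1: propensity of being missing, modelled from fully observed variables.
df["miss"] = df["y"].isna().astype(int)
df["ps"] = smf.logit("miss ~ x + y_prev", df).fit(disp=0).predict(df)

# Step 2: within propensity-score quintiles, impute each missing response with an
# approximate Bayesian bootstrap: resample the observed responses, then draw the
# imputations from that resample.
df["stratum"] = pd.qcut(df["ps"], 5, labels=False)

def impute_once(df, rng):
    out = df["y"].copy()
    for _, grp in df.groupby("stratum"):
        donors = grp.loc[grp["y"].notna(), "y"].to_numpy()
        recipients = grp.index[grp["y"].isna()]
        boot = rng.choice(donors, size=donors.size, replace=True)        # ABB step
        out.loc[recipients] = rng.choice(boot, size=recipients.size, replace=True)
    return out

imputations = [impute_once(df, rng) for _ in range(10)]
```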
