Similar literature
20 similar documents found
1.
Although missing outcome data are an important problem in randomized trials and observational studies, methods to address this issue can be difficult to apply. Using simulated data, the authors compared 3 methods to handle missing outcome data: 1) complete case analysis; 2) single imputation; and 3) multiple imputation (all 3 with and without covariate adjustment). Simulated scenarios focused on continuous or dichotomous missing outcome data from randomized trials or observational studies. When outcomes were missing at random, single and multiple imputations yielded unbiased estimates after covariate adjustment. Estimates obtained by complete case analysis with covariate adjustment were unbiased as well, with coverage close to 95%. When outcome data were missing not at random, all methods gave biased estimates, but handling missing outcome data by means of 1 of the 3 methods reduced bias compared with a complete case analysis without covariate adjustment. Complete case analysis with covariate adjustment and multiple imputation yield similar estimates in the event of missing outcome data, as long as the same predictors of missingness are included. Hence, complete case analysis with covariate adjustment can and should be used as the analysis of choice more often. Multiple imputation, in addition, can accommodate the missing-not-at-random scenario more flexibly, making it especially suited for sensitivity analyses.

2.
OBJECTIVE: To evaluate the effects of missing data on analyses of data from trauma databases, and to verify whether commonly used techniques for handling missing data work well in these settings. STUDY DESIGN AND SETTING: Measures of trauma severity such as the Pre-Hospital Index (PHI) are used for triage and the evaluation of trauma care. As conditions of trauma patients can rapidly change over time, estimating the change in PHI from the arrival at the emergency room to hospital admission is important. We used both simulated and real data to investigate the estimation of PHI data when some data are missing. Techniques compared include complete case analysis, single imputation, and multiple imputation. RESULTS: It is well known that complete case analyses and single imputation methods often lead to highly misleading results that can be corrected by multiple imputation, an increasingly popular method for missing data situations. In practice, unverifiable assumptions may not hold, meaning that it may not be possible to draw definitive conclusions from any of the methods. CONCLUSION: Great care is required whenever missing data arise. This is especially true in trauma databases, which often have much missing data and where the data may not be missing at random.

3.
Missing observations are common in cluster randomised trials. The problem is exacerbated when modelling bivariate outcomes jointly, as the proportion of complete cases is often considerably smaller than the proportion having either of the outcomes fully observed. Approaches taken to handling such missing data include the following: complete case analysis, single‐level multiple imputation that ignores the clustering, multiple imputation with a fixed effect for each cluster and multilevel multiple imputation. We contrasted the alternative approaches to handling missing data in a cost‐effectiveness analysis that uses data from a cluster randomised trial to evaluate an exercise intervention for care home residents. We then conducted a simulation study to assess the performance of these approaches on bivariate continuous outcomes, in terms of confidence interval coverage and empirical bias in the estimated treatment effects. Missing‐at‐random clustered data scenarios were simulated following a full‐factorial design. Across all the missing data mechanisms considered, the multiple imputation methods provided estimators with negligible bias, while complete case analysis resulted in biased treatment effect estimates in scenarios where the randomised treatment arm was associated with missingness. Confidence interval coverage was generally in excess of nominal levels (up to 99.8%) following fixed‐effects multiple imputation and too low following single‐level multiple imputation. Multilevel multiple imputation led to coverage levels of approximately 95% throughout. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

4.
The purpose of this paper was to illustrate the influence of missing data on the results of longitudinal statistical analyses [i.e., MANOVA for repeated measurements and Generalised Estimating Equations (GEE)] and to illustrate the influence of using different imputation methods to replace missing data. Besides a complete dataset, four incomplete datasets were considered: two datasets with 10% missing data and two datasets with 25% missing data. In both situations, missingness was considered either independent of or dependent on observed data. Imputation methods were divided into cross-sectional methods (i.e., mean of series, hot deck, and cross-sectional regression) and longitudinal methods (i.e., last value carried forward, longitudinal interpolation, and longitudinal regression). In addition, the multiple imputation method was applied and discussed. The analyses were performed on a particular (observational) longitudinal dataset, with particular missing data patterns and imputation methods. The results of this illustration show that when MANOVA for repeated measurements is used, imputation methods are highly recommended (because MANOVA, as implemented in the software used, applies listwise deletion of cases with a missing value). When GEE analysis was applied, imputation methods were not necessary. When imputation methods were used, longitudinal imputation methods were often preferable to cross-sectional imputation methods, in that the point estimates and standard errors were closer to the estimates derived from the complete dataset. Furthermore, this study showed that the theoretically more valid multiple imputation method did not lead to different point estimates than the simpler (longitudinal) imputation methods. However, the estimated standard errors appeared to be theoretically more adequate, because they reflect the uncertainty in estimation caused by missing values.
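Two of the longitudinal methods named in this abstract, last value carried forward and longitudinal interpolation, can be sketched in a few lines. This is only an illustrative sketch and not the authors' implementation: the function names `locf` and `interpolate` are hypothetical, missing values are assumed to be encoded as `None`, and measurement occasions are assumed to be equally spaced.

```python
from typing import List, Optional

Series = List[Optional[float]]


def locf(series: Series) -> Series:
    """Last value carried forward: reuse the most recent observed value.

    Leading missing values stay missing because nothing precedes them.
    """
    filled: Series = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def interpolate(series: Series) -> Series:
    """Linear interpolation between neighbouring observed values.

    Assumes equally spaced occasions; gaps before the first or after the
    last observed value are left missing.
    """
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


if __name__ == "__main__":
    measurements = [1.0, None, None, 4.0, None]
    print(locf(measurements))
    print(interpolate(measurements))
```

Both functions produce a single completed series, so, as the abstract notes for single imputation in general, standard errors computed from the filled-in data do not reflect the uncertainty caused by the missing values.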

5.
Objective: We compared popular methods to handle missing data with multiple imputation (a more sophisticated method that preserves data). Study Design and Setting: We used data of 804 patients with a suspicion of deep venous thrombosis (DVT). We studied three covariates to predict the presence of DVT: d-dimer level, difference in calf circumference, and history of leg trauma. We introduced missing values (missing at random) ranging from 10% to 90%. The risk of DVT was modeled with logistic regression for the three methods, that is, complete case analysis, exclusion of d-dimer level from the model, and multiple imputation. Results: Multiple imputation showed less bias in the regression coefficients of the three variables and more accurate coverage of the corresponding 90% confidence intervals than complete case analysis and dropping d-dimer level from the analysis. Multiple imputation showed unbiased estimates of the area under the receiver operating characteristic curve (0.88) compared with complete case analysis (0.77) and when the variable with missing values was dropped (0.65). Conclusion: As this study shows that simple methods to deal with missing data can lead to seriously misleading results, we advise considering multiple imputation. The purpose of multiple imputation is not to create data, but to prevent the exclusion of observed data.

6.
BACKGROUND: Multiple imputation is becoming increasingly popular for handling missing data. However, it is often implemented without adequate consideration of whether it offers any advantage over complete case analysis for the research question of interest, or whether potential gains may be offset by bias from a poorly fitting imputation model, particularly as the amount of missing data increases. METHODS: Simulated datasets (n = 1000) drawn from a synthetic population were used to explore information recovery from multiple imputation in estimating the coefficient of a binary exposure variable when various proportions of data (10-90%) were set missing at random in a highly-skewed continuous covariate or in the binary exposure. Imputation was performed using multivariate normal imputation (MVNI), with a simple or zero-skewness log transformation to manage non-normality. Bias, precision, mean-squared error and coverage for a set of regression parameter estimates were compared between multiple imputation and complete case analyses. RESULTS: For missingness in the continuous covariate, multiple imputation produced less bias and greater precision for the effect of the binary exposure variable, compared with complete case analysis, with larger gains in precision with more missing data. However, even with only moderate missingness, large bias and substantial under-coverage were apparent in estimating the continuous covariate's effect when skewness was not adequately addressed. For missingness in the binary covariate, all estimates had negligible bias but gains in precision from multiple imputation were minimal, particularly for the coefficient of the binary exposure. CONCLUSIONS: Although multiple imputation can be useful if covariates required for confounding adjustment are missing, benefits are likely to be minimal when data are missing in the exposure variable of interest. Furthermore, when there are large amounts of missingness, multiple imputation can become unreliable and introduce bias not present in a complete case analysis if the imputation model is not appropriate. Epidemiologists dealing with missing data should keep in mind the potential limitations as well as the potential benefits of multiple imputation. Further work is needed to provide clearer guidelines on effective application of this method.

7.
We consider a study‐level meta‐analysis with a normally distributed outcome variable and possibly unequal study‐level variances, where the object of inference is the difference in means between a treatment and control group. A common complication in such an analysis is missing sample variances for some studies. A frequently used approach is to impute the weighted (by sample size) mean of the observed variances (mean imputation). Another approach is to include only those studies with variances reported (complete case analysis). Both mean imputation and complete case analysis are only valid under the missing‐completely‐at‐random assumption, and even then the inverse variance weights produced are not necessarily optimal. We propose a multiple imputation method employing gamma meta‐regression to impute the missing sample variances. Our method takes advantage of study‐level covariates that may be used to provide information about the missing data. Through simulation studies, we show that multiple imputation, when the imputation model is correctly specified, is superior to competing methods in terms of confidence interval coverage probability and type I error probability when testing a specified group difference. Finally, we describe a similar approach to handling missing variances in cross‐over studies. Copyright © 2016 John Wiley & Sons, Ltd.
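The mean imputation baseline described in this abstract (filling each missing sample variance with the sample-size-weighted mean of the observed variances) can be sketched as follows. The function name and the use of `None` for an unreported variance are assumptions made for illustration; the gamma meta-regression method that the authors propose is not shown.

```python
from typing import List, Optional


def impute_weighted_mean_variance(
    variances: List[Optional[float]], sizes: List[int]
) -> List[float]:
    """Replace missing study variances by the sample-size-weighted mean
    of the observed variances (single mean imputation)."""
    if len(variances) != len(sizes):
        raise ValueError("variances and sizes must have the same length")
    observed = [(v, n) for v, n in zip(variances, sizes) if v is not None]
    if not observed:
        raise ValueError("at least one variance must be observed")
    total_n = sum(n for _, n in observed)
    weighted_mean = sum(v * n for v, n in observed) / total_n
    return [weighted_mean if v is None else v for v in variances]


if __name__ == "__main__":
    # Three hypothetical studies; the second did not report its variance.
    print(impute_weighted_mean_variance([4.0, None, 10.0], [20, 50, 40]))
```

Every study with a missing variance receives the same imputed value, which is one way to see why the abstract describes this approach as valid only under the missing-completely-at-random assumption.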

8.
BACKGROUND AND OBJECTIVES: To illustrate the effects of different methods for handling missing data (complete case analysis, missing-indicator method, single imputation of unconditional and conditional mean, and multiple imputation [MI]) in the context of multivariable diagnostic research aiming to identify potential predictors (test results) that independently contribute to the prediction of disease presence or absence. METHODS: We used data from 398 subjects from a prospective study on the diagnosis of pulmonary embolism. Various diagnostic predictors or tests had (varying percentages of) missing values. Per method of handling these missing values, we fitted a diagnostic prediction model using multivariable logistic regression analysis. RESULTS: The receiver operating characteristic curve area for all diagnostic models was above 0.75. The predictors in the final models based on the complete case analysis, and after using the missing-indicator method, were very different compared to the other models. The models based on MI did not differ much from the models derived after using single conditional and unconditional mean imputation. CONCLUSION: In multivariable diagnostic research complete case analysis and the use of the missing-indicator method should be avoided, even when data are missing completely at random. MI methods are known to be superior to single imputation methods. For our example study, the single imputation methods performed equally well, but this was most likely because of the low overall number of missing values.

9.
Background: We previously developed an approach to address the impact of missing participant data in meta-analyses of continuous variables in trials that used the same measurement instrument. We extend this approach to meta-analyses including trials that use different instruments to measure the same construct. Methods: We reviewed the available literature, conducted an iterative consultative process, and developed an approach involving a complete-case analysis complemented by sensitivity analyses that apply a series of increasingly stringent assumptions about results in patients with missing continuous outcome data. Results: Our approach involves choosing the reference measurement instrument; converting scores from different instruments to the units of the reference instrument; developing four successively more stringent imputation strategies for addressing missing participant data; calculating a pooled mean difference for the complete-case analysis and imputation strategies; calculating the proportion of patients who experienced an important treatment effect; and judging the impact of the imputation strategies on the confidence in the estimate of effect. We applied our approach to an example systematic review of respiratory rehabilitation for chronic obstructive pulmonary disease. Conclusions: Our extended approach provides quantitative guidance for addressing missing participant data in systematic reviews of trials using different instruments to measure the same construct.

10.
There are many advantages to individual participant data meta‐analysis for combining data from multiple studies. These advantages include greater power to detect effects, increased sample heterogeneity, and the ability to perform more sophisticated analyses than meta‐analyses that rely on published results. However, a fundamental challenge is that it is unlikely that variables of interest are measured the same way in all of the studies to be combined. We propose that this situation can be viewed as a missing data problem in which some outcomes are entirely missing within some trials and use multiple imputation to fill in missing measurements. We apply our method to five longitudinal adolescent depression trials where four studies used one depression measure and the fifth study used a different depression measure. None of the five studies contained both depression measures. We describe a multiple imputation approach for filling in missing depression measures that makes use of external calibration studies in which both depression measures were used. We discuss some practical issues in developing the imputation model including taking into account treatment group and study. We present diagnostics for checking the fit of the imputation model and investigate whether external information is appropriately incorporated into the imputed values. Copyright © 2015 John Wiley & Sons, Ltd.

11.
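Objective: To propose a standardized operating procedure for identifying and handling missing-data mechanisms, and to develop a corresponding integrated system that gives medical researchers without a statistical background an appropriate, professional, and easy-to-use tool for handling missing data. Methods: The system integrates missing-data handling methods including the completers data set method, the K-nearest-neighbor algorithm, and multivariate imputation by chained equations, and organizes them within a unified framework for identifying and handling missing-data mechanisms, providing a standardized workflow that runs from missing-data summary statistics, through identification of the missingness mechanism, to handling of the missing data. Results: The standardized workflow was developed step by step into functional modules for missing-data statistics, mechanism identification, and missing-data handling, and these modules were combined into an integrated system for identifying and handling missing-data mechanisms. Conclusion: The standardized operating procedure and the integrated system cover the whole process of mechanism identification and missing-data handling. The system is simple and convenient to operate and presents results in an intuitive, easily understood form, offering a simpler and more feasible option for handling missing data that medical researchers can readily apply in practice.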

12.
Propensity score models are frequently used to estimate causal effects in observational studies. One unresolved issue in fitting these models is handling missing values in the propensity score model covariates. As these models usually contain a large set of covariates, using only individuals with complete data significantly decreases the sample size and statistical power. Several missing data imputation approaches have been proposed, including multiple imputation (MI), MI with missingness pattern (MIMP), and treatment mean imputation. Generalized boosted modeling (GBM), which is a nonparametric approach to estimate propensity scores, can automatically handle missingness in the covariates. Although the performance of MI, MIMP, and treatment mean imputation has previously been compared for binary treatments, these approaches have not been compared for continuous exposures or with single imputation and GBM. We compared these approaches in estimating the generalized propensity score (GPS) for a continuous exposure in both a simulation study and in empirical data. Using GBM with the incomplete data to estimate the GPS did not perform well in the simulation. Missing values should be imputed before estimating propensity scores using GBM or any other approach for estimating the GPS.

13.
Multiple imputation can be a good solution to handling missing data if data are missing at random. However, this assumption is often difficult to verify. We describe an application of multiple imputation that makes this assumption plausible. This procedure requires contacting a random sample of subjects with incomplete data to fill in the missing information, and then adjusting the imputation model to incorporate the new data. Simulations with missing data that were decidedly not missing at random showed, as expected, that the method restored the original beta coefficients, whereas other methods of dealing with missing data failed. Using a dataset with real missing data, we found that different approaches to imputation produced moderately different results. Simulations suggest that filling in 10% of data that was initially missing is sufficient for imputation in many epidemiologic applications, and should produce approximately unbiased results, provided there is a high response on follow-up from the subsample of those with some originally missing data. This response can probably be achieved if this data collection is planned as an initial approach to dealing with the missing data, rather than at later stages, after further attempts that leave only data that is very difficult to complete.

14.
Missing outcome data are commonly encountered in randomized controlled trials and hence may need to be addressed in a meta‐analysis of multiple trials. A common and simple approach to deal with missing data is to restrict analysis to individuals for whom the outcome was obtained (complete case analysis). However, estimated treatment effects from complete case analyses are potentially biased if informative missing data are ignored. We develop methods for estimating meta‐analytic summary treatment effects for continuous outcomes in the presence of missing data for some of the individuals within the trials. We build on a method previously developed for binary outcomes, which quantifies the degree of departure from a missing at random assumption via the informative missingness odds ratio. Our new model quantifies the degree of departure from missing at random using either an informative missingness difference of means or an informative missingness ratio of means, both of which relate the mean value of the missing outcome data to that of the observed data. We propose estimating the treatment effects, adjusted for informative missingness, and their standard errors by a Taylor series approximation and by a Monte Carlo method. We apply the methodology to examples of both pairwise and network meta‐analysis with multi‐arm trials. © 2014 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
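The two sensitivity parameters in this abstract relate the mean of the missing outcomes to the mean of the observed outcomes. A minimal sketch of the resulting point adjustment for a single trial arm is given below, assuming the simple pattern-mixture identity in which the arm mean is the missingness-weighted average of the observed and missing means. The function names are hypothetical, the ratio is assumed to be supplied on the log scale, and the standard errors that the authors obtain by Taylor series or Monte Carlo methods are not reproduced.

```python
import math


def adjusted_mean_difference(obs_mean: float, p_missing: float, imdom: float) -> float:
    """Arm mean when missing outcomes are assumed to average obs_mean + imdom.

    (1 - p) * obs_mean + p * (obs_mean + imdom) simplifies to the line below.
    """
    return obs_mean + p_missing * imdom


def adjusted_mean_ratio(obs_mean: float, p_missing: float, log_imrom: float) -> float:
    """Arm mean when missing outcomes are assumed to average
    exp(log_imrom) * obs_mean."""
    return obs_mean * (1.0 - p_missing + p_missing * math.exp(log_imrom))


if __name__ == "__main__":
    # Hypothetical arm: observed mean 10, 20% of outcomes missing.
    print(adjusted_mean_difference(10.0, 0.2, -5.0))
    print(adjusted_mean_ratio(10.0, 0.2, math.log(0.5)))
```

Setting the difference to 0 or the log ratio to 0 returns the observed mean, which corresponds to the missing-at-random reference case from which the abstract measures departures.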

15.
Objectives: Regardless of the proportion of missing values, complete-case analysis is most frequently applied, although advanced techniques such as multiple imputation (MI) are available. The objective of this study was to explore the performance of simple and more advanced methods for handling missing data in cases when some, many, or all item scores are missing in a multi-item instrument. Study Design and Setting: Real-life missing data situations were simulated in a multi-item variable used as a covariate in a linear regression model. Various missing data mechanisms were simulated with an increasing percentage of missing data. Subsequently, several techniques to handle missing data were applied to decide on the optimal technique for each scenario. Fitted regression coefficients were compared using the bias and coverage as performance parameters. Results: Mean imputation caused biased estimates in every missing data scenario when data were missing for more than 10% of the subjects. Furthermore, when a large percentage of subjects had missing items (>25%), MI methods applied to the items outperformed methods applied to the total score. Conclusion: We recommend applying MI to the item scores to get the most accurate regression model estimates. Moreover, we advise against using any form of mean imputation to handle missing data.

16.
Individual participant data meta‐analyses (IPD‐MA) are increasingly used for developing and validating multivariable (diagnostic or prognostic) risk prediction models. Unfortunately, some predictors or even outcomes may not have been measured in each study and are thus systematically missing in some individual studies of the IPD‐MA. As a consequence, it is no longer possible to evaluate between‐study heterogeneity and to estimate study‐specific predictor effects, or to include all individual studies, which severely hampers the development and validation of prediction models. Here, we describe a novel approach for imputing systematically missing data and adopt a generalized linear mixed model to allow for between‐study heterogeneity. This approach can be viewed as an extension of Resche‐Rigon's method (Stat Med 2013), relaxing their assumptions regarding variance components and allowing imputation of linear and nonlinear predictors. We illustrate our approach using a case study with IPD‐MA of 13 studies to develop and validate a diagnostic prediction model for the presence of deep venous thrombosis. We compare the results after applying four methods for dealing with systematically missing predictors in one or more individual studies: complete case analysis where studies with systematically missing predictors are removed, traditional multiple imputation ignoring heterogeneity across studies, stratified multiple imputation accounting for heterogeneity in predictor prevalence, and multilevel multiple imputation (MLMI) fully accounting for between‐study heterogeneity. We conclude that MLMI may substantially improve the estimation of between‐study heterogeneity parameters and allow for imputation of systematically missing predictors in IPD‐MA aimed at the development and validation of prediction models. Copyright © 2015 John Wiley & Sons, Ltd.

17.
A variable is ‘systematically missing’ if it is missing for all individuals within particular studies in an individual participant data meta‐analysis. When a systematically missing variable is a potential confounder in observational epidemiology, standard methods either fail to adjust the exposure–disease association for the potential confounder or exclude studies where it is missing. We propose a new approach to adjust for systematically missing confounders based on multiple imputation by chained equations. Systematically missing data are imputed via multilevel regression models that allow for heterogeneity between studies. A simulation study compares various choices of imputation model. An illustration is given using data from eight studies estimating the association between carotid intima media thickness and subsequent risk of cardiovascular events. Results are compared with standard methods and also with an extension of a published method that exploits the relationship between fully adjusted and partially adjusted estimated effects through a multivariate random effects meta‐analysis model. We conclude that multiple imputation provides a practicable approach that can handle arbitrary patterns of systematic missingness. Bias is reduced by including sufficient between‐study random effects in the imputation model. Copyright © 2013 John Wiley & Sons, Ltd.

18.
Value in Health, 2022, 25(9): 1654-1662
Objectives: Cost-effectiveness analysis (CEA) alongside randomized controlled trials often relies on self-reported multi-item questionnaires that are invariably prone to missing item-level data. The purpose of this study is to review how missing multi-item questionnaire data are handled in trial-based CEAs. Methods: We searched the National Institute for Health Research journals to identify within-trial CEAs published between January 2016 and April 2021 using multi-item instruments to collect costs and quality of life (QOL) data. Information on missing data handling and methods, with a focus on the level and type of imputation, was extracted. Results: A total of 87 trial-based CEAs were included in the review. Complete case analysis or available case analysis and multiple imputation (MI) were the most popular methods, selected by similar numbers of studies, to handle missing costs and QOL in base-case analysis. Nevertheless, complete case analysis or available case analysis dominated sensitivity analysis. Once imputation was chosen, missing costs were widely imputed at item-level via MI, whereas missing QOL was usually imputed at the more aggregated time point level during the follow-up via MI. Conclusions: Missing costs and QOL tend to be imputed at different levels of missingness in current CEAs alongside randomized controlled trials. Given the limited information provided by included studies, the impact of applying different imputation methods at different levels of aggregation on CEA decision making remains unclear.

19.
An efficient monotone data augmentation (MDA) algorithm is proposed for missing data imputation for incomplete multivariate nonnormal data that may contain variables of different types and are modeled by a sequence of regression models including the linear, binary logistic, multinomial logistic, proportional odds, Poisson, negative binomial, skew-normal, skew-t regressions, or a mixture of these models. The MDA algorithm is applied to the sensitivity analyses of longitudinal trials with nonignorable dropout using the controlled pattern imputations that assume the treatment effect reduces or disappears after subjects in the experimental arm discontinue the treatment. We also describe a heuristic approach to implement the controlled imputation, in which the fully conditional specification method is used to impute the intermediate missing data to create a monotone missing pattern, and the missing data after dropout are then imputed according to the assumed nonignorable mechanisms. The proposed methods are illustrated by simulation and real data analyses. Sample SAS code for the analyses is provided in the supporting information.

20.
Review: a gentle introduction to imputation of missing values (total citations: 1; self-citations: 0; citations by others: 1)
In most situations, simple techniques for handling missing data (such as complete case analysis, overall mean imputation, and the missing-indicator method) produce biased results, whereas imputation techniques yield valid results without complicating the analysis once the imputations are carried out. Imputation techniques are based on the idea that any subject in a study sample can be replaced by a new randomly chosen subject from the same source population. Imputation of missing data on a variable means replacing each missing value with a value drawn from an estimate of the distribution of this variable. In single imputation, only one estimate is used. In multiple imputation, various estimates are used, reflecting the uncertainty in the estimation of this distribution. Under the general conditions of so-called missing at random and missing completely at random, both single and multiple imputations result in unbiased estimates of study associations. But single imputation results in estimated standard errors that are too small, whereas multiple imputation results in correctly estimated standard errors and confidence intervals. In this article we explain why all this is the case, and use a simple simulation study to demonstrate our explanations. We also explain and illustrate why two frequently used methods to handle missing data, i.e., overall mean imputation and the missing-indicator method, almost always result in biased estimates.
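The contrast between single and multiple imputation standard errors can be made concrete with the standard combining rules usually attributed to Rubin. The abstract does not name them, so their use here is an assumption, and the function name `pool_rubin` is hypothetical. Given m completed-data estimates and their variances, the pooled variance adds the between-imputation spread to the average within-imputation variance, which is the component a single imputation cannot supply.

```python
from statistics import mean, variance
from typing import List, Tuple


def pool_rubin(estimates: List[float], variances: List[float]) -> Tuple[float, float]:
    """Pool m completed-data analyses with Rubin's rules.

    Returns (pooled estimate, total variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average squared standard error
    between = variance(estimates)   # sample variance, divisor m - 1
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Three hypothetical imputed-data analyses of the same coefficient.
    estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(estimate, total_variance ** 0.5)
```

With a single imputation only the within-imputation variance is available, so the reported standard error omits the between-imputation term entirely; this is the mechanism behind the abstract's statement that single imputation yields standard errors that are too small.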
