Similar Literature (20 records)
1.
Introduction

For the analysis of clinical effects, multiple imputation (MI) of missing data was shown to be unnecessary when using longitudinal linear mixed models (LLM). It remains unclear whether this also applies to trial-based economic evaluations. Therefore, this study aimed to assess whether MI is required prior to LLM when analyzing longitudinal cost and effect data.

Methods

Two thousand complete datasets, each containing five time points, were simulated. Incomplete datasets were generated with 10, 25, and 50% missing data in follow-up costs and effects, assuming a Missing At Random (MAR) mechanism. Six strategies were compared using empirical bias (EB), root-mean-squared error (RMSE), and coverage rate (CR). These strategies were LLM alone (LLM) and MI followed by LLM (MI-LLM), with mean imputation followed by LLM (M-LLM), complete-case analysis with seemingly unrelated regression (SUR-CCA), MI with SUR (MI-SUR), and mean imputation with SUR (M-SUR) as reference strategies.

Results

For costs and effects, LLM, MI-LLM, and MI-SUR performed better than M-LLM, SUR-CCA, and M-SUR, with smaller EBs and RMSEs as well as CRs closer to nominal levels. However, even though LLM, MI-LLM, and MI-SUR performed equally well for effects, MI-LLM and MI-SUR performed better than LLM for costs at 10 and 25% missing data. At 50% missing data, all strategies resulted in relatively high EBs and RMSEs for costs.

Conclusion

LLM should be combined with MI when analyzing trial-based economic evaluation data. MI-SUR is more efficient and can also be used, although an average intervention effect over time cannot then be estimated.

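To make the setup concrete, here is a minimal sketch (not the authors' code; variable names, effect sizes, and the dropout rule are invented) of simulating longitudinal cost data, imposing MAR missingness driven by the always-observed baseline cost, and fitting a linear mixed model on the incomplete data, i.e., the "LLM alone" strategy, using statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n, t = 200, 5  # participants, time points

# Simulate longitudinal costs with a random intercept per participant.
ids = np.repeat(np.arange(n), t)
time = np.tile(np.arange(t), n)
group = np.repeat(rng.integers(0, 2, n), t)   # treatment arm
u = np.repeat(rng.normal(0, 1, n), t)         # random intercept
cost = 10 + 2 * group + 0.5 * time + u + rng.normal(0, 1, n * t)
df = pd.DataFrame({"id": ids, "time": time, "group": group, "cost": cost})

# Impose MAR missingness at follow-up: the dropout probability depends on
# the always-observed baseline cost, not on the missing values themselves.
base = df.loc[df.time == 0, ["id", "cost"]].rename(columns={"cost": "base"})
df = df.merge(base, on="id")
p_miss = 0.25 / (1 + np.exp(-(df["base"] - 12)))
df.loc[(df.time > 0) & (rng.uniform(size=len(df)) < p_miss), "cost"] = np.nan

# "LLM alone": the mixed model uses all available observations under MAR.
model = smf.mixedlm("cost ~ group + time", df, groups="id", missing="drop")
print(model.fit().summary())
```

The MI-LLM strategy would add a multiple imputation step before fitting and pool the results across imputed datasets.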

2.
Objectives

Regardless of the proportion of missing values, complete-case analysis is the most frequently applied approach, although more advanced techniques such as multiple imputation (MI) are available. The objective of this study was to explore the performance of simple and more advanced methods for handling missing data when some, many, or all item scores are missing in a multi-item instrument.

Study Design and Setting

Real-life missing data situations were simulated in a multi-item variable used as a covariate in a linear regression model. Various missing data mechanisms were simulated with an increasing percentage of missing data. Several techniques to handle the missing data were then applied to determine the optimal technique for each scenario. Fitted regression coefficients were compared using bias and coverage as performance parameters.

Results

Mean imputation produced biased estimates in every missing data scenario when data were missing for more than 10% of the subjects. Furthermore, when a large percentage of subjects had missing items (>25%), MI methods applied to the item scores outperformed methods applied to the total score.

Conclusion

We recommend applying MI to the item scores to obtain the most accurate regression model estimates. Moreover, we advise against using any form of mean imputation to handle missing data.
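As an illustration of item-level imputation, here is a brief sketch with a hypothetical 10-item instrument, using scikit-learn's IterativeImputer (a chained-equations-style imputer); a full MI analysis would repeat the imputation M times with different seeds and pool the downstream estimates:

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
# Hypothetical 10-item instrument (items scored 0-4) for 300 subjects.
items = pd.DataFrame(
    rng.integers(0, 5, size=(300, 10)).astype(float),
    columns=[f"item{j}" for j in range(1, 11)],
)
items = items.mask(rng.uniform(size=items.shape) < 0.15)  # ~15% missing items

# Impute at the item level via chained equations, then build the total score.
imputer = IterativeImputer(random_state=0, max_iter=10)
imputed = pd.DataFrame(imputer.fit_transform(items), columns=items.columns)
imputed = imputed.clip(0, 4).round()   # respect the ordinal item range
total_score = imputed.sum(axis=1)      # total score used as covariate downstream
print(total_score.describe())
```

Imputing the items rather than the total score uses the inter-item correlations, which is what drives the advantage reported above.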

3.

Objective

The Mini-Mental State Examination (MMSE) is used to estimate current cognitive status and as a screen for possible dementia. Missing item-level data are commonly reported, so attention to how missing data are handled is particularly important. However, there are concerns that common procedures for dealing with missing data, such as listwise deletion and mean item substitution, are inadequate.

Study Design and Setting

We used multiple imputation (MI) to estimate missing MMSE data in 17,303 participants who were drawn from the Dynamic Analyses to Optimize Aging project, a harmonization project of nine Australian longitudinal studies of aging.

Results

Our results indicated differences in mean MMSE scores between participants with and without missing data, a pattern consistent across age and gender levels. MI inflated MMSE scores, but differences between imputed participants and those without missing data remained. A simulation model supported the efficacy of MI for estimating missing item-level data, although serious decrements in estimation occurred when 50% or more of item-level data were missing, particularly for the oldest participants.

Conclusions

Our adaptation of MI to obtain a probable estimate for missing MMSE item-level data provides a suitable method when the proportion of missing item-level data is not excessive.

4.
5.
Multiple imputation (MI) is one of the most popular methods for dealing with missing data, and its use has been increasing rapidly in medical studies. Although MI is appealing in practice, since ordinary complete-data statistical methods can be applied once the missing values are imputed, the method of imputation itself remains problematic. If the missing values are imputed from a parametric model, the validity of the imputation is not guaranteed, and the final estimate of a parameter of interest can be biased unless the parametric model is correctly specified. Nonparametric methods have also been proposed for MI, but it is not straightforward to produce imputation values from nonparametrically estimated distributions. In this paper, we propose a new MI method that yields a consistent (asymptotically unbiased) final estimate even if the imputation model is misspecified. The key idea is to use an imputation model from which imputation values are easily produced and then to make a proper correction in the likelihood function after imputation, using as a weight the density ratio between the imputation model and the true conditional density of the missing variable. Although this conditional density must be estimated nonparametrically, it is not used for the imputation itself. The performance of our method is evaluated both theoretically and through simulation studies. A real data analysis using the Duke Cardiac Catheterization Coronary Artery Disease Diagnostic Dataset is also conducted to illustrate the method.
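A schematic rendering of the correction described above, in our own notation rather than the paper's exact estimator: with imputations z_i^(m) drawn from a convenient model g, each imputed contribution to the log-likelihood is reweighted by the ratio of a nonparametric estimate f-hat of the true conditional density to g:

```latex
\hat{\theta} = \arg\max_{\theta} \left[
  \sum_{i \in \mathrm{obs}} \log f\left(y_i \mid x_i; \theta\right)
  + \sum_{i \in \mathrm{mis}} \frac{1}{M} \sum_{m=1}^{M}
      \frac{\hat{f}\left(z_i^{(m)} \mid x_i\right)}{g\left(z_i^{(m)} \mid x_i\right)}
      \, \log f\left(z_i^{(m)} \mid x_i; \theta\right)
\right]
```

The weights compensate for drawing imputations from the "wrong" model g, which is what allows consistency despite misspecification of the imputation model.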

6.
Objectives

In trial-based economic evaluation, some individuals typically have missing data at some time point, so their corresponding aggregated outcomes (eg, quality-adjusted life-years) cannot be evaluated. Restricting the analysis to the complete cases is inefficient and can result in biased estimates, while imputation methods are often implemented under a missing at random (MAR) assumption. We propose the use of joint longitudinal models to extend standard approaches by taking the longitudinal structure into account, improving the estimation of the targeted quantities under MAR.

Methods

We compare the results from methods that handle missingness at an aggregated (case deletion, baseline imputation, and joint aggregated models) and disaggregated (joint longitudinal models) level under MAR. The methods are compared using a simulation study and applied to data from 2 real case studies.

Results

Simulations show that, depending on which data affect the missingness process, aggregated methods may lead to biased results, while joint longitudinal models lead to valid inferences under MAR. The analysis of the 2 case studies supports these results, as both parameter estimates and cost-effectiveness results vary with the amount of data incorporated into the model.

Conclusions

Our analyses suggest that methods implemented at the aggregated level are potentially biased under MAR because they ignore the information in the partially observed follow-up data. This limitation can be overcome by extending the analysis to a longitudinal framework using joint models, which can incorporate all the available evidence.

7.
A post hoc analysis of data from a prospective cost-effectiveness analysis (CEA) conducted alongside a randomized controlled trial (the National Emphysema Treatment Trial, NETT) was used to assess the impact of different imputation methods for missing quality-of-life data on the estimation of the incremental cost-effectiveness ratio (ICER). The NETT compared lung-volume-reduction surgery plus medical therapy with medical therapy alone in patients with severe chronic obstructive pulmonary disease due to emphysema. One thousand sixty-six patients were followed for up to 3 years after randomization. The cost per quality-adjusted life-year gained was obtained, computing costs from a societal perspective and measuring quality of life with the self-administered Quality of Well-Being questionnaire. Different imputation methods resulted in substantial differences in ICERs, as well as differences in the estimated uncertainty around the point estimates as reflected in the CEA acceptability curves. Paradoxically, a conservative single imputation method yielded relatively less apparent uncertainty about the ICER, an anticonservative result. Given the effects of different imputation methods for missing quality-of-life data on the estimation of the ICER, we recommend using a minimum of two imputation methods, always including multiple imputation.
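For readers unfamiliar with the quantities involved, a minimal sketch with made-up numbers: the ICER is the ratio of incremental mean cost to incremental mean QALYs, and bootstrapping patients yields a cost-effectiveness acceptability curve, the probability the intervention is cost-effective at each willingness-to-pay (WTP) threshold:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical per-patient costs and QALYs for two trial arms.
cost_tx, qaly_tx = rng.normal(50_000, 8_000, 300), rng.normal(1.9, 0.4, 300)
cost_ctrl, qaly_ctrl = rng.normal(30_000, 6_000, 300), rng.normal(1.6, 0.4, 300)

icer = (cost_tx.mean() - cost_ctrl.mean()) / (qaly_tx.mean() - qaly_ctrl.mean())
print(f"ICER: {icer:,.0f} per QALY gained")

# Acceptability curve: P(incremental net benefit > 0) across WTP thresholds.
for wtp in (50_000, 100_000, 150_000):
    inb = []
    for _ in range(2_000):
        i = rng.integers(0, 300, 300)   # resample patients within each arm
        j = rng.integers(0, 300, 300)
        d_c = cost_tx[i].mean() - cost_ctrl[j].mean()
        d_q = qaly_tx[i].mean() - qaly_ctrl[j].mean()
        inb.append(wtp * d_q - d_c)
    print(f"WTP {wtp:>7,}: P(cost-effective) = {np.mean(np.array(inb) > 0):.2f}")
```

In the study above, the imputation method changes the cost and QALY inputs to this calculation, which is how it shifts both the ICER and the acceptability curve.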

8.
Methods of multiple imputation and the principles of its statistical inference
Objective: To describe the characteristics and patterns of missing data; to discuss the basic concepts of multiple imputation (MI) as first proposed by Rubin, the methods for imputing and analyzing missing data, and the combination of results for statistical inference; and to examine the features and limitations of MI and the points requiring attention when applying MI to incomplete datasets. Methods: Through computer simulation, each missing value was imputed with a set of plausible values using MI; conventional complete-data statistical methods were then applied to each of the resulting imputed datasets, and the results were combined. Results: The multiply imputed values reflected the uncertainty of the missing data and made full use of the observed data, yielding more accurate estimates of the population parameters. Conclusion: MI provides a useful strategy for handling datasets with missing values and is applicable to a wide range of missing data settings.
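The combination step described above follows Rubin's rules; a compact sketch with synthetic numbers (all values invented for illustration):

```python
import numpy as np

# Point estimates and variances from analyses of M = 5 imputed datasets
# (synthetic numbers for illustration only).
est = np.array([0.52, 0.48, 0.55, 0.50, 0.47])       # per-imputation estimates
var = np.array([0.010, 0.012, 0.011, 0.010, 0.013])  # per-imputation variances

M = len(est)
q_bar = est.mean()                    # pooled point estimate
w = var.mean()                        # within-imputation variance
b = est.var(ddof=1)                   # between-imputation variance
t = w + (1 + 1 / M) * b               # total variance (Rubin's rules)
df = (M - 1) * (1 + w / ((1 + 1 / M) * b)) ** 2  # Rubin's df approximation
print(f"pooled estimate {q_bar:.3f}, SE {t**0.5:.3f}, df {df:.1f}")
```

The between-imputation variance b is what encodes the uncertainty due to the missing data that the abstract emphasizes.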

9.
Objective

We compared popular methods to handle missing data with multiple imputation, a more sophisticated method that preserves the observed data.

Study Design and Setting

We used data from 804 patients with suspected deep venous thrombosis (DVT). We studied three covariates to predict the presence of DVT: d-dimer level, difference in calf circumference, and history of leg trauma. We introduced missing values (missing at random) ranging from 10% to 90%. The risk of DVT was modeled with logistic regression under three approaches: complete case analysis, exclusion of d-dimer level from the model, and multiple imputation.

Results

Multiple imputation showed less bias in the regression coefficients of the three variables and more accurate coverage of the corresponding 90% confidence intervals than complete case analysis or dropping d-dimer level from the analysis. Multiple imputation gave unbiased estimates of the area under the receiver operating characteristic curve (0.88) compared with complete case analysis (0.77) and dropping the variable with missing values (0.65).

Conclusion

Because simple methods for dealing with missing data can lead to seriously misleading results, we advise considering multiple imputation. The purpose of multiple imputation is not to create data, but to prevent the exclusion of observed data.

10.
Missing data due to loss to follow-up or intercurrent events are unintended, but unfortunately inevitable, in clinical trials. Since the true values of missing data are never known, it is necessary to assess the impact of untestable and unavoidable assumptions about unobserved data in sensitivity analysis. This tutorial provides an overview of controlled multiple imputation (MI) techniques and a practical guide to their use for sensitivity analysis of trials with missing continuous outcome data. These include δ-based and reference-based MI procedures. In δ-based imputation, an offset term, δ, is typically added to the expected value of the missing data to assess the impact of unobserved participants having a worse or better response than those observed. Reference-based imputation draws imputed values with reference to observed data in other groups of the trial, typically other treatment arms. We illustrate the accessibility of these methods using data from a pediatric eczema trial and a chronic headache trial, and provide Stata code to facilitate adoption. We discuss issues surrounding the choice of δ in δ-based sensitivity analysis. We also review the debate on variance estimation within reference-based analysis and justify the use of Rubin's variance estimator in this setting, since, as we elaborate further, it provides information-anchored inference.
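A small sketch of the δ-based idea under a toy normal model (not the tutorial's Stata procedure): impute under MAR, add an offset δ to the imputed values, and re-pool to see how the conclusion shifts as the assumed departure from MAR grows:

```python
import numpy as np

rng = np.random.default_rng(7)
y = rng.normal(10, 2, 100)           # outcome in one trial arm
miss = rng.uniform(size=100) < 0.3   # 30% missing

for delta in (0.0, -0.5, -1.0, -2.0):  # assumed departures from MAR
    means = []
    for _ in range(20):                # M = 20 imputations
        imp = y.copy()
        # Draw from the observed distribution (MAR), then shift by delta.
        imp[miss] = rng.normal(y[~miss].mean(), y[~miss].std(), miss.sum()) + delta
        means.append(imp.mean())
    print(f"delta = {delta:5.1f}: pooled mean = {np.mean(means):.2f}")
```

Plotting the pooled estimate against δ shows how large a departure from MAR would be needed to overturn the trial's conclusion.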

11.
Objective

Missing data are a pervasive problem, often leading to bias in complete records analysis (CRA). Multiple imputation (MI) via chained equations is one solution, but its use in the presence of interactions is not straightforward.

Study Design and Setting

We simulated data with outcome Y dependent on binary explanatory variables X and Z and their interaction XZ. Six scenarios were simulated (Y continuous and binary, each with no interaction, a weak interaction, and a strong interaction) under five missing data mechanisms. We used directed acyclic graphs to identify when CRA and MI would each be unbiased, and evaluated the performance of CRA, MI without interactions, MI including all interactions, and stratified imputation. We also illustrate these methods using a simple example from the National Child Development Study (NCDS).

Results

MI excluding interactions is invalid and resulted in biased estimates and low coverage. When the XZ effect was zero, MI excluding interactions gave unbiased estimates but overcoverage. MI including interactions and stratified MI gave equivalent, valid inference in all cases. In the NCDS example, MI excluding interactions incorrectly concluded that there was no evidence for an important interaction.

Conclusions

Epidemiologists carrying out MI should ensure that their imputation model(s) are compatible with their analysis model.
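A minimal sketch of stratified imputation with invented variables: imputing X separately within the strata of Z keeps the imputation model compatible with an analysis model containing the XZ interaction. This toy version draws from the within-stratum margin; a real chained-equations run would also condition on Y:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 1_000
z = rng.integers(0, 2, n)
x = (rng.uniform(size=n) < np.where(z == 1, 0.7, 0.3)).astype(float)
y = 1 + x + z + 2 * x * z + rng.normal(0, 1, n)  # analysis model has an XZ term
x[rng.uniform(size=n) < 0.25] = np.nan           # missingness in X
df = pd.DataFrame({"x": x, "z": z, "y": y})

# Stratified imputation: impute X separately within each stratum of Z, so the
# X-Z relationship (and hence the XZ interaction) survives imputation.
for stratum, part in df.groupby("z"):
    mis = part["x"].isna().to_numpy()
    p = part["x"].mean()                         # observed prevalence in stratum
    df.loc[part.index[mis], "x"] = rng.binomial(1, p, mis.sum()).astype(float)

print(df.groupby("z")["x"].mean())               # prevalence still differs by Z
```

Imputing X from a single model without the interaction would pull the two strata toward a common prevalence, which is the bias the study documents.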

12.
Purpose

Item non-response (i.e., missing data) may mask the detection of differential item functioning (DIF) in patient-reported outcome measures or result in biased DIF estimates. Non-response can be challenging to address in ordinal data. We investigated an unsupervised machine-learning method for ordinal item-level imputation and compared it with commonly used item non-response methods when testing for DIF.

Methods

Computer simulation and real-world data were used to assess several item non-response methods using the item response theory likelihood ratio test for DIF. The methods included: (a) list-wise deletion (LD), (b) half-mean imputation (HMI), (c) full information maximum likelihood (FIML), and (d) non-negative matrix factorization (NNMF), which adopts a machine-learning approach to impute missing values. Control of Type I error rates was evaluated using a liberal robustness criterion for α = 0.05 (i.e., 0.025–0.075). Statistical power was assessed with and without adoption of an item non-response method; differences > 10% were considered substantial.

Results

Type I error rates for detecting DIF using the LD, FIML, and NNMF methods were controlled within the bounds of the robustness criterion for > 95% of simulation conditions, although NNMF occasionally resulted in inflated rates. The HMI method always resulted in inflated error rates with 50% missing data. Differences in power to detect moderate DIF effects for the LD, FIML, and NNMF methods were substantial with 50% missing data and otherwise insubstantial.

Conclusion

The NNMF method demonstrated performance comparable to commonly used non-response methods. This computationally efficient method represents a promising approach for addressing item-level non-response when testing for DIF.

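A sketch of the core idea behind NNMF-based imputation, assuming nothing about the authors' implementation: factorize only the observed entries of the item matrix using mask-weighted multiplicative updates, then read imputed values off the low-rank reconstruction:

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical ordinal item matrix (500 respondents x 12 items, scores 0-4).
X = rng.integers(0, 5, size=(500, 12)).astype(float)
M = (rng.uniform(size=X.shape) > 0.2).astype(float)  # 1 = observed, 0 = missing

k, eps = 4, 1e-9                                     # latent rank, stabilizer
W = rng.uniform(0.1, 1.0, (X.shape[0], k))
H = rng.uniform(0.1, 1.0, (k, X.shape[1]))

# Mask-weighted multiplicative updates: only observed cells drive the fit.
for _ in range(200):
    WH = W @ H
    H *= (W.T @ (M * X)) / (W.T @ (M * WH) + eps)
    WH = W @ H
    W *= ((M * X) @ H.T) / ((M * WH) @ H.T + eps)

X_hat = np.clip(np.round(W @ H), 0, 4)   # map back to the ordinal score range
X_imputed = np.where(M == 1, X, X_hat)   # fill only the missing cells
```

Rounding and clipping the reconstruction is one simple way to return to the ordinal scale; the rank k would in practice be tuned, e.g., by cross-validation on held-out observed cells.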

13.
Objective

To illustrate the sequence of steps needed to develop and validate a clinical prediction model when missing predictor values have been multiply imputed.

Study Design and Setting

We used data from consecutive primary care patients suspected of deep venous thrombosis (DVT) to develop and validate a diagnostic model for the presence of DVT. Missing values were imputed 10 times with the MICE conditional imputation method. After selecting predictors and transformations for continuous predictors according to three different methods, we estimated regression coefficients and performance measures.

Results

The three methods for selecting predictors and transformations of continuous predictors showed similar results. Once predictors and transformations were selected, Rubin's rules could easily be applied to estimate regression coefficients and performance measures.

Conclusion

We provide a practical approach for model development and validation with multiply imputed data.

14.
Economic evaluations must use appropriate costing methods. However, in multicentre cost-effectiveness analyses (CEA), the fundamental issue of how best to measure and analyse unit costs has been neglected. Multicentre CEAs commonly take the mean unit cost from a national database, such as NHS reference costs. This approach does not recognise that unit costs vary across centres and are unavailable in some centres. This paper proposes the use of multiple imputation (MI) to predict those centre-specific unit costs that are not available, while recognising the statistical uncertainty surrounding the imputation. We illustrate MI with a CEA of a multicentre randomised trial (1014 patients, 60 centres), implemented using multilevel modelling. We use MI to derive centre-specific unit costs, based on centre characteristics including average casemix, and compare this with using mean NHS reference costs. In this case study, using MI unit costs rather than mean reference costs led to less heterogeneity across centres and more precise estimates of incremental cost, but similar estimates of incremental cost-effectiveness. We conclude that using MI to predict unit costs can preserve correlations and maximise the use of available data and, when combined with multilevel modelling, is an appropriate method for recognising the statistical uncertainty in multicentre CEA.

15.
Objective

The missing indicator method (MIM) and complete case analysis (CC) are frequently used to handle missing confounder data. Using empirical data, we demonstrated the degree and direction of bias in the effect estimate when using these methods compared with multiple imputation (MI).

Study Design and Setting

From a cohort study, we selected an exposure (marital status), an outcome (depression), and confounders (age, sex, and income). Missing values in income were created according to different patterns of missingness: completely at random, and depending on exposure and outcome values. Percentages of missing values ranged from 2.5% to 30%.

Results

When missing values were completely random, MIM overestimated the odds ratio, whereas CC and MI gave unbiased results. MIM and CC gave under- or overestimations when missing values depended on observed values. The magnitude and direction of bias depended on how the missing values were related to exposure and outcome, and bias increased with an increasing percentage of missing values.

Conclusion

MIM should not be used to handle missing confounder data because it biases the odds ratio unpredictably, even with small percentages of missing values. CC can be used when missing values are completely random, but it entails a loss of statistical power.
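For concreteness, a toy sketch of the missing indicator method's mechanics with invented variables (statsmodels assumed); note the study above concludes this method is biased, so the sketch only shows what the method does, not an endorsement:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n = 2_000
income = rng.normal(30, 10, n)                       # confounder with missingness
p_exp = 1 / (1 + np.exp(-(income - 30) / 10))
exposed = (rng.uniform(size=n) < p_exp).astype(int)
p_out = 1 / (1 + np.exp(-(-2 + 0.5 * exposed + 0.03 * income)))
outcome = (rng.uniform(size=n) < p_out).astype(int)
income[rng.uniform(size=n) < 0.2] = np.nan           # 20% missing, MCAR here

df = pd.DataFrame({"y": outcome, "exposed": exposed, "income": income})
df["inc_missing"] = df["income"].isna().astype(int)  # the missing indicator
df["income_filled"] = df["income"].fillna(0)         # arbitrary fill value

# Missing indicator method: keep all subjects, add the indicator as a covariate.
mim = smf.logit("y ~ exposed + income_filled + inc_missing", df).fit(disp=0)
print(np.exp(mim.params["exposed"]))                 # exposure OR under MIM
```

Comparing this odds ratio with the one from an MI analysis of the same data reproduces the kind of contrast the study reports.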

16.
Cost and effect data often suffer from missingness because economic evaluations are frequently added onto clinical studies in which cost data are rarely the primary outcome. The objective of this article was to investigate which multiple imputation strategy is most appropriate for missing cost-effectiveness data in a randomized controlled trial. Three incomplete datasets were generated from a complete reference dataset with 17, 35, and 50% missing data in effects and costs. The strategies evaluated included complete case analysis (CCA), multiple imputation with predictive mean matching (MI-PMM), MI-PMM on log-transformed costs (log MI-PMM), and a two-step MI. Mean cost and effect estimates, standard errors, and incremental net benefits were compared with the results of the analyses on the complete reference dataset. The CCA, MI-PMM, and two-step MI strategies diverged from the reference results as the amount of missing data increased. In contrast, the estimates of the log MI-PMM strategy remained stable irrespective of the amount of missing data. MI provided better estimates than CCA in all scenarios. With low amounts of missing data, the MI strategies appeared equivalent, but we recommend log MI-PMM when more than 35% of the data are missing.
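A bare-bones sketch of predictive mean matching on log-transformed costs, with all names and the single-predictor model invented for illustration: regress log cost on a covariate, find for each missing case the observed donors with the closest predicted values, and borrow an actually observed cost:

```python
import numpy as np

rng = np.random.default_rng(13)
n = 400
x = rng.normal(0, 1, n)                          # predictor of cost
cost = np.exp(8 + 0.6 * x + rng.normal(0, 0.5, n))
miss = rng.uniform(size=n) < 0.3                 # 30% missing costs

obs = ~miss
beta = np.polyfit(x[obs], np.log(cost[obs]), 1)  # linear fit on the log scale
pred = np.polyval(beta, x)                       # predicted log cost for everyone

imputed = cost.copy()
k = 5                                            # donor pool size
for i in np.where(miss)[0]:
    # k observed cases whose predicted log cost is nearest to case i's.
    donors = np.argsort(np.abs(pred[obs] - pred[i]))[:k]
    imputed[i] = cost[obs][rng.choice(donors)]   # donate a real observed cost
print(imputed[miss][:5])
```

Because PMM donates observed values, imputed costs stay non-negative and retain the skewness of real cost data, which is plausibly why the log-scale variant is the most stable strategy above.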

17.
In many large prospective cohorts, expensive exposure measurements cannot be obtained for all individuals. Exposure–disease association studies are therefore often based on nested case–control or case–cohort studies, in which complete information is obtained only for sampled individuals. In the full cohort, however, there may be a large amount of information on cheaply available covariates, and possibly a surrogate of the main exposure(s), which typically goes unused. We view the nested case–control or case–cohort study plus the remainder of the cohort as a full-cohort study with missing data. Hence, we propose using multiple imputation (MI) to utilise information in the full cohort when data from the sub-studies are analysed, fitting the imputation models to the fully observed data. We consider using approximate imputation models, and also using rejection sampling to draw imputed values from the true distribution of the missing values given the observed data. Simulation studies show that using MI to utilise full-cohort information in the analysis of nested case–control and case–cohort studies can result in important gains in efficiency, particularly when a surrogate of the main exposure is available in the full cohort. In simulations, this method outperforms counter-matching in nested case–control studies and a weighted analysis in case–cohort studies, both of which use some full-cohort information. Approximate imputation models perform well except when there are interactions or non-linear terms in the outcome model, where imputation using rejection sampling works well.
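A generic sketch of the rejection-sampling step mentioned above (not the authors' code; the target and proposal densities are toy choices): to draw imputations from a target conditional density f using a proposal g with f ≤ c·g, accept a draw z with probability f(z) / (c·g(z)):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(17)

# Toy target: a skewed "true" conditional distribution for the missing
# exposure; proposal: an exponential envelope with a heavier tail.
target = stats.gamma(a=2.0, scale=1.5)
proposal = stats.expon(scale=3.0)
c = 2.0  # envelope constant: f(z) <= c * g(z) holds for this pair

def draw_imputation() -> float:
    """Rejection sampling: propose from g, accept with prob f(z) / (c g(z))."""
    while True:
        z = proposal.rvs(random_state=rng)
        if rng.uniform() < target.pdf(z) / (c * proposal.pdf(z)):
            return float(z)

imputations = [draw_imputation() for _ in range(5)]
print(imputations)
```

The payoff is that accepted draws follow the true conditional distribution exactly, avoiding the bias an approximate imputation model can introduce when the outcome model has interactions or non-linear terms.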

18.
Multiple imputation (MI) is a commonly used technique for handling missing data in large-scale medical and public health studies. However, variable selection on multiply imputed data remains an important and longstanding statistical problem. If a variable selection method is applied to each imputed dataset separately, it may select different variables for different imputed datasets, making it difficult to interpret the final model or draw scientific conclusions. In this paper, we propose a novel multiple imputation–least absolute shrinkage and selection operator (MI-LASSO) variable selection method as an extension of the LASSO method to multiply imputed data. The MI-LASSO method treats the estimated regression coefficients of the same variable across all imputed datasets as a group and applies the group LASSO penalty to yield consistent variable selection across the imputed datasets. We use a simulation study to demonstrate the advantage of the MI-LASSO method compared with the alternatives. We also apply the MI-LASSO method to the University of Michigan Dioxin Exposure Study to identify important circumstances and exposure factors associated with human serum dioxin concentration in Midland, Michigan.
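In schematic form (our notation, following the group-penalty idea described above, not necessarily the paper's exact criterion): with D imputed datasets and coefficient β_j^(d) for variable j in dataset d, the coefficients of each variable are grouped across imputations,

```latex
\min_{\beta} \; \sum_{d=1}^{D} \left\| y^{(d)} - X^{(d)} \beta^{(d)} \right\|_2^2
\; + \; \lambda \sum_{j=1}^{p} \sqrt{ \sum_{d=1}^{D} \left( \beta_j^{(d)} \right)^2 }
```

so each variable is either selected in all imputed datasets or in none, which resolves the inconsistency of running the LASSO separately per dataset.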

19.
Cost-effectiveness analyses (CEA) conducted alongside randomised trials provide key evidence for informing healthcare decision making, but missing data pose substantial challenges. Recently, there have been a number of developments in methods and guidelines addressing missing data in trials, but it is unclear whether these developments have permeated CEA practice. This paper critically reviews the extent of, and the methods used to address, missing data in recently published trial-based CEA. Issues of the Health Technology Assessment journal from 2013 to 2015 were searched, and 52 eligible studies were identified. Missing data were very common; the median proportion of trial participants with complete cost-effectiveness data was 63% (interquartile range: 47%–81%). The most common approach for the primary analysis was to restrict the analysis to those with complete data (43%), followed by multiple imputation (30%). Half of the studies conducted some form of sensitivity analysis, but only 2 (4%) considered possible departures from the missing-at-random assumption. Further improvements are needed to address missing data in cost-effectiveness analyses conducted alongside randomised trials. These should focus on limiting the extent of missing data, choosing a primary analysis method that is valid under contextually plausible assumptions, and conducting sensitivity analyses to departures from the missing-at-random assumption.

20.
BACKGROUND AND OBJECTIVES: To illustrate the effects of different methods for handling missing data (complete case analysis, the missing-indicator method, single imputation of the unconditional and conditional mean, and multiple imputation, MI) in the context of multivariable diagnostic research aiming to identify potential predictors (test results) that independently contribute to the prediction of disease presence or absence. METHODS: We used data from 398 subjects from a prospective study on the diagnosis of pulmonary embolism. Various diagnostic predictors or tests had (varying percentages of) missing values. For each method of handling these missing values, we fitted a diagnostic prediction model using multivariable logistic regression analysis. RESULTS: The receiver operating characteristic curve area for all diagnostic models was above 0.75. The predictors in the final models based on complete case analysis and on the missing-indicator method differed markedly from those in the other models. The models based on MI did not differ much from the models derived after single conditional and unconditional mean imputation. CONCLUSION: In multivariable diagnostic research, complete case analysis and the missing-indicator method should be avoided, even when data are missing completely at random. MI methods are known to be superior to single imputation methods. In our example study the single imputation methods performed equally well, but this was most likely because of the low overall number of missing values.
