20 similar documents found (search time: 15 ms)
1.
2.
Baptiste Leurent Manuel Gomes Suzie Cro Nicola Wiles James R. Carpenter 《Health economics》2020,29(2):171-184
Missing data are a common issue in cost‐effectiveness analysis (CEA) alongside randomised trials and are often addressed assuming the data are ‘missing at random’. However, this assumption is often questionable, and sensitivity analyses are required to assess the implications of departures from missing at random. Reference‐based multiple imputation provides an attractive approach for conducting such sensitivity analyses, because missing data assumptions are framed in an intuitive way by making reference to other trial arms. For example, a plausible not at random mechanism in a placebo‐controlled trial would be to assume that participants in the experimental arm who dropped out stop taking their treatment and have similar outcomes to those in the placebo arm. Drawing on the increasing use of this approach in other areas, this paper aims to extend and illustrate the reference‐based multiple imputation approach in CEA. It introduces the principles of reference‐based imputation and proposes an extension to the CEA context. The method is illustrated in the CEA of the CoBalT trial evaluating cognitive behavioural therapy for treatment‐resistant depression. Stata code is provided. We find that reference‐based multiple imputation provides a relevant and accessible framework for assessing the robustness of CEA conclusions to different missing data assumptions.
3.
Catherine A. Welch Irene Petersen Jonathan W. Bartlett Ian R. White Louise Marston Richard W. Morris Irwin Nazareth Kate Walters James Carpenter 《Statistics in medicine》2014,33(21):3725-3737
Most implementations of multiple imputation (MI) of missing data are designed for simple rectangular data structures ignoring temporal ordering of data. Therefore, when applying MI to longitudinal data with intermittent patterns of missing data, some alternative strategies must be considered. One approach is to divide data into time blocks and implement MI independently at each block. An alternative approach is to include all time blocks in the same MI model. With increasing numbers of time blocks, this approach is likely to break down because of co‐linearity and over‐fitting. The new two‐fold fully conditional specification (FCS) MI algorithm addresses these issues, by only conditioning on measurements, which are local in time. We describe and report the results of a novel simulation study to critically evaluate the two‐fold FCS algorithm and its suitability for imputation of longitudinal electronic health records. After generating a full data set, approximately 70% of selected continuous and categorical variables were made missing completely at random in each of ten time blocks. Subsequently, we applied a simple time‐to‐event model. We compared efficiency of estimated coefficients from a complete records analysis, MI of data in the baseline time block and the two‐fold FCS algorithm. The results show that the two‐fold FCS algorithm maximises the use of data available, with the gain relative to baseline MI depending on the strength of correlations within and between variables. Using this approach also increases plausibility of the missing at random assumption by using repeated measures over time of variables whose baseline values may be missing. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
4.
Tim P. Morris Ian R. White Patrick Royston Shaun R. Seaman Angela M. Wood 《Statistics in medicine》2014,33(1):88-104
We are concerned with multiple imputation of the ratio of two variables, which is to be used as a covariate in a regression analysis. If the numerator and denominator are not missing simultaneously, it seems sensible to make use of the observed variable in the imputation model. One such strategy is to impute missing values for the numerator and denominator, or the log‐transformed numerator and denominator, and then calculate the ratio of interest; we call this ‘passive’ imputation. Alternatively, missing ratio values might be imputed directly, with or without the numerator and/or the denominator in the imputation model; we call this ‘active’ imputation. In two motivating datasets, one involving body mass index as a covariate and the other involving the ratio of total to high‐density lipoprotein cholesterol, we assess the sensitivity of results to the choice of imputation model and, as an alternative, explore fully Bayesian joint models for the outcome and incomplete ratio. Fully Bayesian approaches using Winbugs were unusable in both datasets because of computational problems. In our first dataset, multiple imputation results are similar regardless of the imputation model; in the second, results are sensitive to the choice of imputation model. Sensitivity depends strongly on the coefficient of variation of the ratio's denominator. A simulation study demonstrates that passive imputation without transformation is risky because it can lead to downward bias when the coefficient of variation of the ratio's denominator is larger than about 0.1. Active imputation or passive imputation after log‐transformation is preferable. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
5.
When missing data occur in one or more covariates in a regression model, multiple imputation (MI) is widely advocated as an improvement over complete‐case analysis (CC). We use theoretical arguments and simulation studies to compare these methods with MI implemented under a missing at random assumption. When data are missing completely at random, both methods have negligible bias, and MI is more efficient than CC across a wide range of scenarios. For other missing data mechanisms, bias arises in one or both methods. In our simulation setting, CC is biased towards the null when data are missing at random. However, when missingness is independent of the outcome given the covariates, CC has negligible bias and MI is biased away from the null. With more general missing data mechanisms, bias tends to be smaller for MI than for CC. Since MI is not always better than CC for missing covariate problems, the choice of method should take into account what is known about the missing data mechanism in a particular substantive application. Importantly, the choice of method should not be based on comparison of standard errors. We propose new ways to understand empirical differences between MI and CC, which may provide insights into the appropriateness of the assumptions underlying each method, and we propose a new index for assessing the likely gain in precision from MI: the fraction of incomplete cases among the observed values of a covariate (FICO). Copyright © 2010 John Wiley & Sons, Ltd.
6.
Jacques‐Emmanuel Galimard Sylvie Chevret Camelia Protopopescu Matthieu Resche‐Rigon 《Statistics in medicine》2016,35(17):2907-2920
Standard implementations of multiple imputation (MI) approaches provide unbiased inferences based on an assumption of underlying missing at random (MAR) mechanisms. However, in the presence of missing data generated by missing not at random (MNAR) mechanisms, MI is not satisfactory. Originating in an econometric statistical context, Heckman's model, also called the sample selection method, deals with selected samples using two joined linear equations, termed the selection equation and the outcome equation. It has been successfully applied to MNAR outcomes. Nevertheless, such a method only addresses missing outcomes, and this is a strong limitation in clinical epidemiology settings, where covariates are also often missing. We propose to extend the validity of MI to some MNAR mechanisms through the use of the Heckman's model as imputation model and a two‐step estimation process. This approach will provide a solution that can be used in an MI by chained equation framework to impute missing (either outcomes or covariates) data resulting either from a MAR or an MNAR mechanism when the MNAR mechanism is compatible with a Heckman's model. The approach is illustrated on a real dataset from a randomised trial in patients with seasonal influenza. Copyright © 2016 John Wiley & Sons, Ltd.
7.
Michelle Shardell Gregory E. Hicks Ram R. Miller Jay Magaziner 《Statistics in medicine》2010,29(22):2282-2296
We propose a semiparametric marginal modeling approach for longitudinal analysis of cohorts with data missing due to death and non‐response to estimate regression parameters interpreted as conditioned on being alive. Our proposed method accommodates outcomes and time‐dependent covariates that are missing not at random with non‐monotone missingness patterns via inverse‐probability weighting. Missing covariates are replaced by consistent estimates derived from a simultaneously solved inverse‐probability‐weighted estimating equation. Thus, we utilize data points with the observed outcomes and missing covariates beyond the estimated weights while avoiding numerical methods to integrate over missing covariates. The approach is applied to a cohort of elderly female hip fracture patients to estimate the prevalence of walking disability over time as a function of body composition, inflammation, and age. Copyright © 2010 John Wiley & Sons, Ltd.
8.
Missingness mechanism is in theory unverifiable based only on observed data. If there is a suspicion of missing not at random, researchers often perform a sensitivity analysis to evaluate the impact of various missingness mechanisms. In general, sensitivity analysis approaches require a full specification of the relationship between missing values and missingness probabilities. Such relationship can be specified based on a selection model, a pattern-mixture model or a shared parameter model. Under the selection modeling framework, we propose a sensitivity analysis approach using a nonparametric multiple imputation strategy. The proposed approach only requires specifying the correlation coefficient between missing values and selection (response) probabilities under a selection model. The correlation coefficient is a standardized measure and can be used as a natural sensitivity analysis parameter. The sensitivity analysis involves multiple imputations of missing values, yet the sensitivity parameter is only used to select imputing/donor sets. Hence, the proposed approach might be more robust against misspecifications of the sensitivity parameter. For illustration, the proposed approach is applied to incomplete measurements of level of preoperative Hemoglobin A1c, for patients who had high-grade carotid artery stenosis and were scheduled for surgery. A simulation study is conducted to evaluate the performance of the proposed approach.
9.
Linear regression is one of the most popular statistical techniques. In linear regression analysis, missing covariate data occur often. A recent approach to analyse such data is a weighted estimating equation. With weighted estimating equations, the contribution to the estimating equation from a complete observation is weighted by the inverse 'probability of being observed'. In this paper, we propose a weighted estimating equation in which we wrongly assume that the missing covariates are multivariate normal, but still produces consistent estimates as long as the probability of being observed is correctly modelled. In simulations, these weighted estimating equations appear to be highly efficient when compared to the most efficient weighted estimating equation as proposed by Robins et al. and Lipsitz et al. However, these weighted estimating equations, in which we wrongly assume that the missing covariates are multivariate normal, are much less computationally intensive than the weighted estimating equations given by Lipsitz et al. We compare the weighted estimating equations proposed in this paper to the efficient weighted estimating equations via an example and a simulation study. We only consider missing data which are missing at random; non-ignorably missing data are not addressed in this paper.
10.
Suzie Cro Tim P. Morris Michael G. Kenward James R. Carpenter 《Statistics in medicine》2020,39(21):2815-2842
Missing data due to loss to follow-up or intercurrent events are unintended, but unfortunately inevitable in clinical trials. Since the true values of missing data are never known, it is necessary to assess the impact of untestable and unavoidable assumptions about any unobserved data in sensitivity analysis. This tutorial provides an overview of controlled multiple imputation (MI) techniques and a practical guide to their use for sensitivity analysis of trials with missing continuous outcome data. These include δ- and reference-based MI procedures. In δ-based imputation, an offset term, δ, is typically added to the expected value of the missing data to assess the impact of unobserved participants having a worse or better response than those observed. Reference-based imputation draws imputed values with some reference to observed data in other groups of the trial, typically in other treatment arms. We illustrate the accessibility of these methods using data from a pediatric eczema trial and a chronic headache trial and provide Stata code to facilitate adoption. We discuss issues surrounding the choice of δ in δ-based sensitivity analysis. We also review the debate on variance estimation within reference-based analysis and justify the use of Rubin's variance estimator in this setting, since as we further elaborate on within, it provides information anchored inference.
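The δ-based idea in the abstract above can be illustrated with a minimal Python sketch. It is not the tutorial's Stata code. It assumes the values have already been imputed under missing at random and simply shifts the imputed entries by δ, which moves their expected value by the same amount; the function name `delta_adjust` and its arguments are hypothetical.

```python
def delta_adjust(completed, was_missing, delta):
    """Shift only the imputed entries of one completed dataset by delta.

    completed   : outcome values after imputation under MAR (assumed input)
    was_missing : booleans, True where the value was originally missing
    delta       : sensitivity offset; its sign and size are chosen by the analyst
    """
    if len(completed) != len(was_missing):
        raise ValueError("completed and was_missing must have the same length")
    # Observed values are left untouched; imputed values move by delta.
    return [y + delta if miss else y for y, miss in zip(completed, was_missing)]


# A sensitivity analysis would typically repeat the analysis over a grid of deltas.
for d in (0.0, -0.5, -1.0):
    print(d, delta_adjust([1.0, 2.0, 3.0], [False, True, False], d))
```

In a full analysis each of the m imputed datasets would be adjusted in this way before fitting the analysis model and pooling the results.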
11.
Missing covariate data present a challenge to tree-structured methodology due to the fact that a single tree model, as opposed to an estimated parameter value, may be desired for use in a clinical setting. To address this problem, we suggest a multiple imputation algorithm that adds draws of stochastic error to a tree-based single imputation method presented by Conversano and Siciliano (Technical Report, University of Naples, 2003). Unlike previously proposed techniques for accommodating missing covariate data in tree-structured analyses, our methodology allows the modeling of complex and nonlinear covariate structures while still resulting in a single tree model. We perform a simulation study to evaluate our stochastic multiple imputation algorithm when covariate data are missing at random and compare it to other currently used methods. Our algorithm is advantageous for identifying the true underlying covariate structure when complex data and larger percentages of missing covariate observations are present. It is competitive with other current methods with respect to prediction accuracy. To illustrate our algorithm, we create a tree-structured survival model for predicting time to treatment response in older, depressed adults.
12.
Robustness of a multivariate normal approximation for imputation of incomplete binary data
Multiple imputation has become easier to perform with the advent of several software packages that provide imputations under a multivariate normal model, but imputation of missing binary data remains an important practical problem. Here, we explore three alternative methods for converting a multivariate normal imputed value into a binary imputed value: (1) simple rounding of the imputed value to the nearer of 0 or 1, (2) a Bernoulli draw based on a 'coin flip' where an imputed value between 0 and 1 is treated as the probability of drawing a 1, and (3) an adaptive rounding scheme where the cut-off value for determining whether to round to 0 or 1 is based on a normal approximation to the binomial distribution, making use of the marginal proportions of 0's and 1's on the variable. We perform simulation studies on a data set of 206,802 respondents to the California Healthy Kids Survey, where the fully observed data on 198,262 individuals defines the population, from which we repeatedly draw samples with missing data, impute, calculate statistics and confidence intervals, and compare bias and coverage against the true values. Frequently, we found satisfactory bias and coverage properties, suggesting that approaches such as these that are based on statistical approximations are preferable in applied research to either avoiding settings where missing data occur or relying on complete-case analyses. Considering both the occurrence and extent of deficits in coverage, we found that adaptive rounding provided the best performance.
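The three conversion rules described in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation. Clipping values outside [0, 1] before the coin flip, treating a value of exactly 0.5 as rounding up, and the specific cut-off w − Φ⁻¹(w)·sqrt(w(1 − w)), with w the mean of the imputed values, are all assumptions here; the abstract only says the cut-off uses a normal approximation to the binomial. All function names are hypothetical.

```python
import random
from statistics import NormalDist, mean


def simple_round(values):
    """Rule (1): round each imputed value to the nearer of 0 or 1."""
    return [1 if v >= 0.5 else 0 for v in values]


def coin_flip(values, rng):
    """Rule (2): Bernoulli draw with the imputed value as P(1).

    Values outside [0, 1] are clipped first (an assumption of this sketch).
    """
    return [1 if rng.random() < min(max(v, 0.0), 1.0) else 0 for v in values]


def adaptive_round(values):
    """Rule (3): round against a data-dependent cut-off.

    Assumed cut-off: w - inv_cdf(w) * sqrt(w * (1 - w)), where w is the
    mean of the imputed values and inv_cdf is the standard normal quantile.
    """
    w = mean(values)
    if not 0.0 < w < 1.0:
        raise ValueError("mean of imputed values must lie strictly between 0 and 1")
    cutoff = w - NormalDist().inv_cdf(w) * (w * (1.0 - w)) ** 0.5
    return [1 if v >= cutoff else 0 for v in values]


imputed = [0.25, 0.75]
print(simple_round(imputed), adaptive_round(imputed), coin_flip(imputed, random.Random(1)))
```

When the mean of the imputed values is 0.5 the assumed cut-off is also 0.5, so adaptive rounding then coincides with simple rounding; the two differ as the marginal proportion moves away from one half.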
13.
Multiple imputation (MI) is a technique that can be used for handling missing data in a public-use dataset. With MI, two or more completed versions of the dataset are created, containing possibly different but reasonable replacements for the missing data. Users analyse the completed datasets separately with standard techniques and then combine the results using simple formulae in a way that allows the extra uncertainty due to missing data to be assessed. An advantage of this approach is that the resulting public-use data can be analysed by a variety of users for a variety of purposes, without each user needing to devise a method to deal with the missing data. A recent example for a large public-use dataset is the MI of the family income and personal earnings variables in the National Health Interview Survey. We propose an approach to utilise MI to handle the problems of missing gestational ages and implausible birthweight–gestational age combinations in national vital statistics datasets. This paper describes MI and gives examples of MI for public-use datasets, summarises methods that have been used for identifying implausible gestational age values on birth records, and combines these ideas by setting forth scenarios for identifying and then imputing missing and implausible gestational age values multiple times. Because missing and implausible gestational age values are not missing completely at random, using multiple imputations and, thus, incorporating both the existing relationships among the variables and the uncertainty added from the imputation, may lead to more valid inferences in some analytical studies than simply excluding birth records with inadequate data.
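The "simple formulae" for combining results mentioned in the abstract above are usually taken to be Rubin's rules. The sketch below assumes that reading and pools a scalar estimate from m completed datasets; the function name `pool_rubin` and the example numbers are hypothetical and do not come from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data results for a scalar parameter.

    estimates : point estimates, one per completed dataset
    variances : their squared standard errors, in the same order
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what carries the extra uncertainty due to the missing data: if every completed dataset gave the same estimate, the total variance would reduce to the average within-imputation variance.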
14.
Kaifeng Lu 《Statistics in medicine》2014,33(7):1134-1145
Pattern‐mixture models provide a general and flexible framework for sensitivity analyses of nonignorable missing data. The placebo‐based pattern‐mixture model (Little and Yau, Biometrics 1996; 52:1324–1333) treats missing data in a transparent and clinically interpretable manner and has been used as sensitivity analysis for monotone missing data in longitudinal studies. The standard multiple imputation approach (Rubin, Multiple Imputation for Nonresponse in Surveys, 1987) is often used to implement the placebo‐based pattern‐mixture model. We show that Rubin's variance estimate of the multiple imputation estimator of treatment effect can be overly conservative in this setting. As an alternative to multiple imputation, we derive an analytic expression of the treatment effect for the placebo‐based pattern‐mixture model and propose a posterior simulation or delta method for the inference about the treatment effect. Simulation studies demonstrate that the proposed methods provide consistent variance estimates and outperform the imputation methods in terms of power for the placebo‐based pattern‐mixture model. We illustrate the methods using data from a clinical study of major depressive disorders. Copyright © 2013 John Wiley & Sons, Ltd.
15.
Juned Siddique Ofer Harel Catherine M. Crespi Donald Hedeker 《Statistics in medicine》2014,33(17):3013-3028
The true missing data mechanism is never known in practice. We present a method for generating multiple imputations for binary variables, which formally incorporates missing data mechanism uncertainty. Imputations are generated from a distribution of imputation models rather than a single model, with the distribution reflecting subjective notions of missing data mechanism uncertainty. Parameter estimates and standard errors are obtained using rules for nested multiple imputation. Using simulation, we investigate the impact of missing data mechanism uncertainty on post‐imputation inferences and show that incorporating this uncertainty can increase the coverage of parameter estimates. We apply our method to a longitudinal smoking cessation trial where nonignorably missing data were a concern. Our method provides a simple approach for formalizing subjective notions regarding nonresponse and can be implemented using existing imputation software. Copyright © 2014 John Wiley & Sons, Ltd.
16.
We propose a propensity score-based multiple imputation (MI) method to tackle incomplete missing data resulting from drop-outs and/or intermittent skipped visits in longitudinal clinical trials with binary responses. The estimation and inferential properties of the proposed method are contrasted via simulation with those of the commonly used complete-case (CC) and generalized estimating equations (GEE) methods. Three key results are noted. First, if data are missing completely at random, MI can be notably more efficient than the CC and GEE methods. Second, with small samples, GEE often fails due to 'convergence problems', but MI is free of that problem. Finally, if the data are missing at random, while the CC and GEE methods yield results with moderate to large bias, MI generally yields results with negligible bias. A numerical example with real data is provided for illustration.
17.
In this paper, we consider fitting semiparametric additive hazards models for case‐cohort studies using a multiple imputation approach. In a case‐cohort study, main exposure variables are measured only on some selected subjects, but other covariates are often available for the whole cohort. We consider this as a special case of a missing covariate by design. We propose to employ a popular incomplete data method, multiple imputation, for estimation of the regression parameters in additive hazards models. For imputation models, an imputation modeling procedure based on a rejection sampling is developed. A simple imputation modeling that can naturally be applied to a general missing‐at‐random situation is also considered and compared with the rejection sampling method via extensive simulation studies. In addition, a misspecification aspect in imputation modeling is investigated. The proposed procedures are illustrated using a cancer data example. Copyright © 2015 John Wiley & Sons, Ltd.
18.
Evaluating model‐based imputation methods for missing covariates in regression models with interactions
Imputation strategies are widely used in settings that involve inference with incomplete data. However, implementation of a particular approach always rests on assumptions, and subtle distinctions between methods can have an impact on subsequent analyses. In this research article, we are concerned with regression models in which the true underlying relationship includes interaction terms. We focus in particular on a linear model with one fully observed continuous predictor, a second partially observed continuous predictor, and their interaction. We derive the conditional distribution of the missing covariate and interaction term given the observed covariate and the outcome variable, and examine the performance of a multiple imputation procedure based on this distribution. We also investigate several alternative procedures that can be implemented by adapting multivariate normal multiple imputation software in ways that might be expected to perform well despite incompatibilities between model assumptions and true underlying relationships among the variables. The methods are compared in terms of bias, coverage, and CI width. As expected, the procedure based on the correct conditional distribution performs well across all scenarios. Just as importantly for general practitioners, several of the approaches based on multivariate normality perform comparably with the correct conditional distribution in a number of circumstances, although interestingly, procedures that seek to preserve the multiplicative relationship between the interaction term and the main‐effects are found to be substantially less reliable. For illustration, the various procedures are applied to an analysis of post‐traumatic stress disorder symptoms in a study of childhood trauma. Copyright © 2015 John Wiley & Sons, Ltd.
19.
Addressing Missing Data in Patient‐Reported Outcome Measures (PROMS): Implications for the Use of PROMS for Comparing Provider Performance
Patient‐reported outcome measures (PROMs) are now routinely collected in the English National Health Service and used to compare and reward hospital performance within a high‐powered pay‐for‐performance scheme. However, PROMs are prone to missing data. For example, hospitals often fail to administer the pre‐operative questionnaire at hospital admission, or patients may refuse to participate or fail to return their post‐operative questionnaire. A key concern with missing PROMs is that the individuals with complete information tend to be an unrepresentative sample of patients within each provider and inferences based on the complete cases will be misleading. This study proposes a strategy for addressing missing data in the English PROM survey using multiple imputation techniques and investigates its impact on assessing provider performance. We find that inferences about relative provider performance are sensitive to the assumptions made about the reasons for the missing data. © 2015 The Authors. Health Economics Published by John Wiley & Sons Ltd.
20.
Tim P. Morris Ian R. White James R. Carpenter Simon J. Stanworth Patrick Royston 《Statistics in medicine》2015,34(25):3298-3317
Multivariable fractional polynomial (MFP) models are commonly used in medical research. The datasets in which MFP models are applied often contain covariates with missing values. To handle the missing values, we describe methods for combining multiple imputation with MFP modelling, considering in turn three issues: first, how to impute so that the imputation model does not favour certain fractional polynomial (FP) models over others; second, how to estimate the FP exponents in multiply imputed data; and third, how to choose between models of differing complexity. Two imputation methods are outlined for different settings. For model selection, methods based on Wald‐type statistics and weighted likelihood‐ratio tests are proposed and evaluated in simulation studies. The Wald‐based method is very slightly better at estimating FP exponents. Type I error rates are very similar for both methods, although slightly less well controlled than analysis of complete records; however, there is potential for substantial gains in power over the analysis of complete records. We illustrate the two methods in a dataset from five trauma registries for which a prognostic model has previously been published, contrasting the selected models with that obtained by analysing the complete records only. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.