Similar Articles
Retrieved 20 similar articles (search time: 15 ms)
1.
There is often a need to assess the dependence of standard analyses on the strong untestable assumption of ignorable missingness. To tackle this problem, past research developed simple sensitivity index measures assuming a linear impact of nonignorability and missingness in outcomes only. These restrictions limit their applicability for studies with missingness in both outcome and covariates. Nonignorable missingness in this setting poses significant new analytic challenges and calls for more general and flexible methods that are also computationally tractable even for large datasets. In this paper, we relax the restrictions of extant linear sensitivity index methods and develop nonlinear sensitivity indices that maintain computational simplicity and perform equally well when the impact of nonignorability is locally linear. On the other hand, they can substantially improve the effectiveness of local sensitivity analysis when regression outcomes and covariates are subject to concurrent missingness. In this situation, the local linear sensitivity analysis fails to detect the impact of nonignorability while the proposed nonlinear sensitivity measures can. Because the new sensitivity indices avoid fitting complicated nonignorable models, they are computationally tractable (i.e., scalable) for use in large datasets. We develop general formulas for nonlinear sensitivity index measures, and evaluate the new measures in simulated data and a real dataset collected using the ecological momentary assessment method. Copyright © 2016 John Wiley & Sons, Ltd.
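The distinction between linear and nonlinear sensitivity indices can be illustrated with a toy numeric sketch (the function `beta_hat` below is hypothetical, standing in for refitting a model at each nonignorability value; it is not the paper's actual estimator): when the local impact of nonignorability is quadratic, a first-order index is zero while a second-order index detects it.

```python
# Hypothetical sketch: a linear (first-order) sensitivity index can miss
# nonignorability whose local impact is quadratic near the MAR model,
# while a second-order (nonlinear) index detects it.

def beta_hat(gamma):
    # Stand-in for refitting the model at nonignorability parameter gamma;
    # here the impact of nonignorability is purely quadratic at gamma = 0.
    return 1.0 + 0.5 * gamma ** 2

h = 1e-4
# Linear index: central first difference at the MAR model (gamma = 0)
isni1 = (beta_hat(h) - beta_hat(-h)) / (2 * h)
# Nonlinear index: central second difference at gamma = 0
isni2 = (beta_hat(h) - 2 * beta_hat(0.0) + beta_hat(-h)) / h ** 2

print(round(isni1, 6), round(isni2, 2))  # 0.0 1.0
```

The linear index reports no sensitivity at all, while the second-order index recovers the curvature (0.5 × 2 = 1.0), mirroring the setting the abstract describes where only the nonlinear measure detects nonignorability.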

2.
Nonignorable missing data poses key challenges for estimating treatment effects because the substantive model may not be identifiable without imposing further assumptions. For example, the Heckman selection model has been widely used for handling nonignorable missing data but requires correct assumptions both about the joint distribution of the missingness and outcome and about the validity of an exclusion restriction. Recent studies have revisited how alternative selection model approaches, for example estimated by multiple imputation (MI) and maximum likelihood, relate to Heckman-type approaches in addressing the first hurdle. However, the extent to which these different selection models rely on the exclusion restriction assumption with nonignorable missing data is unclear. Motivated by an interventional study (REFLUX) with nonignorable missing outcome data in half of the sample, this article critically examines the role of the exclusion restriction in Heckman, MI, and full-likelihood selection models when addressing nonignorability. We explore the implications of the different methodological choices concerning the exclusion restriction for relative bias and root-mean-squared error in estimating treatment effects. We find that the relative performance of the methods differs in practically important ways according to the relevance and strength of the exclusion restriction. The full-likelihood approach is less sensitive to alternative assumptions about the exclusion restriction than Heckman-type models and appears an appropriate method for handling nonignorable missing data. We illustrate the implications of method choice for inference in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life for patients with gastro-oesophageal reflux disease.
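The Heckman-type correction at the heart of this comparison can be sketched with toy numbers (all values below are illustrative assumptions, not the REFLUX analysis): the outcome mean among observed cases equals the regression mean plus a selection-bias term built from the inverse Mills ratio of the selection index.

```python
# Illustrative sketch of the Heckman selection correction (toy numbers):
# E[Y | observed] = X*beta + rho*sigma*lambda(Z*gamma), where lambda is
# the inverse Mills ratio phi/Phi of the selection index.
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def inverse_mills(z):
    # lambda(z) = phi(z) / Phi(z): the selection-bias correction term
    return norm_pdf(z) / norm_cdf(z)

# Selection index Z*gamma for an observed subject; the exclusion
# restriction requires Z to contain a variable that shifts this index
# but does not enter the outcome model.
z_gamma = 0.5
lam = inverse_mills(z_gamma)
x_beta, rho, sigma = 2.0, 0.6, 1.5  # hypothetical outcome mean and error terms
expected_observed_outcome = x_beta + rho * sigma * lam
print(round(lam, 4), round(expected_observed_outcome, 4))
```

When rho = 0 (missingness unrelated to the outcome errors) the correction term vanishes and the observed-case mean is unbiased, which is why the method's performance hinges on the selection-model assumptions the abstract discusses.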

3.
In longitudinal studies with potentially nonignorable drop-out, one can assess the likely effect of the nonignorability in a sensitivity analysis. Troxel et al. proposed a general index of sensitivity to nonignorability, or ISNI, to measure sensitivity of key inferences in a neighbourhood of the ignorable, missing at random (MAR) model. They derived detailed formulas for ISNI in the special case of the generalized linear model with a potentially missing univariate outcome. In this paper, we extend the method to longitudinal modelling. We use a multivariate normal model for the outcomes and a regression model for the drop-out process, allowing missingness probabilities to depend on an unobserved response. The computation is straightforward, and merely involves estimating a mixed-effects model and a selection model for the drop-out, together with some simple arithmetic calculations. We illustrate the method with three examples.

4.
In this paper we consider longitudinal studies in which the outcome to be measured over time is binary, and the covariates of interest are categorical. In longitudinal studies it is common for the outcomes and any time-varying covariates to be missing due to missed study visits, resulting in non-monotone patterns of missingness. Moreover, the reasons for missed visits may be related to the specific values of the response and/or covariates that should have been obtained, i.e. missingness is non-ignorable. With non-monotone non-ignorable missing response and covariate data, a full likelihood approach is quite complicated, and maximum likelihood estimation can be computationally prohibitive when there are many occasions of follow-up. Furthermore, the full likelihood must be correctly specified to obtain consistent parameter estimates. We propose a pseudo-likelihood method for jointly estimating the covariate effects on the marginal probabilities of the outcomes and the parameters of the missing data mechanism. The pseudo-likelihood requires specification of the marginal distributions of the missingness indicator, outcome, and possibly missing covariates at each occasion, but avoids making assumptions about the joint distribution of the data at two or more occasions. Thus, the proposed method can be considered semi-parametric. The proposed method is an extension of the pseudo-likelihood approach in Troxel et al. to handle binary responses and possibly missing time-varying covariates. The method is illustrated using data from the Six Cities study, a longitudinal study of the health effects of air pollution.

5.
We study the problem of estimation and inference on the average treatment effect in a smoking cessation trial where an outcome and some auxiliary information were measured longitudinally, and both were subject to missing values. Dynamic generalized linear mixed effects models linking the outcome, the auxiliary information, and the covariates are proposed. The maximum likelihood approach is applied to the estimation and inference on the model parameters. The average treatment effect is estimated by the G‐computation approach, and the sensitivity of the treatment effect estimate to the nonignorable missing data mechanisms is investigated through the local sensitivity analysis approach. The proposed approach can handle missing data that form arbitrary missing patterns over time. We applied the proposed method to the analysis of the smoking cessation trial. Copyright © 2009 John Wiley & Sons, Ltd.
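The G-computation step can be sketched on simulated toy data (the data-generating model below is an assumption for illustration, not the trial's dynamic mixed model): fit an outcome regression, then average predictions with treatment set to 1 versus 0 over the observed covariate distribution.

```python
# Minimal G-computation sketch on toy data: standardize outcome-model
# predictions over the covariate distribution under each treatment level.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                              # baseline covariate
a = rng.integers(0, 2, size=n)                      # treatment indicator
y = 1.0 + 2.0 * a + 0.5 * x + rng.normal(size=n)    # true effect = 2.0

# Outcome regression E[Y | A, X] via least squares
X = np.column_stack([np.ones(n), a, x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# G-computation: predict for everyone treated vs everyone control, then average
X1 = np.column_stack([np.ones(n), np.ones(n), x])
X0 = np.column_stack([np.ones(n), np.zeros(n), x])
ate = np.mean(X1 @ beta) - np.mean(X0 @ beta)
print(round(ate, 2))  # close to the true effect of 2.0
```

In this linear toy case the G-computation estimate collapses to the fitted treatment coefficient; with the nonlinear mixed models of the abstract, the averaging over covariates is what makes the estimate a marginal treatment effect.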

6.
Many clinical or prevention studies involve missing or censored outcomes. Maximum likelihood (ML) methods provide a conceptually straightforward approach to estimation when the outcome is partially missing. Methods of implementing ML methods range from the simple to the complex, depending on the type of data and the missing-data mechanism. Simple ML methods for ignorable missing-data mechanisms (when data are missing at random) include complete-case analysis, complete-case analysis with covariate adjustment, survival analysis with covariate adjustment, and analysis via propensity-to-be-missing scores. More complex ML methods for ignorable missing-data mechanisms include the analysis of longitudinal dropouts via a marginal model for continuous data or a conditional model for categorical data. A moderately complex ML method for categorical data with a saturated model and either ignorable or nonignorable missing-data mechanisms is a perfect fit analysis, an algebraic method involving closed-form estimates and variances. A complex and flexible ML method with categorical data and either ignorable or nonignorable missing-data mechanisms is the method of composite linear models, a matrix method requiring specialized software. Except for the method of composite linear models, which can involve challenging matrix specifications, the implementation of these ML methods ranges in difficulty from easy to moderate.

7.
In longitudinal studies, subjects may be lost to follow up and, thus, present incomplete response sequences. When the mechanism underlying the dropout is nonignorable, we need to account for dependence between the longitudinal and the dropout process. We propose to model such a dependence through discrete latent effects, which are outcome‐specific and account for heterogeneity in the univariate profiles. Dependence between profiles is introduced by using a probability matrix to describe the corresponding joint distribution. In this way, we separately model dependence within each outcome and dependence between outcomes. The major feature of this proposal, when compared with standard finite mixture models, is that it allows the nonignorable dropout model to properly nest its ignorable counterpart. We also discuss the use of an index of (local) sensitivity to nonignorability to investigate the effects that assumptions about the dropout process may have on model parameter estimates. The proposal is illustrated via the analysis of data from a longitudinal study on the dynamics of cognitive functioning in the elderly.

8.
Missing outcome data are commonly encountered in randomized controlled trials and hence may need to be addressed in a meta‐analysis of multiple trials. A common and simple approach to deal with missing data is to restrict analysis to individuals for whom the outcome was obtained (complete case analysis). However, estimated treatment effects from complete case analyses are potentially biased if informative missing data are ignored. We develop methods for estimating meta‐analytic summary treatment effects for continuous outcomes in the presence of missing data for some of the individuals within the trials. We build on a method previously developed for binary outcomes, which quantifies the degree of departure from a missing at random assumption via the informative missingness odds ratio. Our new model quantifies the degree of departure from missing at random using either an informative missingness difference of means or an informative missingness ratio of means, both of which relate the mean value of the missing outcome data to that of the observed data. We propose estimating the treatment effects, adjusted for informative missingness, and their standard errors by a Taylor series approximation and by a Monte Carlo method. We apply the methodology to examples of both pairwise and network meta‐analysis with multi‐arm trials. © 2014 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
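The informative missingness difference of means (IMDoM) adjustment reduces to simple arithmetic per arm, which a toy two-arm example makes concrete (all numbers below are hypothetical): the arm mean is a missingness-weighted average of the observed mean and an assumed shifted mean for the missing participants.

```python
# Toy sketch of the informative missingness difference of means (IMDoM):
# missing participants are assumed to differ from observed ones by delta,
# so E[Y] = (1 - p) * observed_mean + p * (observed_mean + delta).
def arm_mean(observed_mean, prop_missing, delta):
    return observed_mean + prop_missing * delta

# Hypothetical two-arm trial: 20% missing in each arm, and dropouts
# assumed to score 1 point worse (delta = -1) than completers.
treat = arm_mean(observed_mean=10.0, prop_missing=0.2, delta=-1.0)
control = arm_mean(observed_mean=8.0, prop_missing=0.2, delta=-1.0)
print(round(treat - control, 6))  # 2.0
```

With equal delta and missingness fractions in both arms the treatment difference is unchanged; a sensitivity analysis varies delta (or its arm-specific values) to see how far the summary effect moves, with the standard error propagated by the Taylor-series or Monte Carlo methods the abstract proposes.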

9.
We present a case study in the analysis of the prognostic effects of anaemia and other covariates on the local recurrence of head and neck cancer in patients who have been treated with radiation therapy. Because it is believed that a large fraction of the patients are cured by the therapy, we use a failure time mixture model for the outcomes, which simultaneously models both the relationship of the covariates to cure and the relationship of the covariates to local recurrence times for subjects who are not cured. A problematic feature of the data is that two covariates of interest have missing values, so that only 75 per cent of the subjects have complete data. We handle the missing-data problem by jointly modelling the covariates and the outcomes, and then fitting the model to all of the data, including the incomplete cases. We compare our approach to two traditional methods for handling missingness, that is, complete-case analysis and the use of an indicator variable for missingness. The comparison with complete-case analysis demonstrates gains in efficiency for joint modelling as well as sensitivity of some results to the method used to handle missing data. The use of an indicator variable yields results that are very similar to those from joint modelling for our data. We also compare the results obtained for the mixture model with results obtained for a standard (non-mixture) survival model. It is seen that the mixture model separates out effects in a way that is not possible with a standard survival model. In particular, conditional on other covariates, we find strong evidence of an association between anaemia and cure, whereas the evidence of an association between anaemia and time to local recurrence for patients who are not cured is weaker.

10.
Quality-of-life (QOL) is an important outcome in clinical research, particularly in cancer clinical trials. Typically, data are collected longitudinally from patients during treatment and subsequent follow-up. Missing data are a common problem, and missingness may arise in a non-ignorable fashion. In particular, the probability that a patient misses an assessment may depend on the patient's QOL at the time of the scheduled assessment. We propose a Markov chain model for the analysis of categorical outcomes derived from QOL measures. Our model assumes that transitions between QOL states depend on covariates through generalized logit models or proportional odds models. To account for non-ignorable missingness, we incorporate logistic regression models for the conditional probabilities of observing measurements, given their actual values. The model can accommodate time-dependent covariates. Estimation is by maximum likelihood, summing over all possible values of the missing measurements. We describe options for selecting parsimonious models, and we study the finite-sample properties of the estimators by simulation. We apply the techniques to data from a breast cancer clinical trial in which QOL assessments were made longitudinally, and in which missing data frequently arose.

11.
We propose a semiparametric marginal modeling approach for longitudinal analysis of cohorts with data missing due to death and non‐response to estimate regression parameters interpreted as conditioned on being alive. Our proposed method accommodates outcomes and time‐dependent covariates that are missing not at random with non‐monotone missingness patterns via inverse‐probability weighting. Missing covariates are replaced by consistent estimates derived from a simultaneously solved inverse‐probability‐weighted estimating equation. Thus, we utilize data points with the observed outcomes and missing covariates beyond the estimated weights while avoiding numerical methods to integrate over missing covariates. The approach is applied to a cohort of elderly female hip fracture patients to estimate the prevalence of walking disability over time as a function of body composition, inflammation, and age. Copyright © 2010 John Wiley & Sons, Ltd.
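The inverse-probability-weighting idea behind this approach can be shown in miniature (the outcome values and response probabilities below are hypothetical, and in practice the probabilities come from a fitted missingness model): each observed outcome is up-weighted by one over its probability of being observed, so the weighted mean targets the full cohort rather than just responders.

```python
# Inverse-probability-weighting sketch with toy numbers: observed outcomes
# are weighted by 1 / P(observed) to undo selective non-response.
observed_y = [3.0, 5.0, 4.0]
p_observed = [0.9, 0.5, 0.8]  # assumed response probabilities from a fitted model

weights = [1.0 / p for p in p_observed]
ipw_mean = sum(w * y for w, y in zip(weights, observed_y)) / sum(weights)
print(round(ipw_mean, 3))
```

The subject with the lowest response probability (0.5) gets the largest weight, pulling the weighted mean toward outcomes that are under-represented among responders; the paper's estimating equations apply the same principle jointly to outcomes and missing covariates.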

12.
Missing covariates in regression analysis are a pervasive problem in medical, social, and economic research. We study empirical-likelihood confidence regions for unconstrained and constrained regression parameters in a nonignorable covariate-missing data problem. For an assumed conditional mean regression model, we assume that some covariates are fully observed but other covariates are missing for some subjects. By exploiting a probability model of missingness and a working conditional score model from a semiparametric perspective, we build a system of unbiased estimating equations, where the number of equations exceeds the number of unknown parameters. Based on the proposed estimating equations, we introduce unconstrained and constrained empirical-likelihood ratio statistics to construct empirical-likelihood confidence regions for the underlying regression parameters without and with constraints. We establish the asymptotic distributions of the proposed empirical-likelihood ratio statistics. Simulation results show that the proposed empirical-likelihood methods have a better finite-sample performance than other competitors in terms of coverage probability and interval length. Finally, we apply the proposed empirical-likelihood methods to the analysis of a data set from the US National Health and Nutrition Examination Survey.

13.
Pattern‐mixture models provide a general and flexible framework for sensitivity analyses of nonignorable missing data in longitudinal studies. The delta‐adjusted pattern‐mixture models handle missing data in a clinically interpretable manner and have been used as sensitivity analyses addressing the effectiveness hypothesis, while a likelihood‐based approach that assumes data are missing at random is often used as the primary analysis addressing the efficacy hypothesis. We describe a method for power calculations for delta‐adjusted pattern‐mixture model sensitivity analyses in confirmatory clinical trials. To apply the method, we only need to specify the pattern probabilities at postbaseline time points, the expected treatment differences at postbaseline time points, the conditional covariance matrix of postbaseline measurements given the baseline measurement, and the delta‐adjustment method for the pattern‐mixture model. We use an example to illustrate and compare various delta‐adjusted pattern‐mixture models and use simulations to confirm the analytic results. Copyright © 2014 John Wiley & Sons, Ltd.
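The delta adjustment itself is a simple post-imputation shift, which a toy sketch makes explicit (the imputed values and delta below are hypothetical): impute under MAR first, then shift the dropouts' imputed values by delta to encode a clinically worse post-dropout course.

```python
# Sketch of a delta adjustment in a pattern-mixture sensitivity analysis:
# MAR-based imputations for dropouts are shifted by an assumed delta.
mar_imputed = [12.0, 14.5, 13.0]  # hypothetical MAR imputations for dropouts
delta = -2.0                      # assumed post-dropout penalty

delta_adjusted = [y + delta for y in mar_imputed]
print(delta_adjusted)  # [10.0, 12.5, 11.0]
```

Because delta enters the analysis only through this shift, the power calculation the abstract describes needs just the pattern probabilities, treatment differences, covariance structure, and the chosen delta-adjustment rule, rather than a full nonignorable model.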

14.
In clinical settings, missing data in the covariates occur frequently. For example, some markers are expensive or hard to measure. When this sort of data is used for model selection, the missingness is often resolved through a complete case analysis or a form of single imputation. An alternative sometimes comes in the form of leaving the most damaged covariates out. All these strategies jeopardise the goal of model selection. In earlier work, we have applied the logistic Lasso in combination with multiple imputation to obtain results in such settings, but we only provided heuristic arguments to advocate the method. In this paper, we propose an improved method that builds on firm statistical arguments and that is developed along the lines of the stochastic expectation–maximisation algorithm. We show that our method can be used to handle missing data in both categorical and continuous predictors, as well as in a nonpenalised regression. We demonstrate the method by applying it to data of 273 lung cancer patients. The objective is to select a model for the prediction of acute dysphagia, starting from a large set of potential predictors, including clinical and treatment covariates as well as a set of single‐nucleotide polymorphisms. Copyright © 2013 John Wiley & Sons, Ltd.

15.
Missing outcome data is a crucial threat to the validity of treatment effect estimates from randomized trials. The outcome distributions of participants with missing and observed data are often different, which increases bias. Causal inference methods may aid in reducing the bias and improving efficiency by incorporating baseline variables into the analysis. In particular, doubly robust estimators incorporate two nuisance parameters: the outcome regression and the missingness mechanism (ie, the probability of missingness conditional on treatment assignment and baseline variables), to adjust for differences in the observed and unobserved groups that can be explained by observed covariates. To consistently estimate the treatment effect, one of these nuisance parameters must be consistently estimated. Traditionally, nuisance parameters are estimated using parametric models, which often precludes consistency, particularly in moderate to high dimensions. Recent research on missing data has focused on data‐adaptive estimation to help achieve consistency, but the large sample properties of such methods are poorly understood. In this article, we discuss a doubly robust estimator that is consistent and asymptotically normal under data‐adaptive estimation of the nuisance parameters. We provide a formula for an asymptotically exact confidence interval under minimal assumptions. We show that our proposed estimator has smaller finite‐sample bias compared to standard doubly robust estimators. We present a simulation study demonstrating the enhanced performance of our estimators in terms of bias, efficiency, and coverage of the confidence intervals. We present the results of an illustrative example: a randomized, double‐blind phase 2/3 trial of antiretroviral therapy in HIV‐infected persons.
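A standard doubly robust (AIPW-style) estimator for a mean with missing outcomes can be sketched on simulated toy data (the data-generating model below is assumed for illustration, and the nuisance functions are taken as known rather than data-adaptively estimated as in the paper): combine an outcome-regression prediction with an inverse-probability-weighted residual correction.

```python
# Doubly robust (AIPW) sketch for a mean with missing outcomes: the
# estimate is consistent if either the outcome regression m(x) or the
# observation probability p(x) is correctly specified (toy data below).
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)           # full-data mean of Y is 2.0
p = 1.0 / (1.0 + np.exp(-(0.5 + x)))       # P(observed | x), known here
r = rng.random(n) < p                      # observation indicator

m = 2.0 + x                                # correct outcome regression E[Y | x]
# AIPW: mean of m(x) plus inverse-probability-weighted residuals
aipw = np.mean(m + r * (y - m) / p)
print(round(aipw, 1))  # near 2.0
```

The residual term has mean zero whenever p(x) is correct, and the m(x) term anchors the estimate whenever the outcome regression is correct; the paper's contribution concerns keeping valid confidence intervals when both nuisances are fit data-adaptively.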

16.
An efficient monotone data augmentation (MDA) algorithm is proposed for missing data imputation for incomplete multivariate nonnormal data that may contain variables of different types and are modeled by a sequence of regression models including the linear, binary logistic, multinomial logistic, proportional odds, Poisson, negative binomial, skew-normal, skew-t regressions, or a mixture of these models. The MDA algorithm is applied to the sensitivity analyses of longitudinal trials with nonignorable dropout using the controlled pattern imputations that assume the treatment effect reduces or disappears after subjects in the experimental arm discontinue the treatment. We also describe a heuristic approach to implement the controlled imputation, in which the fully conditional specification method is used to impute the intermediate missing data to create a monotone missing pattern, and the missing data after dropout are then imputed according to the assumed nonignorable mechanisms. The proposed methods are illustrated by simulation and real data analyses. Sample SAS code for the analyses is provided in the supporting information.

17.
Trauma is a term used in medicine for describing physical injury. The prospective evaluation of the care of injured patients aims to improve the management of a trauma system and acts as an ongoing audit of trauma care. One of the principal techniques used to evaluate the effectiveness of trauma care at different hospitals is through a comparative outcome analysis. In such an analysis, a national 'league table' can be compiled to determine which hospitals are better at managing trauma care. One of the problems with the conventional analysis is that key covariates for measuring physiological injury can often be missing. It is also hypothesized that this missingness is not missing at random (NMAR). We describe the methods used to assess the performance of hospitals in a trauma setting and implement the method of weights for generalized linear models to account for the missing covariate data, when we suspect the missing data mechanism is NMAR, using a Monte Carlo EM algorithm. Through simulation work and application to the trauma data we demonstrate the effect that missing covariate data can have on the performance of hospitals and how the conclusions we draw from the analysis can differ. We highlight the differences in hospital performance and the ranking of hospitals.

18.
Li J, Yang X, Wu Y, Shoptaw S. Statistics in Medicine 2007, 26(12): 2519-2532
In biomedical research with longitudinal designs, missing values due to intermittent non-response or premature withdrawal are usually 'non-ignorable' in the sense that unobserved values are related to the patterns of missingness. Drawing on the framework of a shared-parameter mechanism, the process yielding the repeated count measures and the process yielding missing values can be modelled separately, conditionally on a group of shared parameters. For chronic diseases, Markov transition models can be used to study the transitional features of the pathologic processes. In this paper, Markov Chain Monte Carlo algorithms are developed to fit a random-effects Markov transition model for incomplete count repeated measures, within which random effects are shared by the counting process and the missing-data mechanism. Assuming a Poisson distribution for the count measures, the transition probabilities are estimated using a Poisson regression model. The missingness mechanism is modelled with a multinomial-logit regression to calculate the transition probabilities of the missingness indicators. The method is demonstrated using both simulated data sets and a practical data set from a smoking cessation clinical trial.

19.
We studied bias due to missing exposure data in the proportional hazards regression model when using complete-case analysis (CCA). Eleven missing data scenarios were considered: one with missing completely at random (MCAR), four missing at random (MAR), and six non-ignorable missingness scenarios, with a variety of hazard ratios, censoring fractions, missingness fractions and sample sizes. When missingness was MCAR or dependent only on the exposure, there was negligible bias (2-3 per cent) that was similar to the difference between the estimate in the full data set with no missing data and the true parameter. In contrast, substantial bias occurred when missingness was dependent on outcome or both outcome and exposure. For models with hazard ratio of 3.5, a sample size of 400, 20 per cent censoring and 40 per cent missing data, the relative bias for the hazard ratio ranged between 7 per cent and 64 per cent. We observed important differences in the direction and magnitude of biases under the various missing data mechanisms. For example, in scenarios where missingness was associated with longer or shorter follow-up, the biases were notably different, although both mechanisms are MAR. The hazard ratio was underestimated (with larger bias) when missingness was associated with longer follow-up and overestimated (with smaller bias) when associated with shorter follow-up. If it is known that missingness is associated with a less frequently observed outcome or with both the outcome and exposure, CCA may result in an invalid inference and other methods for handling missing data should be considered.
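The qualitative pattern, negligible CCA bias when missingness depends only on the exposure but substantial bias when it depends on the outcome, can be reproduced in a simplified linear-regression analogue (a toy substitute for the paper's Cox-model simulations, with an assumed data-generating model):

```python
# Simplified analogue of the finding above (linear regression, not the Cox
# model): complete-case analysis is ~unbiased when exposure missingness
# depends only on the exposure, but biased when it depends on the outcome.
import numpy as np

rng = np.random.default_rng(2)
n = 20000
x = rng.normal(size=n)
y = 1.0 + 1.0 * x + rng.normal(size=n)   # true slope = 1.0

def cc_slope(keep):
    # Complete-case slope from regressing y on x among retained subjects
    return np.polyfit(x[keep], y[keep], 1)[0]

# Retention depends on the exposure x only: CCA ~unbiased
keep_x = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
# Retention depends on the outcome y: CCA slope attenuated
keep_y = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * (y - 1.0)))

s_mar = cc_slope(keep_x)
s_mnar = cc_slope(keep_y)
print(round(s_mar, 2), round(s_mnar, 2))
```

Selecting on the outcome makes the residual mean among complete cases depend on the exposure, biasing the slope toward zero, the same mechanism that drives the hazard-ratio biases reported in the abstract.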

20.
Propensity score models are frequently used to estimate causal effects in observational studies. One unresolved issue in fitting these models is handling missing values in the propensity score model covariates. As these models usually contain a large set of covariates, using only individuals with complete data significantly decreases the sample size and statistical power. Several missing data imputation approaches have been proposed, including multiple imputation (MI), MI with missingness pattern (MIMP), and treatment mean imputation. Generalized boosted modeling (GBM), which is a nonparametric approach to estimate propensity scores, can automatically handle missingness in the covariates. Although the performance of MI, MIMP, and treatment mean imputation has previously been compared for binary treatments, they have not been compared for continuous exposures or with single imputation and GBM. We compared these approaches in estimating the generalized propensity score (GPS) for a continuous exposure in both a simulation study and in empirical data. Using GBM with the incomplete data to estimate the GPS did not perform well in the simulation. Missing values should be imputed before estimating propensity scores using GBM or any other approach for estimating the GPS.
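For a continuous exposure, the GPS is a conditional density rather than a probability, which a toy normal-linear sketch illustrates (the exposure model and its coefficients below are assumptions for illustration, not the paper's fitted models):

```python
# Generalized propensity score sketch for a continuous exposure: the GPS
# is the conditional density of the received exposure given covariates,
# here from an assumed normal linear exposure model.
import math

def gps(exposure, x, alpha0=0.0, alpha1=1.0, sigma=1.0):
    # r(t, x) = Normal(alpha0 + alpha1 * x, sigma^2) density evaluated at t
    mu = alpha0 + alpha1 * x
    z = (exposure - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# A subject whose exposure equals its model-predicted mean attains the
# maximum density, 1/sqrt(2*pi) when sigma = 1.
print(round(gps(exposure=1.5, x=1.5), 4))  # 0.3989
```

Because the GPS requires evaluating this density at each subject's covariates, missing covariate values must be filled in first, which is why the abstract recommends imputing before estimating the GPS with GBM or any other approach.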
