20 related records retrieved.
1.
Catherine A. Welch, Irene Petersen, Jonathan W. Bartlett, Ian R. White, Louise Marston, Richard W. Morris, Irwin Nazareth, Kate Walters, James Carpenter. Statistics in Medicine 2014;33(21):3725-3737
Most implementations of multiple imputation (MI) of missing data are designed for simple rectangular data structures that ignore the temporal ordering of the data. Therefore, when applying MI to longitudinal data with intermittent patterns of missing data, alternative strategies must be considered. One approach is to divide the data into time blocks and implement MI independently at each block. An alternative approach is to include all time blocks in the same MI model. With increasing numbers of time blocks, this approach is likely to break down because of collinearity and over-fitting. The new two-fold fully conditional specification (FCS) MI algorithm addresses these issues by conditioning only on measurements that are local in time. We describe and report the results of a novel simulation study to critically evaluate the two-fold FCS algorithm and its suitability for imputation of longitudinal electronic health records. After generating a full data set, approximately 70% of selected continuous and categorical variables were made missing completely at random in each of ten time blocks. Subsequently, we applied a simple time-to-event model. We compared the efficiency of estimated coefficients from a complete-records analysis, MI of data in the baseline time block, and the two-fold FCS algorithm. The results show that the two-fold FCS algorithm maximises the use of the available data, with the gain relative to baseline MI depending on the strength of correlations within and between variables. Using this approach also increases the plausibility of the missing at random assumption by using repeated measures over time of variables whose baseline values may be missing. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
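To make the block-wise idea concrete, the minimal sketch below imputes each time block of a wide-format longitudinal data set while conditioning only on variables from the neighbouring blocks, in the spirit of the two-fold FCS approach. It is a single-imputation illustration rather than the full MI procedure, and the column names, the window width of one block, and the use of scikit-learn's IterativeImputer are assumptions for illustration.

```python
# Single-imputation sketch of the block-wise idea: each time block is imputed while
# conditioning only on variables from the adjacent blocks, not on the whole record.
# Column names (sbp_t, bmi_t) and the +/-1 block window are illustrative assumptions;
# the full algorithm would repeat this to produce multiple imputed data sets.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n, T = 200, 10
data = {}
for t in range(T):
    data[f"sbp_{t}"] = 120 + 0.5 * t + rng.normal(0, 10, n)
    data[f"bmi_{t}"] = 27 + 0.1 * t + rng.normal(0, 2, n)
df = pd.DataFrame(data)
df_miss = df.mask(rng.random(df.shape) < 0.7)   # roughly 70% missing in every block

imputed = df_miss.copy()
for t in range(T):
    # Condition only on measurements that are local in time (blocks t-1, t, t+1).
    window = [c for s in range(max(0, t - 1), min(T, t + 2))
              for c in (f"sbp_{s}", f"bmi_{s}")]
    block = IterativeImputer(max_iter=10, random_state=t).fit_transform(df_miss[window])
    block = pd.DataFrame(block, columns=window, index=df_miss.index)
    imputed[[f"sbp_{t}", f"bmi_{t}"]] = block[[f"sbp_{t}", f"bmi_{t}"]]
```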
2.
Longitudinal studies with repeated measures are often subject to non-response. Methods currently employed to alleviate the difficulties caused by missing data are typically unsatisfactory, especially when the cause of the missingness is related to the outcomes. We present an approach for incomplete categorical data in the repeated measures setting that allows missing data to depend on other observed outcomes for a study subject. The proposed methodology also allows a broader examination of study findings through interpretation of results in the framework of the set of all possible test statistics that might have been observed had no data been missing. The proposed approach consists of the following general steps. First, we generate all possible sets of missing values and form a set of possible complete data sets. We then weight each data set according to clearly defined assumptions and apply an appropriate statistical test procedure to each data set, combining the results to give an overall indication of significance. We make use of the EM algorithm and a Bayesian prior in this approach. While not restricted to the one-sample case, the proposed methodology is illustrated for one-sample data and compared to the common complete-case and available-case analysis methods.
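A simplified illustration of those general steps for a one-sample paired binary outcome is sketched below: enumerate every completion of the missing cells, weight each completed data set by its probability under a model fitted to the observed values, apply a test to each completion, and combine the results using the weights. The naive independence weighting stands in for the paper's EM/Bayesian step, and the toy data and choice of McNemar's test are assumptions.

```python
# Enumerate-all-completions sketch for a one-sample paired binary outcome. The independence
# weighting below is an illustrative simplification of the paper's EM/Bayesian weighting.
from itertools import product
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Paired binary outcomes (visit 1, visit 2); None marks a missing value.
data = [(1, 1), (1, 0), (0, 0), (1, None), (None, 0), (1, 1), (0, 1), (None, None)]

# Marginal response probabilities estimated from the observed values only.
p1 = np.mean([a for a, _ in data if a is not None])
p2 = np.mean([b for _, b in data if b is not None])

missing = [(i, j) for i, row in enumerate(data) for j in (0, 1) if row[j] is None]
stats, weights = [], []
for fill in product([0, 1], repeat=len(missing)):
    completed = [list(row) for row in data]
    w = 1.0
    for (i, j), v in zip(missing, fill):
        completed[i][j] = v
        p = p1 if j == 0 else p2
        w *= p if v == 1 else 1 - p
    table = np.zeros((2, 2))                 # 2x2 table of (visit 1, visit 2) counts
    for a, b in completed:
        table[a, b] += 1
    stats.append(mcnemar(table, exact=False).statistic)
    weights.append(w)

print("weighted McNemar statistic:", np.average(stats, weights=weights))
```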
3.
Value in Health 2021;24(12):1720-1727
Objectives: Studies face challenges with missing 5-level EQ-5D (EQ-5D-5L) data, often because of the need for longitudinal EQ-5D-5L data collection. There is a dearth of validated methodologies for dealing with missing EQ-5D-5L data in the literature. This study, for the first time, examined the possibility of using retrospectively collected EQ-5D-5L data as proxies for the missing data. Methods: Participants who had prospectively completed a 3rd-month postdischarge EQ-5D-5L instrument (in-the-moment collection) were randomly interviewed to respond to a 2nd, "retrospective" collection of their 3rd-month EQ-5D-5L at the 6th, 9th, or 12th month after hospital discharge. A longitudinal single imputation was also performed so that the retrospective collection could be compared with an imputation-based alternative. Concordance between the in-the-moment, retrospective, and imputed measures was assessed using intraclass correlation coefficients and weighted kappa statistics. Results: Considerable agreement was observed on the basis of weighted kappa (range 0.72-0.95) between the mobility, self-care, and usual activities dimensions of EQ-5D-5L collected in-the-moment and retrospectively. Concordance based on intraclass correlation coefficients was good to excellent (range 0.79-0.81) for the utility indices and excellent (range 0.93-0.96) for quality-adjusted life-years computed using in-the-moment compared with retrospective EQ-5D-5L. The longitudinal single imputation did not perform as well as the retrospective collection method. Conclusions: This study demonstrates that retrospective collection of EQ-5D-5L has high concordance with "in-the-moment" EQ-5D-5L and could be a valid and attractive alternative for data imputation when longitudinally collected EQ-5D-5L data are missing. Future studies examining this method for other disease areas and populations are required to provide more generalizable evidence.
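The concordance measures used above can be computed as in the sketch below: a weighted kappa for one EQ-5D-5L dimension and an intraclass correlation for the utility index. The quadratic kappa weighting, the use of the pingouin package for the ICC, and all variable names are assumptions for illustration; the data are simulated stand-ins for paired in-the-moment and retrospective responses.

```python
# Sketch of the concordance checks: weighted kappa (quadratic weights assumed) for an
# ordinal dimension and an ICC for the continuous utility index (via pingouin, assumed).
import numpy as np
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(1)
n = 150
mobility_now = rng.integers(1, 6, n)                                   # in-the-moment, levels 1-5
mobility_retro = np.clip(mobility_now + rng.integers(-1, 2, n), 1, 5)  # retrospective recall
kappa = cohen_kappa_score(mobility_now, mobility_retro, weights="quadratic")
print("weighted kappa (mobility):", round(kappa, 2))

utility_now = rng.normal(0.7, 0.15, n)
utility_retro = utility_now + rng.normal(0, 0.05, n)
long = pd.DataFrame({
    "subject": np.tile(np.arange(n), 2),
    "method": ["in_the_moment"] * n + ["retrospective"] * n,
    "utility": np.concatenate([utility_now, utility_retro]),
})
icc = pg.intraclass_corr(data=long, targets="subject", raters="method", ratings="utility")
print(icc[["Type", "ICC"]])
```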
4.
The recent biostatistical literature contains a number of methods for handling the bias caused by 'informative censoring', which refers to drop-out from a longitudinal study after a number of visits scheduled at predetermined intervals. The same or related methods can be extended to situations where the missing pattern is intermittent. The pattern of missingness is often assumed to be related to the outcome through random effects which represent unmeasured individual characteristics such as health awareness. To date there is only limited experience with applying the methods for informative censoring in practice, mostly because of complicated modelling and difficult computations. In this paper, we propose an estimation method based on grouping the data. The proposed estimator is asymptotically unbiased in various situations under informative missingness. Several existing methods are reviewed and compared in simulation studies. We apply the methods to data from the Wisconsin Diabetes Registry Project, a longitudinal study tracking glycaemic control and acute and chronic complications from the diagnosis of type I diabetes.
5.
Alejandro Salazar, Begoña Ojeda, María Dueñas, Fernando Fernández, Inmaculada Failde. Statistics in Medicine 2016;35(19):3424-3448
Missing data are a common problem in clinical and epidemiological research, especially in longitudinal studies. Despite many methodological advances in recent decades, many papers on clinical trials and epidemiological studies either do not report using principled statistical methods to accommodate missing data or rely on ineffective or inappropriate techniques. Two refined techniques are presented here: generalized estimating equations (GEEs) and weighted generalized estimating equations (WGEEs). These techniques are an extension of generalized linear models to longitudinal or clustered data, where observations are no longer independent. They can appropriately handle missing data when the missingness is completely at random (GEE and WGEE) or at random (WGEE) and do not require the outcome to be normally distributed. Our aim is to describe these techniques for handling missing data in longitudinal studies subject to dropout, illustrate them with a real example in a way that is simple and accessible to researchers, and show how to implement them in R. We apply them to assess the evolution of health-related quality of life in coronary patients in a data set subject to dropout. Copyright © 2016 John Wiley & Sons, Ltd.
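The paper implements these analyses in R; a rough Python analogue using statsmodels is sketched below. Passing inverse probabilities of being observed (from a logistic dropout model) as case weights is a simplified stand-in for a full WGEE, it assumes a statsmodels version whose GEE accepts a weights argument, and all variable names and the simulated data are assumptions.

```python
# GEE on observed records, unweighted and with inverse-probability-of-observation weights
# (a simplified WGEE-style analysis; variable names are hypothetical).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.genmod.cov_struct import Exchangeable

rng = np.random.default_rng(2)
n, T = 200, 4
df = pd.DataFrame({"id": np.repeat(np.arange(n), T), "time": np.tile(np.arange(T), n)})
subj = np.repeat(rng.normal(0, 5, n), T)                      # subject-level heterogeneity
df["hrqol"] = 60 - 2 * df["time"] + subj + rng.normal(0, 5, len(df))
# Missingness that depends on the previous observed outcome (missing at random).
df["prev"] = df.groupby("id")["hrqol"].shift(1).fillna(df["hrqol"].mean())
df["observed"] = (rng.random(len(df)) < 1 / (1 + np.exp(-(df["prev"] - 55) / 5))).astype(int)

# Step 1: model the probability of being observed and form inverse-probability weights.
obs_model = smf.logit("observed ~ time + prev", data=df).fit(disp=0)
df["ipw"] = 1.0 / obs_model.predict(df)

# Step 2: fit GEE on the observed records, without and with the weights.
obs = df[df["observed"] == 1]
gee = sm.GEE.from_formula("hrqol ~ time", groups="id", data=obs,
                          family=sm.families.Gaussian(), cov_struct=Exchangeable()).fit()
wgee = sm.GEE.from_formula("hrqol ~ time", groups="id", data=obs,
                           family=sm.families.Gaussian(), cov_struct=Exchangeable(),
                           weights=obs["ipw"]).fit()
print(gee.params, wgee.params, sep="\n")
```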
6.
Michelle Shardell, Gregory E. Hicks, Ram R. Miller, Jay Magaziner. Statistics in Medicine 2010;29(22):2282-2296
We propose a semiparametric marginal modeling approach for longitudinal analysis of cohorts with data missing due to death and non-response, in order to estimate regression parameters interpreted as conditional on being alive. Our proposed method accommodates outcomes and time-dependent covariates that are missing not at random with non-monotone missingness patterns via inverse-probability weighting. Missing covariates are replaced by consistent estimates derived from a simultaneously solved inverse-probability-weighted estimating equation. Thus, we make use of data points with observed outcomes and missing covariates beyond the estimated weights, while avoiding numerical methods to integrate over the missing covariates. The approach is applied to a cohort of elderly female hip fracture patients to estimate the prevalence of walking disability over time as a function of body composition, inflammation, and age. Copyright © 2010 John Wiley & Sons, Ltd.
7.
Random-coefficient pattern-mixture models (RCPMMs) have been proposed for longitudinal data when drop-out is thought to be non-ignorable. An RCPMM is a random-effects model with summaries of drop-out time included among the regressors. The basis of every RCPMM is extrapolation. We review RCPMMs, describe various extrapolation strategies, and show how analyses may be simplified through multiple imputation. Using simulated and real data, we show that alternative RCPMMs that fit equally well may lead to very different estimates for parameters of interest. We also show that minor model misspecification can introduce biases that are quite large relative to standard errors, even in fairly small samples. For many scientific applications, where the form of the population model and nature of the drop-out are unknown, interval estimates from any single RCPMM may suffer from undercoverage because uncertainty about model specification is not taken into account.
8.
Missing data are a very common problem in medical and social studies, especially when data are collected longitudinally, and it is challenging to use the observed data effectively. Many papers on missing data problems can be found in the statistical literature. It is well known that inverse-probability-weighted estimation is neither efficient nor robust. On the other hand, the doubly robust (DR) method can improve both efficiency and robustness. DR estimation requires a missing data model (i.e., a model for the probability that data are observed) and a working regression model (i.e., a model for the outcome variable given covariates and surrogate variables). Because the DR estimating function has mean zero for any parameters in the working regression model when the missing data model is correctly specified, in this paper we derive a formula for the estimator of the parameters of the working regression model that yields the optimally efficient estimator of the marginal mean model (the parameters of interest) when the missing data model is correctly specified. Furthermore, the proposed method also inherits the DR property. Simulation studies demonstrate the greater efficiency of the proposed method compared with the standard DR method. A longitudinal dementia data set is used for illustration. Copyright © 2013 John Wiley & Sons, Ltd.
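For context, the standard doubly robust (augmented IPW) estimator of a marginal mean that this work refines is sketched below on simulated data. The paper's contribution, choosing the working-model parameters to minimise the variance of this estimator, is not reproduced; the simulated data and model choices are assumptions.

```python
# Minimal numerical sketch of a standard doubly robust (augmented IPW) estimator of a
# marginal mean, combining an estimated probability of being observed (pi) with a working
# outcome regression (m). Simulated data; not the paper's optimally efficient variant.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
y = 2 + 1.5 * x + rng.normal(size=n)                 # true marginal mean is 2
p_obs = 1 / (1 + np.exp(-(0.5 + x)))                 # observation depends on x (MAR)
r = rng.random(n) < p_obs

X = x.reshape(-1, 1)
pi = LogisticRegression().fit(X, r).predict_proba(X)[:, 1]   # missing-data model
m = LinearRegression().fit(X[r], y[r]).predict(X)            # working regression model

mu_ipw = np.mean(r * y / pi)                         # inverse-probability weighting only
mu_dr = np.mean(r * y / pi - (r - pi) / pi * m)      # doubly robust (augmented) estimator
print("IPW:", round(mu_ipw, 3), "DR:", round(mu_dr, 3), "truth: 2.0")
```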
9.
In longitudinal studies, missing observations occur commonly. It is well known that biased results can be produced if missingness is not properly handled in the analysis. Many methods have been developed with a focus on either incomplete responses or missing covariate observations, but rarely on both. The complexity of modeling and the computational difficulty are the major challenges in handling missingness in both response and covariate variables. In this paper, we develop methods using the pairwise likelihood formulation to handle longitudinal binary data with missing observations present in both response and covariate variables. We propose a unified framework to accommodate various types of missing data patterns. We evaluate the performance of the methods empirically under a variety of circumstances; in particular, we investigate issues of efficiency and robustness. We analyze longitudinal data from the National Population Health Study using our methods. Copyright © 2012 John Wiley & Sons, Ltd.
10.
Objectives: In trial-based economic evaluation, some individuals are typically associated with missing data at some time point, so that their corresponding aggregated outcomes (eg, quality-adjusted life-years) cannot be evaluated. Restricting the analysis to the complete cases is inefficient and can result in biased estimates, while imputation methods are often implemented under a missing at random (MAR) assumption. We propose the use of joint longitudinal models to extend standard approaches by taking into account the longitudinal structure to improve the estimation of the targeted quantities under MAR. Methods: We compare the results from methods that handle missingness at an aggregated (case deletion, baseline imputation, and joint aggregated models) and disaggregated (joint longitudinal models) level under MAR. The methods are compared using a simulation study and applied to data from 2 real case studies. Results: Simulations show that, depending on which data affect the missingness process, aggregated methods may lead to biased results, while joint longitudinal models lead to valid inferences under MAR. The analysis of the 2 case studies supports these results, as both parameter estimates and cost-effectiveness results vary based on the amount of data incorporated into the model. Conclusions: Our analyses suggest that methods implemented at the aggregated level are potentially biased under MAR as they ignore the information from the partially observed follow-up data. This limitation can be overcome by extending the analysis to a longitudinal framework using joint models, which can incorporate all the available evidence.
11.
In longitudinal studies with potentially nonignorable drop-out, one can assess the likely effect of the nonignorability in a sensitivity analysis. Troxel et al. proposed a general index of sensitivity to nonignorability, or ISNI, to measure sensitivity of key inferences in a neighbourhood of the ignorable, missing at random (MAR) model. They derived detailed formulas for ISNI in the special case of the generalized linear model with a potentially missing univariate outcome. In this paper, we extend the method to longitudinal modelling. We use a multivariate normal model for the outcomes and a regression model for the drop-out process, allowing missingness probabilities to depend on an unobserved response. The computation is straightforward, and merely involves estimating a mixed-effects model and a selection model for the drop-out, together with some simple arithmetic calculations. We illustrate the method with three examples.
12.
Longitudinal observational studies provide rich opportunities to examine treatment effectiveness during the course of a chronic illness. However, there are threats to the validity of observational inferences. For instance, clinician judgment and self-selection play key roles in treatment assignment. To account for this, an adjustment such as the propensity score can be used if certain assumptions are fulfilled. Here, we consider a problem that could surface in a longitudinal observational study and has been largely overlooked. It can occur when subjects have a varying number of distinct periods of therapeutic intervention. We evaluate the implications of baseline variables in the propensity model being associated with the number of post-baseline observations per subject and refer to it as 'covariate-dependent representation'. An observational study of antidepressant treatment effectiveness serves as a motivating example. The analyses examine the first 20 years of follow-up data from the National Institute of Mental Health Collaborative Depression Study, a longitudinal, observational study. A simulation study evaluates the consequences of covariate-dependent representation in longitudinal observational studies of treatment effectiveness under a range of data specifications. The simulations found that estimates were adversely affected by underrepresentation when there was lower ICC among repeated doses and among repeated outcomes. Copyright © 2012 John Wiley & Sons, Ltd.
13.
Kaifeng Lu. Statistics in Medicine 2015;34(5):782-795
Pattern-mixture models provide a general and flexible framework for sensitivity analyses of nonignorable missing data in longitudinal studies. The delta-adjusted pattern-mixture models handle missing data in a clinically interpretable manner and have been used as sensitivity analyses addressing the effectiveness hypothesis, while a likelihood-based approach that assumes data are missing at random is often used as the primary analysis addressing the efficacy hypothesis. We describe a method for power calculations for delta-adjusted pattern-mixture model sensitivity analyses in confirmatory clinical trials. To apply the method, we only need to specify the pattern probabilities at postbaseline time points, the expected treatment differences at postbaseline time points, the conditional covariance matrix of postbaseline measurements given the baseline measurement, and the delta-adjustment method for the pattern-mixture model. We use an example to illustrate and compare various delta-adjusted pattern-mixture models and use simulations to confirm the analytic results. Copyright © 2014 John Wiley & Sons, Ltd.
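A toy sketch of the delta-adjustment idea itself (not the paper's power-calculation method) is given below: impute post-dropout outcomes under MAR, shift the imputed values in the treated arm by delta, fit the analysis model on each imputed data set, and pool with Rubin's rules. The endpoint, delta = 2, M = 20 imputations, and the use of scikit-learn's IterativeImputer are illustrative assumptions.

```python
# Delta-adjusted sensitivity analysis sketch: MAR imputation, then penalise imputed values
# in the treated arm by delta, then pool the per-imputation estimates with Rubin's rules.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(4)
n = 300
trt = rng.integers(0, 2, n)
base = rng.normal(20, 4, n)
final = base + 2 * trt - 1 + rng.normal(0, 3, n)
dropout = rng.random(n) < 0.3
df = pd.DataFrame({"trt": trt, "base": base,
                   "final": np.where(dropout, np.nan, final)})

delta, M = 2.0, 20
estimates, variances = [], []
for m in range(M):
    imp = IterativeImputer(sample_posterior=True, random_state=m)
    completed = pd.DataFrame(imp.fit_transform(df), columns=df.columns)
    # Delta adjustment: shift imputed (post-dropout) values in the treated arm.
    completed.loc[dropout & (trt == 1), "final"] -= delta
    fit = smf.ols("final ~ base + trt", data=completed).fit()
    estimates.append(fit.params["trt"])
    variances.append(fit.bse["trt"] ** 2)

qbar = np.mean(estimates)                                          # Rubin's rules
tvar = np.mean(variances) + (1 + 1 / M) * np.var(estimates, ddof=1)
print("delta-adjusted treatment effect:", round(qbar, 2), "SE:", round(np.sqrt(tvar), 2))
```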
14.
Longitudinal cohort studies often collect both repeated measurements of longitudinal outcomes and times to clinical events whose occurrence precludes further longitudinal measurements. Although joint modeling of the clinical events and the longitudinal data can be used to provide valid statistical inference for target estimands in certain contexts, the application of joint models in medical literature is currently rather restricted because of the complexity of the joint models and the intensive computation involved. We propose a multiple imputation approach to jointly impute missing data of both the longitudinal and clinical event outcomes. With complete imputed datasets, analysts are then able to use simple and transparent statistical methods and standard statistical software to perform various analyses without dealing with the complications of missing data and joint modeling. We show that the proposed multiple imputation approach is flexible and easy to implement in practice. Numerical results are also provided to demonstrate its performance. Copyright © 2015 John Wiley & Sons, Ltd.
15.
We propose a marginal modeling approach to estimate the association between a time-dependent covariate and an outcome in longitudinal studies where some study participants die during follow-up and both variables have non-monotone response patterns. The proposed method is an extension of weighted estimating equations that allows the outcome and covariate to have different missing-data patterns. We present methods for both random and non-random missing-data mechanisms. A study of functional recovery in a cohort of elderly female hip-fracture patients motivates the approach.
16.
Marshall G, De la Cruz-Mesía R, Barón AE, Rutledge JH, Zerbe GO. Statistics in Medicine 2006;25(16):2817-2830
The use of random-effects models for the analysis of longitudinal data with missing responses has been discussed by several authors. In this paper, we extend the non-linear random-effects model for a single response to the case of multiple responses, allowing for arbitrary patterns of observed and missing data. Parameters for this model are estimated via the EM algorithm and by the first-order approximation available in SAS Proc NLMIXED. The set of equations for this estimation procedure is derived, and these are appropriately modified to deal with missing data. The methodology is illustrated with an example using data from a study involving 161 pregnant women presenting to a private obstetrics clinic in Santiago, Chile.
17.
The analysis of quality of life (QoL) data can be challenging because of the skewness of responses and the presence of missing data. In this paper, we propose a new weighted quantile regression method for estimating the conditional quantiles of QoL data with responses missing at random. The proposed method makes use of the correlation information within the same subject from an auxiliary mean regression model to enhance estimation efficiency, and it takes the missing-data mechanism into account. The asymptotic properties of the proposed estimator are studied, and simulations are conducted to evaluate its performance. The proposed method is also applied to the analysis of the QoL data from a clinical trial on early breast cancer, which motivated this study.
18.
Wan-Lun Wang. Statistics in Medicine 2020;39(19):2518-2535
Multivariate longitudinal data usually exhibit complex features such as the presence of censored responses due to detection limits of the assay and unavoidable missing values arising when participants make irregular visits, leading to intermittently recorded characteristics. A generalization of the multivariate linear mixed model that simultaneously accounts for censored and intermittently missing responses, named the MLMM-CM, has recently been proposed for more precise analysis of such data. This paper presents a fully Bayesian sampling-based approach to the MLMM-CM for addressing the uncertainties of censored and missing responses as well as unknown parameters. Two widely accepted Bayesian computational techniques, based on Markov chain Monte Carlo and on the inverse Bayes formulas coupled with Gibbs (IBF-Gibbs) schemes, are developed for carrying out posterior inference for the model. The proposed methodology is illustrated through a simulation study and a real-data example from the Adult AIDS Clinical Trials Group 388 study. Numerical results show empirically that the proposed Bayesian methodology performs satisfactorily and offers reliable posterior inference.
19.
Modelling the relationship between pulmonary function and survival in cystic fibrosis (CF) is complicated by the fact that commonly used measures of pulmonary function, such as the forced expiratory volume in one second (FEV1), are measured with error, and patients with the poorest lung function are increasingly censored by death; that is, data are available only for the patients who have survived to the current age. We assume a linear random-effects model for FEV1 per cent predicted, where the random intercept and slope of FEV1 per cent predicted, along with a specified transformation of the age at death, follow a trivariate normal distribution. We illustrate how this model can be used to describe the relationship between age at death and parameters of the individual patient's regression of FEV1 per cent predicted versus age, such as the slope and the intercept or true value of FEV1 per cent predicted at a given age. We also illustrate how the model provides empirical Bayes estimates of these individual parameters. In particular, we explore how the predicted value of the age at death might be used as a prognostic or severity index. The model and methods are illustrated on a cohort of 188 cystic fibrosis patients with a common genotype (homozygous for the ΔF508 mutation), born on or after 1965 and followed at the CF Center at the Rainbow Babies and Children's Hospital, Cleveland, OH, USA.
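The full trivariate joint model (random intercept and slope of FEV1 per cent predicted together with transformed age at death) is more than a short snippet can show; the sketch below only illustrates how empirical Bayes (BLUP) estimates of each patient's intercept and slope can be extracted from the longitudinal submodel using statsmodels MixedLM. The simulated data and variable names are assumptions for illustration.

```python
# Empirical Bayes (BLUP) estimates of patient-specific intercepts and slopes from a linear
# mixed model; the joint survival part of the paper's model is not reproduced here.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n_pat, n_visits = 188, 8
pid = np.repeat(np.arange(n_pat), n_visits)
age = np.tile(np.linspace(8, 22, n_visits), n_pat)
b0 = rng.normal(0, 10, n_pat)[pid]          # patient-specific intercept deviations
b1 = rng.normal(0, 1.0, n_pat)[pid]         # patient-specific slope deviations
fev_pct = 95 + b0 + (-2 + b1) * (age - 8) + rng.normal(0, 5, len(pid))
df = pd.DataFrame({"id": pid, "age": age, "fev_pct": fev_pct})

result = smf.mixedlm("fev_pct ~ age", data=df, groups=df["id"], re_formula="~age").fit()
# result.random_effects maps each patient to empirical Bayes estimates of the random
# intercept and slope; adding the fixed effects gives each patient's own regression line.
eb = pd.DataFrame(result.random_effects).T
eb.columns = ["re_intercept", "re_slope"]
eb["intercept"] = result.fe_params["Intercept"] + eb["re_intercept"]
eb["slope"] = result.fe_params["age"] + eb["re_slope"]
print(eb[["intercept", "slope"]].head())
```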
20.
New quasi-imputation and expansion strategies for correlated binary responses are proposed by borrowing ideas from random number generation. The core idea is to convert correlated binary outcomes to multivariate normal outcomes in a sensible way so that re-conversion to the binary scale, after performing multiple imputation, yields the originally specified marginal expectations and correlations. This conversion process ensures that the correlations are transformed reasonably, which in turn allows us to take advantage of well-developed imputation techniques for Gaussian outcomes. We use the phrase 'quasi' because the original observations are not guaranteed to be preserved. We argue that if the inferential goals are well defined, it is not necessary to strictly adhere to the established definition of multiple imputation. Our expansion scheme employs a similar strategy in which imputation is used as an intermediate step; it leads to proportionally inflated observed patterns, forcing the data set into a complete rectangular format. The plausibility of the proposed methodology is examined by applying it to a wide range of simulated data sets that reflect alternative assumptions about complete-data populations and missing-data mechanisms. We also present an application using a data set from obesity research. We conclude that the proposed method is a promising tool for handling incomplete longitudinal or clustered binary outcomes under ignorable non-response mechanisms.
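A rough sketch of the core conversion idea for a pair of correlated binary outcomes is given below: work on a latent multivariate-normal scale, draw the missing component conditionally on the observed one, and threshold back at the marginal probabilities so the binary margins are preserved. Using the Pearson correlation of the complete pairs as the latent correlation is a simplification of the paper's conversion step, and the data are simulated assumptions; the paper's full quasi-imputation and expansion machinery is not reproduced.

```python
# Latent-normal quasi-imputation sketch for one binary variable with missing values,
# given a fully observed correlated binary companion.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
n = 1000
rho_true = 0.5
z = rng.multivariate_normal([0, 0], [[1, rho_true], [rho_true, 1]], size=n)
y = (z > norm.ppf([0.4, 0.7])).astype(float)          # P(y1=1)=0.6, P(y2=1)=0.3
y[rng.random(n) < 0.3, 1] = np.nan                    # make 30% of y2 missing

p1, p2 = np.nanmean(y[:, 0]), np.nanmean(y[:, 1])
complete = ~np.isnan(y[:, 1])
rho = np.corrcoef(y[complete, 0], y[complete, 1])[0, 1]   # stand-in latent correlation

# Convert the observed binary y1 to latent normal scores (draws from the correct tail).
z1 = np.where(y[:, 0] == 1,
              norm.ppf(1 - p1 * rng.random(n)),        # above the threshold
              norm.ppf((1 - p1) * rng.random(n)))      # below the threshold
# Draw the latent y2 given z1, then threshold back so the marginal proportion is preserved.
z2 = rho * z1 + np.sqrt(1 - rho ** 2) * rng.normal(size=n)
y2_imp = (z2 > norm.ppf(1 - p2)).astype(float)
y[:, 1] = np.where(np.isnan(y[:, 1]), y2_imp, y[:, 1])
print("imputed P(y2=1):", round(y[:, 1].mean(), 3))
```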