Similar documents
 Found 20 similar documents (search time: 15 ms)
1.
We propose a non-parametric multiple imputation scheme, NPMLE imputation, for the analysis of interval-censored survival data. Features of the method are that it converts interval-censored data problems into complete-data or right-censored data problems, to which many standard approaches can be applied, and that measures of uncertainty are easily obtained. In addition to the event time of primary interest, there are frequently other auxiliary variables that are associated with the event time. For the goal of estimating the marginal survival distribution, these auxiliary variables may provide additional information about the event time for the interval-censored observations. We extend the imputation methods to incorporate information from auxiliary variables with potentially complex structures. To conduct the imputation, we use a working failure-time proportional hazards model to define an imputing risk set for each censored observation. The imputation schemes consist of using the data in the imputing risk sets to create an exact event time for each interval-censored observation. In simulation studies we show that the use of multiple imputation methods can improve the efficiency of estimators and reduce the effect of missing visits when compared with simpler approaches. We apply the approach to cytomegalovirus shedding data from an AIDS clinical trial, in which CD4 count is the auxiliary variable.
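The imputing-risk-set idea can be sketched in a few lines. This is a simplified illustration, not the authors' procedure: it skips the working proportional hazards model and simply treats the observed exact event times falling inside a censored observation's interval as the donor set; the toy data and the midpoint fallback are invented for the example.

```python
import numpy as np

def impute_interval_censored(exact_times, intervals, rng):
    """For each interval-censored observation (L, R], draw an exact event
    time from the observed exact times falling inside its interval (a
    simplified stand-in for the imputing risk set; the paper refines this
    set with a working proportional hazards model).  Falls back to the
    interval midpoint when no donor exists (an ad hoc choice for this
    sketch)."""
    imputed = []
    for L, R in intervals:
        donors = exact_times[(exact_times > L) & (exact_times <= R)]
        imputed.append(rng.choice(donors) if donors.size else (L + R) / 2.0)
    return np.array(imputed)

exact = np.array([1.2, 2.5, 3.1, 4.0, 5.7, 6.3])   # exactly observed times
censored = [(1.0, 3.5), (4.5, 7.0)]                 # interval-censored (L, R]
times = impute_interval_censored(exact, censored, np.random.default_rng(0))
```

Repeating the draw M times (with fresh randomness) yields the multiple imputations over which estimates and uncertainty measures are combined.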

2.
We develop an approach, based on multiple imputation, that estimates the marginal survival distribution in survival analysis using auxiliary variables to recover information for censored observations. To conduct the imputation, we use two working survival models to define a nearest neighbour imputing risk set. One model is for the event times and the other for the censoring times. Based on the imputing risk set, two non-parametric multiple imputation methods are considered: risk set imputation, and Kaplan-Meier imputation. For both methods a future event or censoring time is imputed for each censored observation. With a categorical auxiliary variable, we show that with a large number of imputations the estimates from the Kaplan-Meier imputation method correspond to the weighted Kaplan-Meier estimator. We also show that the Kaplan-Meier imputation method is robust to mis-specification of either one of the two working models. In a simulation study with time-independent and time-dependent auxiliary variables, we compare the multiple imputation approaches with an inverse probability of censoring weighted method. We show that all approaches can reduce bias due to dependent censoring and improve efficiency. We apply the approaches to AIDS clinical trial data comparing ZDV and placebo, in which CD4 count is the time-dependent auxiliary variable.
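A minimal sketch of the Kaplan-Meier imputation step: fit the Kaplan-Meier curve on an imputing risk set and draw one future event time from the implied probability mass. The nearest-neighbour construction of the risk set from the two working models is omitted, and the residual mass past the last event is renormalised over the event times, both simplifications for this sketch; the toy data are invented.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at the distinct observed event times."""
    grid = np.unique(times[events == 1])
    surv, s = [], 1.0
    for u in grid:
        at_risk = np.sum(times >= u)
        deaths = np.sum((times == u) & (events == 1))
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return grid, np.array(surv)

def km_impute(times, events, rng):
    """Draw one future event time from the Kaplan-Meier distribution fitted
    to an imputing risk set; residual mass beyond the last event (due to
    censoring) is renormalised over the event times in this sketch."""
    grid, surv = kaplan_meier(times, events)
    # probability mass dropped at each event time: S(t_{j-1}) - S(t_j)
    pmass = -np.diff(np.concatenate(([1.0], surv)))
    return rng.choice(grid, p=pmass / pmass.sum())

# toy imputing risk set: times 1, 2, 4 are events, time 3 is censored
times = np.array([1.0, 2.0, 3.0, 4.0])
events = np.array([1, 1, 0, 1])
grid, surv = kaplan_meier(times, events)
t_imp = km_impute(times, events, np.random.default_rng(1))
```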

3.
Multiple imputation is commonly used to impute missing data, and is typically more efficient than complete-case analysis in regression analysis when covariates have missing values. Imputation may be performed using a regression model for the incomplete covariates on other covariates and, importantly, on the outcome. With a survival outcome, it is common practice to use the event indicator D and the log of the observed event or censoring time T in the imputation model, but the rationale is not clear. We assume that the survival outcome follows a proportional hazards model given covariates X and Z. We show that a suitable model for imputing binary or normal X is a logistic or linear regression on the event indicator D, the cumulative baseline hazard H0(T), and the other covariates Z. This result is exact in the case of a single binary covariate; in other cases, it is approximately valid for small covariate effects and/or small cumulative incidence. If we do not know H0(T), we approximate it by the Nelson–Aalen estimator of H(T) or estimate it by Cox regression. We compare the methods using simulation studies. We find that using log T biases covariate-outcome associations towards the null, while the new methods have lower bias. Overall, we recommend including the event indicator and the Nelson–Aalen estimator of H(T) in the imputation model. Copyright © 2009 John Wiley & Sons, Ltd.
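The recommended imputation covariates can be assembled with a hand-rolled Nelson-Aalen estimator. A sketch in Python with invented toy data; the logistic or linear imputation regression on D, H(T) and Z would then be fitted on this design matrix:

```python
import numpy as np

def nelson_aalen(times, events):
    """Nelson-Aalen estimate of the cumulative hazard H evaluated at each
    subject's observed time (event or censoring time):
    H(t) = sum over event times u <= t of (deaths at u) / (number at risk at u)."""
    uniq = np.unique(times[events == 1])            # distinct event times
    increments = np.array(
        [np.sum((times == u) & (events == 1)) / np.sum(times >= u) for u in uniq]
    )
    cum = np.cumsum(increments)
    idx = np.searchsorted(uniq, times, side="right") - 1
    return np.where(idx >= 0, cum[np.maximum(idx, 0)], 0.0)

# toy data: observed times T and event indicator D (1 = event, 0 = censored)
T = np.array([2.0, 3.0, 3.0, 5.0, 8.0])
D = np.array([1, 1, 0, 1, 0])
H = nelson_aalen(T, D)
# the recommended covariates for imputing a partially observed X:
# event indicator D, cumulative hazard H(T), plus the other covariates Z
design = np.column_stack([D, H])
```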

4.
Most multiple imputation (MI) methods for censored survival data either ignore patient characteristics when imputing a likely event time, or place quite restrictive modeling assumptions on the survival distributions used for imputation. In this research, we propose a robust MI approach that directly imputes restricted lifetimes over the study period based on a model of the mean restricted life as a linear function of covariates. This method has the advantages of retaining patient characteristics when making imputation choices through the restricted mean parameters and does not make assumptions on the shapes of hazards or survival functions. Simulation results show that our method outperforms its closest competitor for modeling restricted mean lifetimes in terms of bias and efficiency in both independent censoring and dependent censoring scenarios. Survival estimates of restricted lifetime model parameters and marginal survival estimates regain much of the precision lost due to censoring. The proposed method is also much less subject to dependent censoring bias captured by covariates in the restricted mean model. This particular feature is observed in a full statistical analysis conducted in the context of the International Breast Cancer Study Group Ludwig Trial V using the proposed methodology.

5.
When the event time of interest depends on the censoring time, conventional two-sample test methods, such as the log-rank and Wilcoxon tests, can produce an invalid test result. We extend our previous work on estimation using auxiliary variables to adjust for dependent censoring via multiple imputation, to the comparison of two survival distributions. To conduct the imputation, we use two working models to define a set of similar observations called the imputing risk set. One model is for the event times and the other for the censoring times. Based on the imputing risk set, a nonparametric multiple imputation method, Kaplan-Meier imputation, is used to impute a future event or censoring time for each censored observation. After imputation, the conventional nonparametric two-sample tests can be easily implemented on the augmented data sets. Simulation studies show that the sizes of the log-rank and Wilcoxon tests constructed on the imputed data sets are comparable to the nominal level and the powers are much higher compared with the tests based on the unimputed data in the presence of dependent censoring if either one of the two working models is correctly specified. The method is illustrated using AIDS clinical trial data comparing ZDV and placebo, in which CD4 count is the time-dependent auxiliary variable.

6.
Imputation strategies are widely used in settings that involve inference with incomplete data. However, implementation of a particular approach always rests on assumptions, and subtle distinctions between methods can have an impact on subsequent analyses. In this research article, we are concerned with regression models in which the true underlying relationship includes interaction terms. We focus in particular on a linear model with one fully observed continuous predictor, a second partially observed continuous predictor, and their interaction. We derive the conditional distribution of the missing covariate and interaction term given the observed covariate and the outcome variable, and examine the performance of a multiple imputation procedure based on this distribution. We also investigate several alternative procedures that can be implemented by adapting multivariate normal multiple imputation software in ways that might be expected to perform well despite incompatibilities between model assumptions and true underlying relationships among the variables. The methods are compared in terms of bias, coverage, and CI width. As expected, the procedure based on the correct conditional distribution performs well across all scenarios. Just as importantly for general practitioners, several of the approaches based on multivariate normality perform comparably with the correct conditional distribution in a number of circumstances, although interestingly, procedures that seek to preserve the multiplicative relationship between the interaction term and the main effects are found to be substantially less reliable. For illustration, the various procedures are applied to an analysis of post-traumatic stress disorder symptoms in a study of childhood trauma. Copyright © 2015 John Wiley & Sons, Ltd.
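To make the "correct conditional distribution" idea concrete: under the model Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 + e with normal errors, the conditional density of the missing X2 given (X1, Y) is a product of two factors that are each Gaussian in X2, so the conditional is again Gaussian. A sketch under the simplifying assumption (an assumption of this example, not stated in the abstract) that X2 is a priori N(mu2, sigma2^2) independently of X1:

```python
import numpy as np

rng = np.random.default_rng(2)

def draw_x2(y, x1, beta, sigma_e, mu2, sigma2):
    """Draw the missing covariate X2 from its conditional distribution
    given X1 and Y under
        Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 + e,  e ~ N(0, sigma_e^2),
    assuming (a simplification for this sketch) X2 ~ N(mu2, sigma2^2)
    independently of X1.  Both density factors are Gaussian in X2, so the
    conditional is Gaussian with the precision-weighted mean below."""
    b0, b1, b2, b3 = beta
    slope = b2 + b3 * x1                       # coefficient of X2 in E[Y]
    prec = 1.0 / sigma2 ** 2 + slope ** 2 / sigma_e ** 2
    mean = (mu2 / sigma2 ** 2
            + slope * (y - b0 - b1 * x1) / sigma_e ** 2) / prec
    return rng.normal(mean, np.sqrt(1.0 / prec))

# impute X2, then rebuild the interaction term "passively" from the draw
x1, y = 1.5, 4.0
x2_imp = draw_x2(y, x1, beta=(0.5, 1.0, 2.0, 0.5), sigma_e=1.0, mu2=0.0, sigma2=1.0)
interaction = x1 * x2_imp
# near-deterministic check value: as sigma_e -> 0 the draw must approach
# the algebraic solution (y - b0 - b1*x1) / (b2 + b3*x1)
x2_det = draw_x2(y, x1, (0.5, 1.0, 2.0, 0.5), 1e-6, 0.0, 1.0)
```

Rebuilding the interaction from the drawn X2, rather than imputing it as a separate variable, is what keeps the multiplicative relationship exact in each completed data set.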

7.
Several approaches exist for handling missing covariates in the Cox proportional hazards model. Multiple imputation (MI) is relatively easy to implement, with various software available, and yields consistent estimates if the imputation model is correct. On the other hand, the fully augmented weighted estimators (FAWEs) recover a substantial proportion of the efficiency and have the doubly robust property. In this paper, we compare the FAWEs and MI through a comprehensive simulation study. For MI, we consider multiple imputation by chained equations and focus on two imputation methods: Bayesian linear regression imputation and predictive mean matching. Simulation results show that the imputation methods can be rather sensitive to model misspecification and may have large bias when the censoring time depends on the missing covariates. In contrast, the FAWEs allow the censoring time to depend on the missing covariates and, owing to the doubly robust property, are remarkably robust as long as either the conditional expectations or the selection probability is specified correctly. The comparison suggests that the FAWEs have the potential to be a competitive and attractive tool for tackling the analysis of survival data with missing covariates. Copyright © 2010 John Wiley & Sons, Ltd.
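Predictive mean matching, one of the two MI methods compared above, can be sketched in a few lines: regress the observed values on the predictors, and for each missing case borrow the observed value of a near neighbour in predicted-mean space. This version matches on point estimates only (type 1 matching would also perturb the regression coefficients, e.g. by a posterior draw); the data and the choice k=3 are illustrative:

```python
import numpy as np

def pmm_impute(x_obs, z_obs, z_mis, rng, k=3):
    """Predictive mean matching sketch: regress observed x on z, predict for
    all cases, and for each missing case draw the observed x of one of its
    k nearest neighbours in predicted-mean space.  (Type 1 matching would
    also perturb the regression coefficients; omitted for brevity.)"""
    Z = np.column_stack([np.ones_like(z_obs), z_obs])
    beta, *_ = np.linalg.lstsq(Z, x_obs, rcond=None)
    pred_obs = Z @ beta
    pred_mis = np.column_stack([np.ones_like(z_mis), z_mis]) @ beta
    donors = [x_obs[rng.choice(np.argsort(np.abs(pred_obs - p))[:k])]
              for p in pred_mis]
    return np.array(donors)

x_obs = np.array([1.0, 2.1, 2.9, 4.2, 5.1])
z_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x_imp = pmm_impute(x_obs, z_obs, np.array([2.5]), np.random.default_rng(3))
```

Because every imputed value is a genuinely observed one, PMM avoids implausible draws; the sensitivity to model misspecification reported above enters through the regression used for the matching metric.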

8.
In cancer trials, a significant fraction of patients can be cured, that is, the disease is completely eliminated and never recurs. In general, treatments are developed both to increase the patients' chances of being cured and to prolong survival among non-cured patients. A cure rate model represents a combination of a cure fraction and a survival model, and can be applied to many clinical studies across several types of cancer. In this article, the cure rate model is considered for interval-censored data, in which each observation consists of two time points that bracket the event time of interest. Interval-censored data commonly occur in studies of diseases that often progress without symptoms, requiring clinical evaluation for detection (Encyclopedia of Biostatistics. Wiley: New York, 1998; 2090-2095). In our study, an approximate likelihood approach suggested by Goetghebeur and Ryan (Biometrics 2000; 56:1139-1144) is used to derive the likelihood for interval-censored data. In addition, a frailty model is introduced to characterize the association between the cure fraction and the survival model. In particular, the positive association between the cure fraction and the survival time is incorporated by imposing a common normal frailty effect. The EM algorithm is used to estimate parameters, and multiple imputation based on the profile likelihood is adopted for variance estimation. The approach is applied to a smoking cessation study in which the event of interest is smoking relapse, and several covariates, including an intensive care treatment, are evaluated for their effects on both the occurrence of relapse and the non-smoking duration.

9.
In natural history studies of human immunodeficiency virus type 1 (HIV-1) infection a substantial proportion of participants are seropositive at time of enrollment in the study. These participants form a prevalent subcohort. Estimation of the unknown times since exposure to HIV-1 in the prevalent subcohort is of primary importance for estimation of the incubation time of AIDS. The subset of the cohort that tested negative for antibody to HIV-1 at study entry and was observed to seroconvert forms the incident subcohort that provides longitudinal data on markers of maturity (that is, duration) of infection. We use parametric life table regression models incorporating truncation to describe the conditional distribution (imputing model) of the times since seroconversion given a vector of the markers of maturity. Using the fitted model and the values of the markers of maturity of infection provided by the seroprevalent subcohort at entry into the study, we can impute the unknown times since seroconversion for the prevalent subcohort. We implement multiple imputation based on a model-robust estimate of the covariance matrix of parameters of the imputing model to provide confidence intervals for the geometric mean of the time since seroconversion in the prevalent subcohort, and to compare maturity of infection of cohorts recruited in different cities. The accuracy of imputation is further validated by comparisons of imputation-based estimates of AIDS incubation distribution in the seroprevalent subcohort with more direct estimates obtained from the seroincident subcohort.

10.
Multiple imputation is commonly used to impute missing covariates in the Cox semiparametric regression setting. It fills in each missing value with several plausible values via a Gibbs sampling procedure, specifying an imputation model for each missing variable. This imputation method is implemented in several software packages that offer imputation models chosen according to the type of the variable to be imputed, but all these imputation models assume linear covariate effects. This assumption is often violated in practice, however, as covariates can have nonlinear effects. Such a linearity assumption can lead to misleading conclusions, because the imputation model should be constructed to reflect the true distributional relationship between the missing values and the observed values. To estimate nonlinear effects of continuous time-invariant covariates in the imputation model, we propose a method based on B-spline functions. To assess the performance of this method, we conducted a simulation study comparing multiple imputation using a Bayesian spline imputation model with multiple imputation using a Bayesian linear imputation model in the survival analysis setting. We evaluated the proposed method on a motivating data set collected from HIV-infected patients enrolled in an observational cohort study in Senegal, which contains several incomplete variables. We found that our method performs well in estimating hazard ratios compared with the linear imputation methods when data are missing completely at random or missing at random. Copyright © 2013 John Wiley & Sons, Ltd.

11.
Peng Y, Zhang J. Statistics in Medicine 2008, 27(25): 5177-5194
The mixture cure frailty model has been proposed to analyze censored survival data with a cured fraction and unobservable heterogeneity among the uncured patients. Unlike in a usual mixture cure model, a frailty model is employed for the latency component of the mixture cure frailty model. In this paper, we extend the mixture cure frailty model by incorporating covariates into both the cure rate and the latency distribution parts of the model, and propose a semiparametric estimation method for the model. The Expectation-Maximization (EM) algorithm and the multiple imputation method are employed to estimate the parameters of interest. In a simulation study, we show that both estimation methods work well. To illustrate, we apply the model and the proposed methods to a data set of failure times from bone marrow transplant patients.

12.
We consider the situation of estimating the marginal survival distribution from censored data subject to dependent censoring using auxiliary variables. We had previously developed a nonparametric multiple imputation approach. The method used two working proportional hazards (PH) models, one for the event times and the other for the censoring times, to define a nearest neighbor imputing risk set. This risk set was then used to impute failure times for censored observations. Here, we adapt the method to the situation where the event and censoring times follow accelerated failure time models and propose to use the Buckley–James estimator for the two working models. Besides studying the performance of the proposed method, we also compare it with two popular methods for handling dependent censoring through the use of auxiliary variables, the inverse probability of censoring weighted and parametric multiple imputation methods, to shed light on their use. In a simulation study with time-independent auxiliary variables, we show that all approaches can reduce bias due to dependent censoring. The proposed method is robust to misspecification of either one of the two working models and their link function. This suggests that a working proportional hazards model is preferable, since fitting an accelerated failure time model is more cumbersome. In contrast, the inverse probability of censoring weighted method is not robust to misspecification of the link function of the censoring time model. The parametric imputation methods rely on the specification of the event time model. The approaches are applied to a prostate cancer dataset. Copyright © 2015 John Wiley & Sons, Ltd.

13.
Multiple imputation (MI) is becoming increasingly popular for handling missing data. Standard approaches for MI assume normality for continuous variables (conditionally on the other variables in the imputation model). However, it is unclear how to impute non-normally distributed continuous variables. Using simulation and a case study, we compared various transformations applied prior to imputation, including a novel non-parametric transformation, with imputation on the raw scale and with predictive mean matching (PMM) when imputing non-normal data. We generated data from a range of non-normal distributions, and set 50% to missing completely at random or missing at random. We then imputed missing values on the raw scale, following a zero-skewness log, Box–Cox or non-parametric transformation, and using PMM with both type 1 and type 2 matching. We compared inferences regarding the marginal mean of the incomplete variable and the association with a fully observed outcome. We also compared results from these approaches in an analysis of depression and anxiety symptoms in parents of very preterm compared with term-born infants. The results provide novel empirical evidence that the decision regarding how to impute a non-normal variable should be based on the nature of the relationship between the variables of interest. If the relationship is linear on the untransformed scale, transformation can introduce bias irrespective of the transformation used. However, if the relationship is non-linear, it may be important to transform the variable to accurately capture this relationship. A useful alternative is to impute the variable using PMM with type 1 matching. Copyright © 2016 John Wiley & Sons, Ltd.
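The zero-skewness log transformation mentioned above can be sketched as a bisection search for a shift k such that log(x - k) has zero moment skewness (Stata's lnskew0 command works along these lines; this is an illustrative re-implementation under the stated bracketing assumption, not that command):

```python
import numpy as np

def skewness(x):
    """Moment (Fisher-Pearson) skewness coefficient."""
    return np.mean(((x - x.mean()) / x.std()) ** 3)

def zero_skew_log(x, tol=1e-6):
    """Zero-skewness log transform: bisection for a shift k < min(x) such
    that log(x - k) has skewness ~ 0.  With k just below min(x) the
    transform over-compresses the right tail (negative skew); with k far
    below min(x) it is nearly linear (skewness ~ that of x), so a sign
    change is bracketed for right-skewed data."""
    span = x.max() - x.min()
    lo = x.min() - 10.0 * span       # nearly linear end of the bracket
    hi = x.min() - 1e-8 * span       # strong-compression end of the bracket
    for _ in range(200):
        k = 0.5 * (lo + hi)
        s = skewness(np.log(x - k))
        if abs(s) < tol:
            break
        if s > 0:
            lo = k                   # not enough compression yet
        else:
            hi = k
    return np.log(x - k), k

x = np.exp(np.random.default_rng(4).normal(size=500))  # right-skewed toy data
y, k = zero_skew_log(x)
```

As the case study found, whether such a transformation helps depends on whether the substantive relationship is linear on the raw or the transformed scale, not on the marginal shape alone.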

14.
The treatment of missing data in comparative effectiveness studies with right-censored outcomes and time-varying covariates is challenging because of the multilevel structure of the data. In particular, the performance of an accessible method like multiple imputation (MI) under an imputation model that ignores the multilevel structure is unknown and has not been compared to complete-case (CC) and single imputation methods that are most commonly applied in this context. Through an extensive simulation study, we compared statistical properties among CC analysis, last value carried forward, mean imputation, the use of missing indicators, and MI-based approaches with and without auxiliary variables under an extended Cox model when the interest lies in characterizing relationships between non-missing time-varying exposures and right-censored outcomes. MI demonstrated favorable properties under a moderate missing-at-random condition (absolute bias <0.1) and outperformed CC and single imputation methods, even when the MI method did not account for correlated observations in the imputation model. The performance of MI decreased with increasing complexity such as when the missing data mechanism involved the exposure of interest, but was still preferred over other methods considered and performed well in the presence of strong auxiliary variables. We recommend considering MI that ignores the multilevel structure in the imputation model when data are missing in a time-varying confounder, incorporating variables associated with missingness in the MI models as well as conducting sensitivity analyses across plausible assumptions.
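Last value carried forward, one of the simple comparators in the simulation study above, takes only a few lines for a single subject's longitudinal covariate series (toy data invented):

```python
import numpy as np

def locf(values):
    """Last value carried forward for one subject's longitudinal series:
    each NaN is replaced by the most recent observed value; leading NaNs
    (nothing yet observed) are left as NaN."""
    out = np.asarray(values, dtype=float).copy()
    for i in range(1, out.size):
        if np.isnan(out[i]):
            out[i] = out[i - 1]
    return out

series = locf([np.nan, 3.0, np.nan, np.nan, 5.0, np.nan])
```

Its simplicity is also its weakness: a carried-forward value ignores both within-subject trends and uncertainty, which is why the study finds MI preferable even with an imputation model that ignores the multilevel structure.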

15.
A semi-parametric accelerated failure time cure model
Li CS, Taylor JM. Statistics in Medicine 2002, 21(21): 3235-3247
A cure model is a useful approach for analysing failure time data in which some subjects could eventually experience, and others never experience, the event of interest. A cure model has two components: incidence which indicates whether the event could eventually occur and latency which denotes when the event will occur given the subject is susceptible to the event. In this paper, we propose a semi-parametric cure model in which covariates can affect both the incidence and the latency. A logistic regression model is proposed for the incidence, and the latency is determined by an accelerated failure time regression model with unspecified error distribution. An EM algorithm is developed to fit the model. The procedure is applied to a data set of tonsil cancer patients treated with radiation therapy.

16.
Recently, multiple imputation has been proposed as a tool for individual patient data meta-analysis with sporadically missing observations, and it has been suggested that within-study imputation is usually preferable. However, such within-study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may be appropriate to share information across studies when imputing. In this paper, we develop and evaluate a joint modelling approach to multiple imputation of individual patient data in meta-analysis, with an across-study probability distribution for the study-specific covariance matrices. This retains the flexibility to allow for between-study heterogeneity when imputing while allowing (i) sharing of information on the covariance matrix across studies when this is appropriate, and (ii) imputation of variables that are wholly missing from studies. Simulation results show both equivalent performance to the within-study imputation approach where this is valid, and good results in more general, practically relevant scenarios with studies of very different sizes, non-negligible between-study heterogeneity and wholly missing variables. We illustrate our approach using data from an individual patient data meta-analysis of hypertension trials. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

17.
Multivariate interval-censored failure time data arise commonly in many studies in epidemiology and biomedicine. Analysis of this type of data is more challenging than that of right-censored data. We propose a simple multiple imputation strategy to recover the order of occurrences based on the interval-censored event times, using a conditional predictive distribution function derived from a parametric gamma random effects model. By imputing the interval-censored failure times, the estimation of the regression and dependence parameters in the context of a gamma frailty proportional hazards model using the well-developed EM algorithm is made possible. A robust estimator for the covariance matrix is suggested to adjust for possible misspecification of the parametric baseline hazard function. The finite sample properties of the proposed method are investigated via simulation. The performance of the proposed method is highly satisfactory, whereas the computational burden is minimal. The proposed method is also applied to the diabetic retinopathy study (DRS) data for illustration purposes, and the estimates are compared with those based on other existing methods for bivariate grouped survival data. Copyright © 2010 John Wiley & Sons, Ltd.

18.
Genome-wide association studies are usually accompanied by imputation techniques to complement genome-wide SNP chip genotypes. Current imputation approaches separate the phasing of study data from imputing, which makes the phasing independent of the reference data. The two-step approach allows for updating the imputation for a new reference panel without repeating the tedious phasing step. This advantage, however, no longer holds when the build of the study data differs from the build of the reference data. In this case, the current approach is to harmonize the study data annotation with the reference data (prephasing lift-over), requiring rephasing and re-imputing. As a novel approach, we propose to harmonize study haplotypes with reference haplotypes (postphasing lift-over). This allows for updating imputed study data for new reference panels without requiring rephasing. With continuously updated reference panels, our approach can save considerable computing time, up to 1 month per re-imputation. We evaluated the rephasing and postphasing lift-over approaches using data from 1,644 unrelated individuals imputed by both approaches and comparing them with directly typed genotypes. On average, both approaches perform equally well, with mean concordances of 93% between imputed and typed genotypes. Also, imputation qualities are similar (mean difference in RSQ < 0.1%). We demonstrate that our novel postphasing lift-over approach is a practical and time-saving alternative to the prephasing lift-over. This might encourage study partners to accommodate updated reference builds and ultimately improve the information content of study data. Our novel approach is implemented in the software PhaseLift.

19.
Lam KF, Fong DY, Tang OY. Statistics in Medicine 2005, 24(12): 1865-1879
There has been recurring interest in modelling survival data which hypothesize subpopulations of individuals highly susceptible to some type of adverse event, such as recurrence of breast cancer, while other individuals are assumed to be at much lower risk. A binary random effect is assumed in this article to model the susceptibility of each individual. We propose a simple multiple imputation algorithm for the analysis of censored data which combines a binary regression formulation for the probability of occurrence of an event, say recurrence of the breast cancer tumour, and a Cox proportional hazards regression model for the time to occurrence of the event if it does occur. The model distinguishes the effects of the covariates on the probability of cure and on the time to recurrence of the disease. A SAS macro has been written to implement the proposed multiple imputation algorithm, packaging the sophisticated programming effort into a user-friendly application. Simulation results show that the estimates are reasonably efficient. The method is applied to analyse the breast cancer recurrence data. The proposed method can be easily modified to accommodate more general random effects, so that the random effects affect not only the probability of occurrence of the event but also the heterogeneity of the time to recurrence among the uncured patients.

20.
We consider weighted logrank tests for interval censored data when assessment times may depend on treatment, and for each individual, we only use the two assessment times that bracket the event of interest. It is known that treating finite right endpoints as observed events can substantially inflate the type I error rate under assessment–treatment dependence (ATD), but the validity of several other implementations of weighted logrank tests (score tests, permutation tests, multiple imputation tests) has not been studied in this situation. With a bounded number of unique assessment times, the score test under the grouped continuous model retains the type I error rate asymptotically under ATD; however, although the approximate permutation test based on the permutation central limit theorem is not asymptotically valid under every ATD scenario, we show through simulation that in many ATD scenarios, it retains the type I error rate better than the score test. We show a case where the approximate permutation test retains the type I error rate when the exact permutation test does not. We study and modify the multiple imputation logrank tests of Huang, Lee, and Yu (2008, Statistics in Medicine, 27: 3217–3226), showing that the distribution of the rank-like scores asymptotically does not depend on the assessment times. We show through simulations that our modifications of the multiple imputation logrank tests retain the type I error rate in all cases studied, even with ATD and a small number of individuals in each treatment group. Simulations were performed using the interval R package. Published 2012. This article is a US Government work and is in the public domain in the USA.
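Once a future event or censoring time has been imputed for each censored observation, a standard two-sample log-rank statistic can be computed on each completed data set. A self-contained sketch with invented toy data (the paper's rank-like-score modification and the combination across imputations are not reproduced here):

```python
import numpy as np

def logrank(times, events, group):
    """Standard two-sample log-rank z-statistic for right-censored data
    (group coded 0/1), applicable to a multiply-imputed complete data set:
    at each distinct event time, accumulate observed-minus-expected events
    in group 1 and the hypergeometric variance, then standardise."""
    num, var = 0.0, 0.0
    for u in np.unique(times[events == 1]):
        at_risk = times >= u
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()
        d = ((times == u) & (events == 1)).sum()
        d1 = ((times == u) & (events == 1) & (group == 1)).sum()
        num += d1 - d * n1 / n                 # observed minus expected
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return num / np.sqrt(var)

# clearly separated toy groups: group 1 events occur much later
z = logrank(np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0]),
            np.ones(6, dtype=int),
            np.array([0, 0, 0, 1, 1, 1]))
```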

