Similar Documents
Found 20 similar documents (search time: 31 ms)
1.
H Wu  L Wu 《Statistics in medicine》2001,20(12):1755-1769
We propose a three-step multiple imputation method, implemented by a Gibbs sampler, for estimating parameters in non-linear mixed-effects models with missing covariates. Estimates obtained by the proposed multiple imputation method are compared to those obtained by the mean-value imputation method and the complete-case method through simulations. We find that the proposed multiple imputation method offers smaller biases and smaller mean-squared errors for the estimates of covariate coefficients than the other two methods. We apply the three missing data methods to modelling HIV viral dynamics from an AIDS clinical trial. We believe that the results from the proposed multiple imputation method are more reliable than those from the other two commonly used methods.

2.
Wu L 《Statistics in medicine》2004,23(11):1715-1731
In AIDS studies such as HIV viral dynamics, statistical inference is often complicated because the viral load measurements may be subject to left censoring due to a detection limit and time-varying covariates such as CD4 counts may be measured with substantial errors. Mixed-effects models are often used to model the response and the covariate processes in these studies. We propose a unified approach which addresses the censoring and measurement errors simultaneously. We estimate the model parameters by a Monte-Carlo EM algorithm via the Gibbs sampler. A simulation study is conducted to compare the proposed method with the usual two-step method and a naive method. We find that the proposed method produces approximately unbiased estimates with more reliable standard errors. A real data set from an AIDS study is analysed using the proposed method.

3.
When modeling longitudinal data, the true values of time-varying covariates may be unknown because of detection-limit censoring or measurement error. A common approach in the literature is to empirically model the covariate process based on observed data and then predict the censored values or mismeasured values based on this empirical model. Such an empirical model can be misleading, especially for censored values since the (unobserved) censored values may behave very differently than observed values due to the underlying data-generation mechanisms or disease status. In this paper, we propose a mechanistic nonlinear covariate model based on the underlying data-generation mechanisms to address censored values and mismeasured values. Such a mechanistic model is based on solid scientific or biological arguments, so the predicted censored or mismeasured values are more reasonable. We use a Monte Carlo EM algorithm for likelihood inference and apply the methods to an AIDS dataset, where viral load is censored by a lower detection limit. Simulation results confirm that the proposed models and methods offer substantial advantages over existing empirical covariate models for censored and mismeasured covariates.

4.
Wu L 《Statistics in medicine》2007,26(17):3342-3357
In recent years HIV viral dynamic models have received great attention in AIDS studies. Often, subjects in these studies may drop out for various reasons such as drug intolerance or drug resistance, and covariates may also contain missing data. Statistical analyses ignoring informative dropouts and missing covariates may lead to misleading results. We consider appropriate methods for HIV viral dynamic models with informative dropouts and missing covariates and evaluate these methods via simulations. A real data set is analysed, and the results show that the initial viral decay rate, which may reflect the efficacy of the anti-HIV treatment, may be over-estimated if dropout patients are ignored. We also find that the current or immediate previous viral load values may be most predictive for patients' dropout. These results may be important for HIV/AIDS studies.

5.
A method for reconstructing the HIV infection curve from data on both HIV and AIDS diagnoses is enhanced by using age as a covariate and by using the diagnosis data to estimate parameters that were previously assumed known. Maximum likelihood estimation is used for parameters of the induction distribution. Each of the set of parameters that specify the baseline rate of infection over time and the set of parameters giving the relative susceptibility over age are estimated by maximizing the likelihood subject to a smoothness requirement. We find that estimating the extra parameters is feasible, producing estimates with good precision. Including age as a covariate gives 90 per cent confidence intervals for the HIV incidence curve that are about 20 per cent narrower than those obtained when age data are not used.

6.
Methods of estimation and inference about survival distributions based on length-biased samples are well-established. Comparatively little attention has been given to the assessment of covariate effects in the context of length-biased samples, but prevalent cohort studies often have this objective. We show that, like the survival distribution, the covariate distribution from a prevalent cohort study is length-biased, and that this distribution may contain parametric information about covariate effects on the survival time. As a result, a likelihood based on the joint distribution of the survival time and the covariates yields estimates of covariate effects which are at least as efficient as estimates arising from a traditional likelihood which conditions on covariate values in the length-biased sample. We also investigate the empirical bias of estimators arising from a joint likelihood when the population covariate distribution is misspecified. The asymptotic relative efficiencies and empirical biases under model misspecification are assessed for both proportional hazards and accelerated failure time models. The various methods considered are applied in an illustrative analysis of risk factors for death following onset of dementia using data collected in the Canadian Study of Health and Aging.

7.
It is of interest to estimate the distribution of usual nutrient intake for a population from repeat 24-h dietary recall assessments. A mixed effects model and quantile estimation procedure, developed at the National Cancer Institute (NCI), may be used for this purpose. The model incorporates a Box-Cox parameter and covariates to estimate usual daily intake of nutrients; model parameters are estimated via quasi-Newton optimization of a likelihood approximated by adaptive Gaussian quadrature. The parameter estimates are used in a Monte Carlo approach to generate empirical quantiles; standard errors are estimated by bootstrap. The NCI method is illustrated and compared with current estimation methods, including the individual mean and the semi-parametric method developed at Iowa State University (ISU), using data from a random sample and computer simulations. Both the NCI and ISU methods for nutrients are superior to the distribution of individual means. For simple (no covariate) models, quantile estimates are similar between the NCI and ISU methods. The bootstrap approach used by the NCI method to estimate standard errors of quantiles appears preferable to Taylor linearization. One major advantage of the NCI method is its ability to provide estimates for subpopulations through the incorporation of covariates into the model. The NCI method may be used for estimating the distribution of usual nutrient intake for populations and subpopulations as part of a unified framework of estimation of usual intake of dietary constituents. Copyright © 2010 John Wiley & Sons, Ltd.

8.
When applying survival analysis, such as Cox regression, to data from major clinical trials or other studies, often only baseline covariates are used. This is typically the case even if updated covariates are available throughout the observation period, which leaves large amounts of information unused. The main reason for this is that such time-dependent covariates often are internal to the disease process, as they are influenced by treatment, and therefore lead to confounded estimates of the treatment effect. There are, however, methods to exploit such covariate information in a useful way. We study the method of dynamic path analysis applied to data from the Swiss HIV Cohort Study. To adjust for time-dependent confounding between treatment and the outcome 'AIDS or death', we carried out the analysis on a sequence of mimicked randomized trials constructed from the original cohort data. To analyze these trials together, regular dynamic path analysis is extended to a composite analysis of weighted dynamic path models. Results using a simple path model, with one indirect effect mediated through current HIV-1 RNA level, show that most or all of the total effect goes through HIV-1 RNA for the first 4 years. A similar model, but with CD4 level as mediating variable, shows a weaker indirect effect, but the results are in the same direction. There are many reasons to be cautious when drawing conclusions from estimates of direct and indirect effects. Dynamic path analysis is however a useful tool to explore underlying processes, which are ignored in regular analyses. Copyright © 2011 John Wiley & Sons, Ltd.

9.
In this paper we consider longitudinal studies in which the outcome to be measured over time is binary, and the covariates of interest are categorical. In longitudinal studies it is common for the outcomes and any time-varying covariates to be missing due to missed study visits, resulting in non-monotone patterns of missingness. Moreover, the reasons for missed visits may be related to the specific values of the response and/or covariates that should have been obtained, i.e. missingness is non-ignorable. With non-monotone non-ignorable missing response and covariate data, a full likelihood approach is quite complicated, and maximum likelihood estimation can be computationally prohibitive when there are many occasions of follow-up. Furthermore, the full likelihood must be correctly specified to obtain consistent parameter estimates. We propose a pseudo-likelihood method for jointly estimating the covariate effects on the marginal probabilities of the outcomes and the parameters of the missing data mechanism. The pseudo-likelihood requires specification of the marginal distributions of the missingness indicator, outcome, and possibly missing covariates at each occasion, but avoids making assumptions about the joint distribution of the data at two or more occasions. Thus, the proposed method can be considered semi-parametric. The proposed method is an extension of the pseudo-likelihood approach in Troxel et al. to handle binary responses and possibly missing time-varying covariates. The method is illustrated using data from the Six Cities study, a longitudinal study of the health effects of air pollution.

10.
Multiple imputation is commonly used to impute missing covariates in the Cox semiparametric regression setting. It fills in each missing value with several plausible values via a Gibbs sampling procedure, specifying an imputation model for each missing variable. This imputation method is implemented in several software packages that offer imputation models chosen according to the type of the variable to be imputed, but all of these imputation models assume that covariate effects are linear. This assumption is often violated in practice, as covariates can have nonlinear effects. Such a linearity assumption can lead to misleading conclusions, because the imputation model should reflect the true distributional relationship between the missing values and the observed values. To estimate nonlinear effects of continuous time-invariant covariates in the imputation model, we propose a method based on B-spline functions. To assess the performance of this method, we conducted a simulation study comparing multiple imputation using a Bayesian spline imputation model with multiple imputation using a Bayesian linear imputation model in the survival analysis setting. We evaluated the proposed method on the motivating data set, collected from HIV-infected patients enrolled in an observational cohort study in Senegal, which contains several incomplete variables. We found that our method performs well in estimating hazard ratios compared with the linear imputation methods when data are missing completely at random or missing at random. Copyright © 2013 John Wiley & Sons, Ltd.

11.
When describing longitudinal binary response data, it may be desirable to estimate the cumulative probability of at least one positive response by some time point. For example, in phase I and II human immunodeficiency virus (HIV) vaccine trials, investigators are often interested in the probability of at least one vaccine-induced CD8+ cytotoxic T-lymphocyte (CTL) response to HIV proteins at different times over the course of the trial. In this setting, traditional estimates of the cumulative probabilities have been based on observed proportions. We show that if the missing data mechanism is ignorable, the traditional estimator of the cumulative success probabilities is biased and tends to underestimate a candidate vaccine's ability to induce CTL responses. As an alternative, we propose applying standard optimization techniques to obtain maximum likelihood estimates of the response profiles and, in turn, the cumulative probabilities of interest. Comparisons of the empirical and maximum likelihood estimates are investigated using data from simulations and HIV vaccine trials. We conclude that maximum likelihood offers a more accurate method of estimation, which is especially important in the HIV vaccine setting as cumulative CTL responses will likely be used as a key criterion for large scale efficacy trial qualification.

12.
Heinze G 《Statistics in medicine》2006,25(24):4216-4226
In logistic regression analysis of small or sparse data sets, results obtained by classical maximum likelihood methods cannot be generally trusted. In such analyses it may even happen that the likelihood meets the convergence criteria while at least one parameter estimate diverges to +/-infinity. This situation has been termed 'separation', and it typically occurs whenever no events are observed in one of the two groups defined by a dichotomous covariate. More generally, separation is caused by a linear combination of continuous or dichotomous covariates that perfectly separates events from non-events. Separation implies infinite or zero maximum likelihood estimates of odds ratios, which are usually considered unrealistic. I provide some examples of separation and near-separation in clinical data sets and discuss some options to analyse such data, including exact logistic regression analysis and a penalized likelihood approach. Both methods supply finite point estimates in case of separation. Profile penalized likelihood confidence intervals for parameters show excellent behaviour in terms of coverage probability and provide higher power than exact confidence intervals. General advantages of the penalized likelihood approach are discussed.
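As a toy illustration of separation (constructed for this listing, not taken from the paper): in a one-parameter logistic model where a binary covariate perfectly separates events from non-events, the ordinary log-likelihood increases without bound in the coefficient, while adding a Jeffreys-type penalty of 0.5 times the log Fisher information, in the spirit of the penalized likelihood approach the abstract describes, gives a finite maximizer. A minimal grid-search sketch:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Perfectly separated data: x = +1 always has an event, x = -1 never does.
x = [1.0, 1.0, -1.0, -1.0]
y = [1, 1, 0, 0]

def loglik(b):
    """Ordinary logistic log-likelihood for slope b (no intercept)."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(b * xi)
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

def penalized_loglik(b):
    """Log-likelihood plus 0.5 * log Fisher information (Jeffreys penalty)."""
    info = sum(xi ** 2 * sigmoid(b * xi) * (1 - sigmoid(b * xi)) for xi in x)
    return loglik(b) + 0.5 * math.log(info)

# The unpenalized likelihood keeps increasing: the MLE diverges to +infinity.
assert loglik(10.0) > loglik(5.0) > loglik(1.0)

# The penalized likelihood has an interior maximum (analytically log 9 here).
grid = [i * 0.001 for i in range(1, 20000)]
b_hat = max(grid, key=penalized_loglik)
print(round(b_hat, 3))  # finite estimate near 2.197
```

For this symmetric example the penalized score is 4(1 - s) + 0.5(1 - 2s) with s = sigmoid(b), so the maximizer solves s = 0.9, i.e. b = log 9.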

13.
Missing covariate data are common in observational studies of time to an event, especially when covariates are repeatedly measured over time. Failure to account for the missing data can lead to bias or loss of efficiency, especially when the data are non-ignorably missing. Previous work has focused on the case of fixed covariates rather than those that are repeatedly measured over the follow-up period, hence, here we present a selection model that allows for proportional hazards regression with time-varying covariates when some covariates may be non-ignorably missing. We develop a fully Bayesian model and obtain posterior estimates of the parameters via the Gibbs sampler in WinBUGS. We illustrate our model with an analysis of post-diagnosis weight change and survival after breast cancer diagnosis in the Long Island Breast Cancer Study Project follow-up study. Our results indicate that post-diagnosis weight gain is associated with lower all-cause and breast cancer-specific survival among women diagnosed with new primary breast cancer. Our sensitivity analysis showed only slight differences between models with different assumptions on the missing data mechanism yet the complete-case analysis yielded markedly different results.

14.
We propose a joint model for longitudinal and survival data with time-varying covariates subject to detection limits and intermittent missingness at random. The model is motivated by data from the Multicenter AIDS Cohort Study (MACS), in which HIV+ subjects have viral load and CD4 cell count measured at repeated visits along with survival data. We model the longitudinal component using a normal linear mixed model, modeling the trajectory of CD4 cell count by regressing on viral load, and other covariates. The viral load data are subject to both left censoring because of detection limits (17%) and intermittent missingness (27%). The survival component of the joint model is a Cox model with time-dependent covariates for death because of AIDS. The longitudinal and survival models are linked using the trajectory function of the linear mixed model. A Bayesian analysis is conducted on the MACS data using the proposed joint model. The proposed method is shown to improve the precision of estimates when compared with alternative methods. Copyright © 2014 John Wiley & Sons, Ltd.

15.
In statistical analysis, a regression model is needed if one is interested in the relationship between a response variable and covariates. When the response depends on a covariate, it may do so through some function of that covariate. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), in which the monotonicity constraints are built in. With missing data, one often employs the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work, as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for the isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
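The pool-adjacent-violators algorithm mentioned in the abstract is simple to state: scan the sequence, and whenever an adjacent pair of blocks violates the non-decreasing constraint, pool them and replace both with their mean. A minimal unweighted sketch (an illustration of the textbook algorithm, not code from the paper):

```python
def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block is [sum, count]; its fitted value is sum / count
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last's
        # (cross-multiplied to avoid division).
        while len(blocks) > 1 and \
                blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

print(pava([3, 1, 2, 4]))  # the initial violation pools into a flat block
```

Running the example gives [2.0, 2.0, 2.0, 4.0]: the decreasing prefix 3, 1, 2 is pooled to its mean 2, and the final 4 is left untouched.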

16.
Ye Ding 《Statistics in medicine》1995,14(14):1505-1512
The method of back-calculation estimates the number of HIV infections from AIDS incidence data and projects future AIDS incidence. We explore a conditional likelihood approach for computing estimates of the number of HIV infections and the parameters in the epidemic density. This method is asymptotically equivalent to the usual likelihood method. The asymptotic normal distribution of the estimates facilitates the computation of confidence intervals. We compute standard deviations for the estimates of HIV incidence and project AIDS incidence from the underlying multinomial distributions. We illustrate the methods with applications to AIDS data in the United States.
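At the core of back-calculation is a convolution: expected AIDS diagnoses in period t are the sum, over earlier infection periods, of the infection count times the probability of progressing to AIDS after the corresponding lag. A minimal sketch with illustrative numbers (hypothetical, not from the paper):

```python
def expected_aids(infections, incubation_pmf):
    """Convolve an infection curve with an incubation-time distribution.

    infections[s]      -- new HIV infections in period s
    incubation_pmf[d]  -- probability of AIDS diagnosis d periods after infection
    Returns expected AIDS diagnoses per period (same length as infections).
    """
    n = len(infections)
    out = [0.0] * n
    for s, h in enumerate(infections):
        for d, f in enumerate(incubation_pmf):
            if s + d < n:
                out[s + d] += h * f
    return out

# Hypothetical infection curve and a short incubation distribution.
result = expected_aids([100, 200, 150], [0.2, 0.5, 0.3])
print([round(v, 6) for v in result])  # → [20.0, 90.0, 160.0]
```

Back-calculation inverts this relationship: given observed AIDS counts and an estimated incubation distribution, it recovers the infection curve, typically by maximizing a (multinomial) likelihood subject to smoothness.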

17.
When investigating health disparities, it can be of interest to explore whether adjustment for socioeconomic factors at the neighborhood level can account for, or even reverse, an unadjusted difference. Recently, we proposed new methods to adjust the effect of an individual-level covariate for confounding by unmeasured neighborhood-level covariates using complex survey data and a generalization of conditional likelihood methods. Generalized linear mixed models (GLMMs) are a popular alternative to conditional likelihood methods in many circumstances. Therefore, in the present article, we propose and investigate a new adaptation of GLMMs for complex survey data that achieves the same goal of adjusting for confounding by unmeasured neighborhood-level covariates. With the new GLMM approach, one must correctly model the expectation of the unmeasured neighborhood-level effect as a function of the individual-level covariates. We demonstrate using simulations that even if that model is correct, census data on the individual-level covariates are sometimes required for consistent estimation of the effect of the individual-level covariate. We apply the new methods to investigate disparities in recency of dental cleaning, treated as an ordinal outcome, using data from the 2008 Florida Behavioral Risk Factor Surveillance System (BRFSS) survey. We operationalize neighborhood as zip code and merge the BRFSS data with census data on ZIP Code Tabulated Areas to incorporate census data on the individual-level covariates. We compare the new results to our previous analysis, which used conditional likelihood methods. We find that the results are qualitatively similar. Copyright © 2012 John Wiley & Sons, Ltd.

18.
Loss to follow-up (LTFU) is a common problem in many epidemiological studies. In antiretroviral treatment (ART) programs for patients with human immunodeficiency virus (HIV), mortality estimates can be biased if the LTFU mechanism is non-ignorable, that is, mortality differs between lost and retained patients. In this setting, routine procedures for handling missing data may lead to biased estimates. To appropriately deal with non-ignorable LTFU, explicit modeling of the missing data mechanism is needed. This can be based on additional outcome ascertainment for a sample of patients LTFU, for example, through linkage to national registries or through survey-based methods. In this paper, we demonstrate how this additional information can be used to construct estimators based on inverse probability weights (IPW) or multiple imputation. We use simulations to contrast the performance of the proposed estimators with methods widely used in HIV cohort research for dealing with missing data. The practical implications of our approach are illustrated using South African ART data, which are partially linkable to South African national vital registration data. Our results demonstrate that while IPWs and proper imputation procedures can be easily constructed from additional outcome ascertainment to obtain valid overall estimates, neglecting non-ignorable LTFU can result in substantial bias. We believe the proposed estimators are readily applicable to a growing number of studies where LTFU is appreciable, but additional outcome data are available through linkage or surveys of patients LTFU. Copyright © 2013 John Wiley & Sons, Ltd.
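The IPW idea the abstract describes can be sketched in a few lines: outcomes ascertained for a traced subsample of LTFU patients are up-weighted by the inverse of the tracing probability, so the lost group is represented at its true size. The numbers below are hypothetical (not the South African data) and the helper is illustrative:

```python
def ipw_mortality(deaths_retained, n_retained,
                  deaths_traced, n_traced, n_lost):
    """Mortality estimate weighting traced LTFU patients by n_lost / n_traced."""
    w = n_lost / n_traced  # inverse probability of being traced among the lost
    deaths = deaths_retained + deaths_traced * w
    total = n_retained + n_traced * w
    return deaths / total

# 80 retained patients (10 deaths); 20 lost, of whom 10 were traced (5 deaths).
naive = 10 / 80                         # retained-only estimate: 0.125
ipw = ipw_mortality(10, 80, 5, 10, 20)  # weighted estimate: 0.20
print(naive, ipw)
```

If mortality among the lost is truly higher (here 50% in the traced sample versus 12.5% among the retained), the naive retained-only estimate is biased downward while the weighted estimate recovers the overall rate.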

19.
Logistic regression is one of the most widely used regression models in practice, but alternatives to conventional maximum likelihood estimation methods may be more appropriate for small or sparse samples. Modification of the logistic regression score function to remove first-order bias is equivalent to penalizing the likelihood by the Jeffreys prior, and yields penalized maximum likelihood estimates (PLEs) that always exist, even in samples in which maximum likelihood estimates (MLEs) are infinite. PLEs are an attractive alternative in small-to-moderate-sized samples, and are preferred to exact conditional MLEs when there are continuous covariates. We present methods to construct confidence intervals (CI) in the penalized multinomial logistic regression model, and compare CI coverage and length for the PLE-based methods to that of conventional MLE-based methods in trinomial logistic regressions with both binary and continuous covariates. Based on simulation studies in sparse data sets, we recommend profile CIs over asymptotic Wald-type intervals for the PLEs in all cases. Furthermore, when finite sample bias and data separation are likely to occur, we prefer PLE profile CIs over MLE methods.

20.
When missing data occur in one or more covariates in a regression model, multiple imputation (MI) is widely advocated as an improvement over complete-case analysis (CC). We use theoretical arguments and simulation studies to compare these methods with MI implemented under a missing at random assumption. When data are missing completely at random, both methods have negligible bias, and MI is more efficient than CC across a wide range of scenarios. For other missing data mechanisms, bias arises in one or both methods. In our simulation setting, CC is biased towards the null when data are missing at random. However, when missingness is independent of the outcome given the covariates, CC has negligible bias and MI is biased away from the null. With more general missing data mechanisms, bias tends to be smaller for MI than for CC. Since MI is not always better than CC for missing covariate problems, the choice of method should take into account what is known about the missing data mechanism in a particular substantive application. Importantly, the choice of method should not be based on comparison of standard errors. We propose new ways to understand empirical differences between MI and CC, which may provide insights into the appropriateness of the assumptions underlying each method, and we propose a new index for assessing the likely gain in precision from MI: the fraction of incomplete cases among the observed values of a covariate (FICO). Copyright © 2010 John Wiley & Sons, Ltd.
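The FICO index has a direct computational reading (our interpretation of the stated definition, with a hypothetical helper and toy data): among the cases where a given covariate is observed, count the fraction that are incomplete, i.e. missing at least one other analysis variable.

```python
def fico(records, covariate):
    """Fraction of incomplete cases among records where `covariate` is observed.

    records: list of dicts mapping variable name -> value (None = missing).
    """
    observed = [r for r in records if r[covariate] is not None]
    incomplete = [r for r in observed
                  if any(v is None for k, v in r.items() if k != covariate)]
    return len(incomplete) / len(observed)

data = [
    {"x": 1.0, "z": 2.0},
    {"x": 3.0, "z": None},   # x observed, but the case is incomplete
    {"x": None, "z": 1.0},   # x missing: excluded from x's denominator
    {"x": 5.0, "z": 4.0},
]
print(fico(data, "x"))  # → 1/3
```

A high FICO for a covariate suggests MI can recover substantial information that CC would discard; a FICO near zero suggests little precision gain from imputing that covariate.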
