Similar Articles
20 similar articles found.
1.
In this paper we consider longitudinal studies in which the outcome to be measured over time is binary, and the covariates of interest are categorical. In longitudinal studies it is common for the outcomes and any time-varying covariates to be missing due to missed study visits, resulting in non-monotone patterns of missingness. Moreover, the reasons for missed visits may be related to the specific values of the response and/or covariates that should have been obtained, i.e., the missingness is non-ignorable. With non-monotone, non-ignorable missing response and covariate data, a full likelihood approach is quite complicated, and maximum likelihood estimation can be computationally prohibitive when there are many occasions of follow-up. Furthermore, the full likelihood must be correctly specified to obtain consistent parameter estimates. We propose a pseudo-likelihood method for jointly estimating the covariate effects on the marginal probabilities of the outcomes and the parameters of the missing data mechanism. The pseudo-likelihood requires specification of the marginal distributions of the missingness indicator, outcome, and possibly missing covariates at each occasion, but avoids making assumptions about the joint distribution of the data at two or more occasions. Thus, the proposed method can be considered semi-parametric. The proposed method extends the pseudo-likelihood approach of Troxel et al. to handle binary responses and possibly missing time-varying covariates. The method is illustrated using data from the Six Cities study, a longitudinal study of the health effects of air pollution.
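As a rough illustration of the occasion-wise factorization this abstract describes, the pseudo-likelihood can be sketched as below; the notation (y, x, r for outcome, covariate, and missingness indicator) is an assumption of this note, not taken from the paper.

```latex
% Occasion-wise log pseudo-likelihood: one-occasion marginal contributions
% replace the full T-occasion joint likelihood. y_{it}: binary outcome,
% x_{it}: possibly missing covariate, r_{it}: missingness indicator.
\ell_{p}(\theta) \;=\; \sum_{i=1}^{n} \sum_{t=1}^{T}
  \log f\!\left(y_{it},\, x_{it},\, r_{it};\, \theta\right)
```

At occasions with missing values, the contribution would integrate only over the missing values at that single occasion, which is what keeps the computation tractable compared with a full joint likelihood.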

2.
Despite the need for sensitivity analysis to nonignorable missingness in intensive longitudinal data (ILD), such analysis is greatly hindered by features peculiar to ILD, such as large data volume and complex nonmonotone missing-data patterns. The likelihood of alternative models permitting nonignorable missingness often involves very high-dimensional integrals, incurring the curse of dimensionality and rendering solutions computationally prohibitive. We aim to overcome this challenge by developing a computationally feasible method: nonlinear indexes of local sensitivity to nonignorability (NISNI). We use linear mixed effects models for the incomplete outcome and covariates, and Markov multinomial models to describe the complex missing-data patterns and mechanisms in ILD, thereby permitting missingness probabilities to depend directly on missing data. Using a second-order Taylor series to approximate the likelihood under nonignorability, we develop formulas and closed-form expressions for NISNI. Our approach permits the outcome and covariates to be missing simultaneously, as is often the case in ILD, and can capture a U-shaped impact of nonignorability in the neighborhood of the missing-at-random model without fitting alternative models or evaluating integrals. We evaluate the performance of this method using simulated data and real ILD collected by the ecological momentary assessment method.
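The local sensitivity idea can be summarized by a Taylor expansion of the estimate in a nonignorability parameter; the notation below (λ for the degree of nonignorability, with λ = 0 the MAR model) is generic, not quoted from the paper.

```latex
% theta-hat(lambda): the estimate refitted under nonignorability lambda.
% The first derivative is the classical linear sensitivity index; the
% second derivative is the nonlinear term that can capture a U-shaped
% impact of nonignorability near the MAR model (lambda = 0).
\hat{\theta}(\lambda) \;\approx\; \hat{\theta}(0)
  \;+\; \lambda \left.\frac{d \hat{\theta}(\lambda)}{d \lambda}\right|_{\lambda=0}
  \;+\; \frac{\lambda^{2}}{2}
        \left.\frac{d^{2} \hat{\theta}(\lambda)}{d \lambda^{2}}\right|_{\lambda=0}
```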

3.
In this paper, we analyze a two-level latent variable model for longitudinal data from the National Growth and Health Study, where surrogate outcomes or biomarkers and covariates are subject to missingness at any of the levels. A conventional method for efficient handling of missing data is to re-express the desired model as a joint distribution of the variables subject to missingness, including the biomarkers, conditional on all of the completely observed covariates, estimate the joint model by maximum likelihood, and then transform it back to the desired model. The joint model, however, generally identifies more parameters than desired. We show that the over-identified joint model produces biased estimation of the latent variable model, and we describe how to impose constraints on the joint model so that it has a one-to-one correspondence with the desired model, yielding unbiased estimation. The constrained joint model handles missing data efficiently under the assumption of ignorable missingness and is estimated by a modified application of the expectation-maximization (EM) algorithm.
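The constrained two-level model itself is beyond a short sketch, but the EM machinery the abstract invokes can be illustrated on a toy problem; everything below (a bivariate normal with one coordinate missing at random) is an assumption of this note, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: bivariate normal, second coordinate missing for ~30% of rows.
n = 500
mu_true = np.array([1.0, -0.5])
cov_true = np.array([[1.0, 0.6], [0.6, 2.0]])
y = rng.multivariate_normal(mu_true, cov_true, size=n)
miss = rng.random(n) < 0.3
y_obs = y.copy()
y_obs[miss, 1] = np.nan

# EM for (mu, Sigma) under ignorable missingness in the second coordinate.
mu = np.nanmean(y_obs, axis=0)
sigma = np.eye(2)
for _ in range(200):
    # E-step: conditional mean and variance of missing y2 given observed y1.
    beta = sigma[0, 1] / sigma[0, 0]
    cond_var = sigma[1, 1] - beta * sigma[0, 1]
    y2 = np.where(miss, mu[1] + beta * (y_obs[:, 0] - mu[0]), y_obs[:, 1])
    # Completed second moments: add the conditional variance for imputed rows.
    s11 = np.mean(y_obs[:, 0] ** 2)
    s12 = np.mean(y_obs[:, 0] * y2)
    s22 = np.mean(y2 ** 2 + np.where(miss, cond_var, 0.0))
    # M-step: update mean and covariance from the completed statistics.
    mu = np.array([np.mean(y_obs[:, 0]), np.mean(y2)])
    sigma = np.array([[s11 - mu[0] ** 2, s12 - mu[0] * mu[1]],
                      [s12 - mu[0] * mu[1], s22 - mu[1] ** 2]])

print("mu =", mu.round(3))
print("Sigma =", sigma.round(3))
```

The E-step replaces missing sufficient statistics by their conditional expectations; the M-step then updates the parameters from the completed statistics, which is the same pattern a constrained joint model would follow.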

4.
Hong Zhu. Statistics in Medicine 2014, 33(14): 2467-2479.
Regression methods for survival data with right censoring have been studied extensively under semiparametric transformation models such as the Cox regression model and the proportional odds model. However, their practical application can be limited by possible violations of the model assumptions or, in some cases, by a lack of ready interpretation for the regression coefficients. As an alternative, in this paper the proportional likelihood ratio model introduced by Luo and Tsai is extended to flexibly model the relationship between survival outcomes and covariates. This model has a natural connection with many important semiparametric models, such as the generalized linear model and the density ratio model, and is closely related to biased sampling problems. Compared with semiparametric transformation models, the proportional likelihood ratio model is appealing and practical in many ways because of its flexibility and direct clinical interpretation. We present two likelihood approaches for estimation and inference on the target regression parameters under independent and dependent censoring assumptions. Based on a conditional likelihood approach using uncensored failure times, a numerically simple estimation procedure is developed by maximizing a pairwise pseudo-likelihood. We also develop a full likelihood approach, in which the most efficient maximum likelihood estimator is obtained by a profile likelihood. Simulation studies are conducted to assess the finite-sample properties of the proposed estimators and to compare the efficiency of the two likelihood approaches. An application to survival data for acute leukemia patients who underwent bone marrow transplantation illustrates the proposed method and other approaches for handling non-proportionality; the relative merits of these methods are discussed in concluding remarks.
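To make the pairwise pseudo-likelihood concrete, here is a minimal sketch under the proportional likelihood ratio model with g(y) = y and normal toy data; the data-generating choices and function names are assumptions of this note, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Toy data consistent with a likelihood-ratio (density-ratio) model with
# g(y) = y: if Y | X = x is N(beta * x, 1), then f(y | x) / f(y | 0) is
# proportional to exp(beta * x * y).
n = 150
x = rng.normal(size=n)
beta_true = 0.8
y = rng.normal(loc=beta_true * x, scale=1.0)

# Pairwise pseudo-likelihood: for each pair (i, j), the probability of the
# observed pairing versus the swapped one is logistic in
# beta * (x_i - x_j) * (y_i - y_j).
iu = np.triu_indices(n, k=1)
dxy = ((x[:, None] - x[None, :]) * (y[:, None] - y[None, :]))[iu]

def neg_pairwise_pll(b):
    z = b[0] * dxy
    return -np.sum(z - np.logaddexp(0.0, z))   # sum of log sigmoid(z)

fit = minimize(neg_pairwise_pll, x0=np.array([0.0]), method="BFGS")
print("pairwise pseudo-likelihood estimate of beta:", fit.x[0].round(3))
```

Conditioning on each pair removes the nonparametric baseline density, which is why each pair contributes a simple logistic term.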

5.
We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non-ignorable missingness. Estimation is carried out by numerical maximization of the joint likelihood function, with a close approximation of the accompanying Hessian matrix, using shareable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing of whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method.
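The likelihood ratio test for MAR versus non-ignorable missingness reduces to standard LRT mechanics once both models are fitted; the log-likelihood values below are purely illustrative placeholders, not output from the paper.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical maximized log-likelihoods: the MAR model fixes the
# nonignorability parameter at 0; the MNAR model leaves it free.
ll_mar, ll_mnar = -1234.6, -1231.1   # illustrative numbers, not real output
df = 1                               # one extra parameter in the MNAR model

lrt = 2.0 * (ll_mnar - ll_mar)       # likelihood ratio statistic
p_value = chi2.sf(lrt, df)           # chi-squared reference distribution
print(f"LRT = {lrt:.2f}, p = {p_value:.4f}")  # small p: evidence against MAR
```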

6.
Missing outcome data are a serious threat to the validity of treatment effect estimates from randomized trials. The outcome distributions of participants with missing and observed data are often different, which increases bias. Causal inference methods can reduce this bias and improve efficiency by incorporating baseline variables into the analysis. In particular, doubly robust estimators incorporate two nuisance parameters, the outcome regression and the missingness mechanism (i.e., the probability of missingness conditional on treatment assignment and baseline variables), to adjust for differences between the observed and unobserved groups that can be explained by observed covariates. To estimate the treatment effect consistently, at least one of these nuisance parameters must be estimated consistently. Traditionally, nuisance parameters are estimated using parametric models, which often precludes consistency, particularly in moderate to high dimensions. Recent research on missing data has focused on data-adaptive estimation to help achieve consistency, but the large-sample properties of such methods are poorly understood. In this article, we discuss a doubly robust estimator that is consistent and asymptotically normal under data-adaptive estimation of the nuisance parameters, and we provide a formula for an asymptotically exact confidence interval under minimal assumptions. We show that our proposed estimator has smaller finite-sample bias than standard doubly robust estimators. We present a simulation study demonstrating the enhanced performance of our estimators in terms of bias, efficiency, and coverage of the confidence intervals, and we illustrate the approach with a randomized, double-blind phase 2/3 trial of antiretroviral therapy in HIV-infected persons.
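A minimal sketch of the doubly robust (AIPW) construction, simplified to estimating a single mean with missing outcomes rather than a trial treatment effect, and using plain parametric nuisance fits in place of the data-adaptive estimators the paper studies:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(2)

# Toy setup: one baseline covariate x; the outcome y is missing for some
# units, with the probability of observation depending on x.
n = 2000
x = rng.normal(size=(n, 1))
y = 1.0 + 2.0 * x[:, 0] + rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x[:, 0])))
r = rng.random(n) < pi_true            # r = 1: outcome observed
y_seen = np.where(r, y, 0.0)           # missing outcomes never enter (r * y)

# Nuisance 1: outcome regression m(x), fit on complete cases.
m_hat = LinearRegression().fit(x[r], y[r]).predict(x)
# Nuisance 2: missingness mechanism pi(x) = P(R = 1 | X = x).
pi_hat = LogisticRegression().fit(x, r).predict_proba(x)[:, 1]

# Doubly robust (AIPW) estimate of E[Y]: consistent if either nuisance
# model is correctly specified.
aipw = np.mean(r * y_seen / pi_hat - (r - pi_hat) / pi_hat * m_hat)
print("AIPW estimate of E[Y]:", aipw.round(3), "(true value: 1.0)")
```

Because the augmentation term has mean zero whenever either π(x) or m(x) is correct, misspecifying one of the two nuisance models still leaves the estimator consistent.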

7.
Quality of life (QOL) is an important outcome in clinical research, particularly in cancer clinical trials. Typically, data are collected longitudinally from patients during treatment and subsequent follow-up. Missing data are a common problem, and missingness may arise in a non-ignorable fashion: in particular, the probability that a patient misses an assessment may depend on the patient's QOL at the time of the scheduled assessment. We propose a Markov chain model for the analysis of categorical outcomes derived from QOL measures. Our model assumes that transitions between QOL states depend on covariates through generalized logit models or proportional odds models. To account for non-ignorable missingness, we incorporate logistic regression models for the conditional probabilities of observing measurements given their actual values. The model can accommodate time-dependent covariates. Estimation is by maximum likelihood, summing over all possible values of the missing measurements. We describe options for selecting parsimonious models, study the finite-sample properties of the estimators by simulation, and apply the techniques to data from a breast cancer clinical trial in which QOL assessments were made longitudinally and missing data frequently arose.
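The summation over missing measurements has a simple form for a Markov chain: under ignorable missingness it amounts to propagating the state distribution through the transition matrix. The toy sketch below shows that building block only; the paper's non-ignorable extension additionally multiplies in logistic observation models, which this sketch omits.

```python
import numpy as np

# Toy two-state QOL chain; None marks a missed assessment.
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])            # transition matrix, rows sum to 1
init = np.array([0.6, 0.4])           # initial state distribution
seq = [0, None, None, 1, 0]           # observed states with missed visits

f = init.copy()                        # forward state distribution
loglik = 0.0
for t, s in enumerate(seq):
    if t > 0:
        f = f @ P                      # one Chapman-Kolmogorov step; a missed
                                       # visit is marginalized by propagating
                                       # without conditioning
    if s is not None:
        loglik += np.log(f[s])         # condition on the observed state
        f = np.eye(2)[s]               # point mass at the observed state
print("log-likelihood of the partially observed path:", round(loglik, 4))
```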

8.
When investigating health disparities, it can be of interest to explore whether adjustment for socioeconomic factors at the neighborhood level can account for, or even reverse, an unadjusted difference. Recently, we proposed new methods to adjust the effect of an individual-level covariate for confounding by unmeasured neighborhood-level covariates, using complex survey data and a generalization of conditional likelihood methods. Generalized linear mixed models (GLMMs) are a popular alternative to conditional likelihood methods in many circumstances. In the present article, we therefore propose and investigate a new adaptation of GLMMs for complex survey data that achieves the same goal of adjusting for confounding by unmeasured neighborhood-level covariates. With the new GLMM approach, one must correctly model the expectation of the unmeasured neighborhood-level effect as a function of the individual-level covariates. We demonstrate using simulations that, even when that model is correct, census data on the individual-level covariates are sometimes required for consistent estimation of the effect of the individual-level covariate. We apply the new methods to investigate disparities in recency of dental cleaning, treated as an ordinal outcome, using data from the 2008 Florida Behavioral Risk Factor Surveillance System (BRFSS) survey. We operationalize neighborhood as ZIP code and merge the BRFSS data with census data on ZIP Code Tabulation Areas to incorporate census data on the individual-level covariates. We compare the new results to our previous analysis, which used conditional likelihood methods, and find that the results are qualitatively similar.

9.
Proportional hazards models are among the most popular regression models in survival analysis. Multi-state models generalize them by jointly considering different types of events and their interrelations, whereas frailty models incorporate random effects to account for unobserved risk factors, possibly shared by clusters of subjects. Integrating multi-state and frailty methodology is an attractive way to control for unobserved heterogeneity in the presence of complex event history structures and is particularly appealing for multicenter clinical trials. We propose incorporating correlated frailties in the transition-specific hazard functions via a nested hierarchy, and we study a semiparametric estimation approach based on maximum integrated partial likelihood. We show in a simulation study that the nested frailty multi-state model improves the estimation of covariate effects as well as the coverage probability of their confidence intervals. We present a case study of a prostate cancer multicenter clinical trial, where the multi-state nature of the model allows us to demonstrate the effect of treatment on death while taking intermediate events into account.
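In generic notation (ours, not the paper's), a transition-specific hazard with nested frailties might be written as:

```latex
% Hazard of transition k for subject l in center i. u_i is a center-level
% frailty shared across transitions; v_{ik} is a center-by-transition
% frailty nested within it; both symbols are assumptions of this sketch.
\lambda_{k}^{(il)}(t) \;=\; u_i \, v_{ik} \,
  \lambda_{0k}(t) \, \exp\!\left(\beta_k^{\top} X_{il}\right)
```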

11.
The use of longitudinal measurements to predict a categorical outcome is an increasingly common goal in research studies. Joint models are commonly used to describe two or more models simultaneously by accounting for the correlated nature of their outcomes and for the random error present in the longitudinal measurements. However, there is limited research on joint models with longitudinal predictors and categorical cross-sectional outcomes. Perhaps the most challenging task is modeling the longitudinal predictor process so that it represents the true biological mechanism that dictates the association with the categorical response. We propose a joint logistic regression and Markov chain model to describe a binary cross-sectional response, where the unobserved transition rates of a two-state continuous-time Markov chain are included as covariates. We use maximum likelihood to estimate the parameters of our model. In a simulation study, coverage probabilities of about 95%, standard deviations close to standard errors, and low biases for the parameter values show that our estimation method is adequate. We apply the proposed joint model to a dataset of patients with traumatic brain injury to describe and predict a 6-month outcome based on physiological data collected post-injury and on admission characteristics. Our analysis indicates that the information provided by physiological changes over time may help improve prediction of the long-term functional status of these severely ill subjects.
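For a two-state continuous-time Markov chain with transition rates λ (state 0 to 1) and μ (state 1 to 0), the transition probabilities have the closed form below; in a joint model of this kind, subject-specific rates would enter the logistic regression as covariates.

```latex
% Closed-form transition probabilities of the two-state chain:
P_{01}(t) \;=\; \frac{\lambda}{\lambda+\mu}\left(1 - e^{-(\lambda+\mu)t}\right),
\qquad
P_{00}(t) \;=\; \frac{\mu}{\lambda+\mu} \;+\; \frac{\lambda}{\lambda+\mu}\, e^{-(\lambda+\mu)t}
```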

12.
Causal inference with observational longitudinal data and time-varying exposures is complicated by the potential for time-dependent confounding and unmeasured confounding. Most causal inference methods that handle time-dependent confounding rely on either the assumption of no unmeasured confounders or the availability of an unconfounded variable that is associated with the exposure (e.g., an instrumental variable). Furthermore, when data are incomplete, the validity of many methods depends on the assumption of missing at random. We propose an approach that combines a parametric joint mixed-effects model for the study outcome and the exposure with g-computation to identify and estimate causal effects in the presence of time-dependent and unmeasured confounding. G-computation can estimate participant-specific or population-average causal effects using the parameters of the joint model. The joint model is a type of shared parameter model in which the outcome and exposure-selection models share common random effects. We also extend the joint model to handle missing data and truncation by death when missingness is possibly not at random. We evaluate the performance of the proposed method in simulation studies and compare it to linear mixed- and fixed-effects models combined with g-computation, as well as to targeted maximum likelihood estimation. We apply the method to an epidemiologic study of vitamin D and depressive symptoms in older adults and include SAS PROC NLMIXED code to enhance the accessibility of the method to applied researchers.
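A stripped-down sketch of the g-computation step on a toy two-visit structure with time-dependent confounding; the simple linear models and the "always treat" regime below are assumptions of this note, not the paper's shared-parameter joint model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Toy structure: A1 -> L (intermediate covariate) -> A2, and Y depends
# on A1, L, and A2 (time-dependent confounding by L).
n = 5000
a1 = rng.binomial(1, 0.5, n)
l = 0.5 * a1 + rng.normal(size=n)
a2 = rng.binomial(1, 1 / (1 + np.exp(-l)))
y = 1.0 * a1 + 0.7 * l + 1.0 * a2 + rng.normal(size=n)

# Step 1: fit parametric models for L given A1 and for Y given (A1, L, A2).
l_model = LinearRegression().fit(a1.reshape(-1, 1), l)
y_model = LinearRegression().fit(np.column_stack([a1, l, a2]), y)

# Step 2: g-computation for the static regime "always treat" (a1 = a2 = 1):
# simulate L under a1 = 1, then average the predicted outcome under a2 = 1.
m = 100_000
l_sim = l_model.predict(np.ones((m, 1))) + rng.normal(size=m)  # unit error SD,
                                                               # true in this toy
y_sim = y_model.predict(np.column_stack([np.ones(m), l_sim, np.ones(m)]))
print("g-formula mean outcome under always-treat:", y_sim.mean().round(3))
```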

13.
For the prognosis of complex diseases, gene-environment (G-E) interactions play an important role beyond the main effects of genetic (G) and environmental (E) factors. Many approaches have been developed for detecting important G-E interactions, most of which assume that measurements are complete. In practical data analysis, missingness in E measurements is not uncommon, and failing to accommodate it properly leads to biased estimation and false marker identification. In this study, we conduct G-E interaction analysis of prognosis data under an accelerated failure time (AFT) model. To accommodate missingness in the E measurements, we adopt a nonparametric, kernel-based data augmentation approach. With a well-designed weighting scheme, a useful byproduct is that the proposed approach enjoys a certain robustness property. A penalization approach that respects the "main effects, interactions" hierarchy is adopted for selecting important interactions and main effects and for regularized estimation. The proposed approach has sound interpretations and a solid statistical basis, and it outperforms multiple alternatives in simulations. The analysis of TCGA data on lung cancer and melanoma leads to interesting findings and to models with superior prediction.
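In generic notation (assumed here, not quoted from the paper), the AFT model with G-E interactions underlying such an analysis can be written as follows; the hierarchy constraint then permits an interaction coefficient to be nonzero only when its corresponding main effects are selected.

```latex
% AFT model with gene (G), environment (E), and G-E interaction effects;
% T is the survival time and epsilon a random error term.
\log T \;=\; \alpha \;+\; \sum_{j} \beta_j G_j \;+\; \sum_{k} \gamma_k E_k
  \;+\; \sum_{j,k} \eta_{jk}\, G_j E_k \;+\; \epsilon
```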

14.
There is often a need to assess the dependence of standard analyses on the strong, untestable assumption of ignorable missingness. To tackle this problem, past research developed simple sensitivity index measures that assume a linear impact of nonignorability and missingness in the outcomes only. These restrictions limit their applicability for studies with missingness in both the outcome and the covariates. Nonignorable missingness in this setting poses significant new analytic challenges and calls for more general and flexible methods that remain computationally tractable even for large datasets. In this paper, we relax the restrictions of extant linear sensitivity index methods and develop nonlinear sensitivity indices that maintain computational simplicity and perform equally well when the impact of nonignorability is locally linear. They can also substantially improve the effectiveness of local sensitivity analysis when regression outcomes and covariates are subject to concurrent missingness: in this situation, local linear sensitivity analysis fails to detect the impact of nonignorability, whereas the proposed nonlinear sensitivity measures can. Because the new sensitivity indices avoid fitting complicated nonignorable models, they are computationally tractable (i.e., scalable) for large datasets. We develop general formulas for the nonlinear sensitivity index measures and evaluate the new measures in simulated data and in a real dataset collected using the ecological momentary assessment method.
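A generic numerical stand-in for the idea (not the paper's closed-form indices): refit an estimate under a small amount of nonignorability λ and take finite-difference derivatives at the MAR model λ = 0. The exponential-tilting form of the toy selection weight is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
y_obs = rng.normal(loc=1.0, scale=1.0, size=2000)

def theta_hat(lam):
    # Toy stand-in for "refit the model under nonignorability lambda":
    # an exponentially tilted mean of the observed data.
    w = np.exp(lam * y_obs)          # selection-bias weights exp(lambda * y)
    return np.sum(w * y_obs) / np.sum(w)

# First- and second-order local sensitivity at the MAR model (lambda = 0),
# by central finite differences: the linear and nonlinear indices.
h = 1e-3
idx1 = (theta_hat(h) - theta_hat(-h)) / (2 * h)
idx2 = (theta_hat(h) - 2 * theta_hat(0.0) + theta_hat(-h)) / h**2
print(f"linear index: {idx1:.3f}, nonlinear (2nd-order) index: {idx2:.3f}")
```

For this tilted normal mean the linear index approximates the variance of y, while the second-order index tracks skewness, illustrating how the two indices capture different aspects of the impact of nonignorability.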

15.
We propose a new weighted hurdle regression method for modeling count data, with particular interest in modeling cardiovascular events in patients on dialysis. Cardiovascular disease remains one of the leading causes of hospitalization and death in this population. Our aim is to jointly model the association between covariates and (i) the probability of cardiovascular events, a binary process, and (ii) the rate of events once the realization is positive (when the 'hurdle' is crossed), using a zero-truncated Poisson distribution. When the observation period or follow-up time from the start of dialysis varies among individuals, the estimated probability of positive cardiovascular events during the study period will be biased; when the model contains covariates, the estimated relationship between the covariates and the probability of cardiovascular events will also be biased. These challenges are addressed by the proposed weighted hurdle regression method. Estimation is based on a weighted likelihood approach, for which standard maximum likelihood machinery can be used. The method is illustrated with data from the United States Renal Data System. Simulation studies show the ability of the proposed method to adjust successfully for differential follow-up times and to incorporate the effects of covariates in the weighting.
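For reference, a plain (unweighted) hurdle log-likelihood with follow-up time entering the count part as a multiplicative offset; the paper's contribution is the weighting scheme layered on top of this, which the sketch below does not reproduce.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

rng = np.random.default_rng(5)

# Toy data: covariate x, follow-up time t (the source of differential
# follow-up), a binary hurdle part, and a zero-truncated Poisson count part.
n = 1000
x = rng.normal(size=n)
t = rng.uniform(0.5, 2.0, size=n)
pos = rng.random(n) < expit(-0.5 + 1.0 * x)      # hurdle crossed?
mu = np.exp(0.2 + 0.5 * x) * t                   # rate times follow-up
y = np.zeros(n, dtype=int)
draws = rng.poisson(mu[pos])
while np.any(draws == 0):                        # redraw zeros: zero-truncation
    z = draws == 0
    draws[z] = rng.poisson(mu[pos][z])
y[pos] = draws

def neg_loglik(theta):
    a0, a1, b0, b1 = theta
    p = expit(a0 + a1 * x)                       # P(any event)
    m = np.exp(b0 + b1 * x) * t                  # truncated-Poisson mean, offset t
    ll_bin = np.where(y > 0, np.log(p), np.log1p(-p))
    ll_pos = y * np.log(m) - m - gammaln(y + 1) - np.log1p(-np.exp(-m))
    return -np.mean(ll_bin + np.where(y > 0, ll_pos, 0.0))

fit = minimize(neg_loglik, x0=np.zeros(4), method="BFGS")
print("estimates (a0, a1, b0, b1):", fit.x.round(2))  # truth: -0.5, 1.0, 0.2, 0.5
```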

16.
Multiple imputation is a strategy for the analysis of incomplete data that mitigates the impact of the missingness on the power and bias of estimates. When data from multiple studies are collated, we can propose both within-study and multilevel imputation models to impute missing data on covariates. It is not clear how to choose between imputation models or how to combine imputation and inverse-variance weighted meta-analysis methods. This is especially important because different studies often measure different variables, meaning that we may need to impute data on a variable that is systematically missing in a particular study. In this paper, we consider a simulation analysis of sporadically missing data in a single covariate with a linear analysis model and discuss how the results would apply to the case of systematically missing data. We find in this context that ensuring the congeniality of the imputation and analysis models is important for obtaining correct standard errors and confidence intervals. For example, if the analysis model allows between-study heterogeneity of a parameter, then we should incorporate this heterogeneity into the imputation model to maintain the congeniality of the two models. In an inverse-variance weighted meta-analysis, we should impute missing data and apply Rubin's rules at the study level prior to meta-analysis, rather than meta-analyzing each of the multiple imputations and then combining the meta-analysis estimates using Rubin's rules. We illustrate the results using data from the Emerging Risk Factors Collaboration.
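Rubin's rules at the study level are simple enough to state in a few lines; the numbers below are hypothetical study-level results, not data from the paper.

```python
import numpy as np

def rubin_pool(estimates, variances):
    """Combine m imputation-specific estimates and variances by Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()                      # pooled point estimate
    w = variances.mean()                         # within-imputation variance
    b = estimates.var(ddof=1)                    # between-imputation variance
    total_var = w + (1 + 1 / m) * b              # Rubin's total variance
    return qbar, total_var

# Hypothetical study-level results from m = 5 imputations of one study.
qbar, tv = rubin_pool([0.41, 0.38, 0.45, 0.40, 0.39],
                      [0.020, 0.021, 0.019, 0.020, 0.022])
print(f"pooled estimate {qbar:.3f}, standard error {np.sqrt(tv):.3f}")
```

The pooled estimate and its total variance from each study would then feed into the inverse-variance weighted meta-analysis as usual.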

17.
We consider random effects meta-analysis where the outcome variable is the occurrence of some event of interest. The data structures handled are those in which each study has one or more groups, and each group provides either the numbers of subjects with and without the event or the number of events and the total duration of follow-up. Traditionally, the meta-analysis follows the summary measures approach, based on estimates of the outcome measure(s) and the corresponding standard error(s). This approach assumes an approximate normal within-study likelihood and treats the standard errors as known. It has several potential disadvantages: it does not account for the standard errors being estimated or for correlation between the estimate and the standard error, it relies on an (arbitrary) continuity correction in the case of zero events, and the normal approximation can be poor in studies with few events. We show that these problems can be overcome in most practical cases by replacing the approximate normal within-study likelihood with the appropriate exact likelihood. This leads to a generalized linear mixed model that can be fitted in standard statistical software. For instance, odds ratio meta-analysis can use the non-central hypergeometric likelihood, leading to mixed-effects conditional logistic regression; incidence rate ratio meta-analysis leads to random effects logistic regression with an offset variable. We also present bivariate and multivariate extensions. We present a number of examples, especially with rare events, including an example of network meta-analysis.
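To see why rate ratio meta-analysis reduces to logistic regression with an offset: conditional on the total number of events in a study, the treatment-arm event count is binomial with logit equal to the log rate ratio plus the log person-time ratio. A fixed-effect sketch with hypothetical data follows (the paper's version adds random effects across studies):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical per-study data: events and person-time in each arm.
e1 = np.array([4, 9, 2, 7])               # events, treatment arm
e0 = np.array([8, 14, 5, 10])             # events, control arm
t1 = np.array([120., 300., 80., 250.])    # person-time, treatment
t0 = np.array([115., 310., 90., 240.])    # person-time, control

# Conditional binomial likelihood: logit(pi) = log(RR) + log(t1 / t0),
# i.e. logistic regression with an offset and intercept = log rate ratio.
endog = np.column_stack([e1, e0])          # successes / failures per study
exog = np.ones((len(e1), 1))               # intercept only (fixed effect)
offset = np.log(t1 / t0)
fit = sm.GLM(endog, exog, family=sm.families.Binomial(), offset=offset).fit()
print("log IRR:", fit.params[0].round(3),
      " IRR:", np.exp(fit.params[0]).round(3))
```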

18.
We propose a joint model for longitudinal and survival data with time-varying covariates subject to detection limits and to intermittent missingness at random. The model is motivated by data from the Multicenter AIDS Cohort Study (MACS), in which HIV+ subjects have viral load and CD4 cell count measured at repeated visits along with survival data. We model the longitudinal component using a normal linear mixed model, modeling the trajectory of CD4 cell count by regressing on viral load and other covariates. The viral load data are subject to both left censoring due to detection limits (17%) and intermittent missingness (27%). The survival component of the joint model is a Cox model with time-dependent covariates for death due to AIDS. The longitudinal and survival models are linked through the trajectory function of the linear mixed model. A Bayesian analysis is conducted on the MACS data using the proposed joint model, and the proposed method is shown to improve the precision of estimates compared with alternative methods.
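The trajectory linkage the abstract mentions is typically of the following form (generic joint-model notation, assumed here rather than quoted from the paper):

```latex
% m_i(t): subject i's linear-mixed-model trajectory (e.g., modeled CD4);
% alpha: association between the trajectory and the hazard of death.
h_i(t) \;=\; h_0(t)\, \exp\!\left(\gamma^{\top} Z_i \;+\; \alpha\, m_i(t)\right)
```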

19.
The linear mixed effects model based on a full likelihood is one of the few methods available to model longitudinal data subject to left censoring. However, a full likelihood approach is algebraically complicated because of the large dimension of the numerical computations, and maximum likelihood estimation can be computationally prohibitive when the data are heavily censored. Moreover, for mixed models, the complexity of the computation increases with the dimension of the random effects. We propose a method based on pseudo likelihood that simplifies the computations, allows a wide class of multivariate models, and can be used for many different data structures, including settings where the level of censoring is high. The motivation for this work comes from the need for a joint model to assess the joint effect of pro-inflammatory and anti-inflammatory biomarkers on 30-day mortality status, while simultaneously accounting for longitudinal left censoring and correlation between markers, in the analysis of the Genetic and Inflammatory Markers of Sepsis study conducted at the University of Pittsburgh. Two markers, interleukin-6 and interleukin-10, which are naturally correlated because they share similar biological pathways and are left-censored because of the limited sensitivity of the assays, are considered to determine whether higher levels of these markers are associated with an increased risk of death after accounting for the left censoring and their assumed correlation.
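The left-censored building block of such likelihoods is the Tobit-type contribution: a normal density for observed values and a normal CDF for values below the detection limit. A univariate toy sketch follows (the paper's pseudo-likelihood stitches together low-dimensional pieces like this for correlated markers):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(6)

# Toy left-censored biomarker: values below the assay detection limit are
# recorded at the limit; their likelihood contribution is the normal CDF.
n = 400
dl = -0.5                                  # detection limit
z = rng.normal(loc=0.2, scale=1.3, size=n)
censored = z < dl
y = np.where(censored, dl, z)

def neg_loglik(theta):
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)                  # log-parameterized: sigma > 0
    ll_obs = norm.logpdf(y, mu, sigma)         # density for observed values
    ll_cen = norm.logcdf((dl - mu) / sigma)    # P(Z < detection limit)
    return -np.sum(np.where(censored, ll_cen, ll_obs))

fit = minimize(neg_loglik, x0=np.array([0.0, 0.0]), method="BFGS")
print("mu, sigma:", fit.x[0].round(3), np.exp(fit.x[1]).round(3))
```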

20.
We propose a semiparametric marginal modeling approach for longitudinal analysis of cohorts with data missing due to death and non-response, in which the regression parameters are interpreted as conditional on being alive. Our proposed method accommodates outcomes and time-dependent covariates that are missing not at random, with non-monotone missingness patterns, via inverse-probability weighting. Missing covariates are replaced by consistent estimates derived from a simultaneously solved inverse-probability-weighted estimating equation. We thereby use data points with observed outcomes but missing covariates beyond their contribution to the estimated weights, while avoiding numerical integration over the missing covariates. The approach is applied to a cohort of elderly female hip fracture patients to estimate the prevalence of walking disability over time as a function of body composition, inflammation, and age.
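A minimal inverse-probability-weighting sketch on a toy missing-at-random problem; the paper's estimating equations handle non-monotone MNAR missingness and missing covariates jointly, which this sketch does not attempt.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

# Toy IPW for a binary outcome observed only for some subjects (r = 1).
n = 3000
x = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.0 * x))))
p_obs = 1 / (1 + np.exp(-(1.0 + 0.8 * x)))     # response depends on x
r = rng.random(n) < p_obs

# Step 1: model the probability of being observed and form inverse weights.
X = sm.add_constant(x)
pi_hat = sm.GLM(r.astype(float), X, family=sm.families.Binomial()).fit().predict(X)
w = 1.0 / pi_hat[r]

# Step 2: solve the weighted estimating equation via a weighted GLM on the
# complete cases (model-based SEs shown; sandwich SEs would be needed for
# honest inference, which this sketch omits).
fit = sm.GLM(y[r].astype(float), X[r], family=sm.families.Binomial(),
             var_weights=w).fit()
print(fit.params.round(3))   # true coefficients: (0.5, 1.0)
```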
