Similar articles
 20 similar articles found (search time: 31 ms)
1.
A time‐varying latent variable model is proposed to jointly analyze multivariate mixed‐support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which are the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent of the covariate distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state‐specific distributions for the covariates, with the aim of improving the recovery of the clusters in the data relative to a fixed‐covariates paradigm. The class of hidden Markov regression models with random covariates is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation‐maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data.

2.
This paper presents a Bayesian adaptive group least absolute shrinkage and selection operator method to conduct simultaneous model selection and estimation under semiparametric hidden Markov models. We specify the conditional regression model and the transition probability model in the hidden Markov model as additive nonparametric functions of covariates. A basis expansion is adopted to approximate the nonparametric functions. We introduce multivariate conditional Laplace priors to impose adaptive penalties on regression coefficients and different groups of basis expansions under the Bayesian framework. An efficient Markov chain Monte Carlo algorithm is then proposed to identify the nonexistent, constant, linear, and nonlinear forms of covariate effects in both conditional and transition models. The empirical performance of the proposed methodology is evaluated via simulation studies. We apply the proposed model to analyze a real data set that was collected from the Alzheimer's Disease Neuroimaging Initiative study. The analysis identifies important risk factors for cognitive decline and the transition from cognitively normal to Alzheimer's disease.

3.
Time index‐ordered random variables are said to be antedependent (AD) of order (p1, p2, …, pn) if the kth variable, conditioned on the pk immediately preceding variables, is independent of all further preceding variables. Inferential methods associated with AD models are well developed for continuous (primarily normal) longitudinal data, but not for categorical longitudinal data. In this article, we develop likelihood‐based inferential procedures for unstructured AD models for categorical longitudinal data. Specifically, we derive maximum likelihood estimators (MLEs) of model parameters; penalized likelihood criteria and likelihood ratio tests for determining the order of antedependence; and likelihood ratio tests for homogeneity across groups, time invariance of transition probabilities, and strict stationarity. We give closed‐form expressions for MLEs and test statistics, which allow for the possibility of empty cells and monotone missing data, for all cases save strict stationarity. For data with an arbitrary missingness pattern, we derive an efficient restricted expectation–maximization algorithm for obtaining MLEs. We evaluate the performance of the tests by simulation. We apply the methods to longitudinal studies of toenail infection severity (measured on a binary scale) and Alzheimer's disease severity (measured on an ordinal scale). The analysis of the toenail infection severity data reveals interesting nonstationary behavior of the transition probabilities and indicates that an unstructured first‐order AD model is superior to stationary and other structured first‐order AD models that have previously been fit to these data. The analysis of the Alzheimer's severity data indicates that the antedependence is second order with time‐invariant transition probabilities, suggesting the use of a second‐order autoregressive cumulative logit model. Copyright © 2013 John Wiley & Sons, Ltd.
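The defining property in the first sentence of this abstract can be written as a single conditional-independence statement. The notation $Y_1, \ldots, Y_n$ for the time-ordered variables is an assumption made here for illustration; the abstract does not name them.

```latex
\[
  P\left(Y_k \mid Y_{k-1}, \ldots, Y_1\right)
  = P\left(Y_k \mid Y_{k-1}, \ldots, Y_{k-p_k}\right),
  \qquad k = 2, \ldots, n, \quad 0 \le p_k \le k-1,
\]
% The conditioning set on the right is empty when p_k = 0.
```

With $p_k = 1$ for every $k \ge 2$, this reduces to a first-order Markov structure whose transition probabilities may still differ from one time point to the next.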

4.
Hidden Markov models (HMMs) are frequently used to analyse longitudinal data, where the same set of subjects is repeatedly observed over time. In this context, several sources of heterogeneity may arise at individual and/or time level, which affect the hidden process, that is, the transition probabilities between the hidden states. In this paper, we propose the use of a finite mixture of non-homogeneous HMMs (NH-HMMs) to address the heterogeneity problem. The non-homogeneity of the model allows us to take into account observed sources of heterogeneity by means of a proper set of covariates, time and/or individual dependent, explaining the variations in the transition probabilities. Moreover, we handle the unobserved sources of heterogeneity at the individual level, due to, for example, omitted covariates, by introducing a random term with a discrete distribution. The resulting model is a finite mixture of NH-HMMs that can be used to classify individuals according to their dynamic behaviour or to estimate a mixed NH-HMM without any assumption regarding the distribution of the random term, following the non-parametric maximum likelihood approach. We test the effectiveness of the proposal through a simulation study and an application to real data on alcohol abuse.

5.
Questionnaire‐based health status outcomes are often prone to misclassification. When studying the effect of risk factors on such outcomes, ignoring any potential misclassification may lead to biased effect estimates. Analytical challenges posed by these misclassified outcomes are further complicated when simultaneously exploring factors for both the misclassification and health processes in a multi‐level setting. To address these challenges, we propose a fully Bayesian mixed hidden Markov model (BMHMM) for handling differential misclassification in categorical outcomes in a multi‐level setting. The BMHMM generalizes the traditional hidden Markov model (HMM) by introducing random effects into three sets of HMM parameters for joint estimation of the prevalence, transition, and misclassification probabilities. This formulation not only allows joint estimation of all three sets of parameters but also accounts for cluster‐level heterogeneity based on a multi‐level model structure. Using this novel approach, both the true health status prevalence and the transition probabilities between the health states during follow‐up are modeled as functions of covariates. The observed, possibly misclassified, health states are related to the true, but unobserved, health states and covariates. Results from simulation studies are presented to validate the estimation procedure, to show the computational efficiency due to the Bayesian approach and also to illustrate the gains from the proposed method compared to existing methods that ignore outcome misclassification and cluster‐level heterogeneity. We apply the proposed method to examine the risk factors for both asthma transition and misclassification in the Southern California Children's Health Study. Copyright © 2013 John Wiley & Sons, Ltd.

6.
Many prospective biomedical studies collect longitudinal clinical and lifestyle data that are both continuous and discrete. In some studies, there is interest in the association between a binary outcome and the values of these longitudinal measurements at a specific time point. A common problem in these studies is inconsistency in the timing of measurements and missing follow-ups, which can lead to few measurements at the time of interest. Some methods have been developed to address this problem, but they are only applicable to continuous measurements. To address this limitation, we propose a new class of joint models for a binary outcome and longitudinal explanatory variables of mixed types. The longitudinal model uses a latent normal random variable construction with regression splines to model time-dependent trends in the mean, with a Dirichlet Process prior assigned to random effects to relax distributional assumptions. We also standardize the timing of the explanatory variables by relating the binary outcome to imputed longitudinal values at a set time point. The proposed model is evaluated through simulation studies and applied to data from a cancer survivor study of participants in the Women's Health Initiative.

7.
Background and Objectives: As a result of the development of sophisticated techniques, such as multiple imputation, the interest in handling missing data in longitudinal studies has increased enormously in past years. Within the field of longitudinal data analysis, there is a current debate on whether it is necessary to use multiple imputation before performing a mixed-model analysis to analyze the longitudinal data. In the current study this necessity is evaluated. Study Design and Setting: The results of mixed-model analyses with and without multiple imputation were compared with each other. Four data sets with missing values were created: one data set with missing completely at random, two data sets with missing at random, and one data set with missing not at random. In all data sets, the relationships between a continuous outcome variable and two different covariates were analyzed: a time-independent dichotomous covariate and a time-dependent continuous covariate. Results: Although for all types of missing data, the results of the mixed-model analysis with or without multiple imputation were slightly different, they were not in favor of one of the two approaches. In addition, repeating the multiple imputations 100 times showed that the results of the mixed-model analysis with multiple imputation were quite unstable. Conclusion: It is not necessary to handle missing data using multiple imputation before performing a mixed-model analysis on longitudinal data.
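For readers unfamiliar with how results from several imputed data sets are combined, the sketch below pools estimates with Rubin's rules, the standard combining step in multiple imputation. The abstract does not state which pooling procedure the study used, so this is an illustration under that assumption; the function name `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors (Rubin's rules)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)      # pooled point estimate
    within = mean(variances)     # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between  # total variance of the pooled estimate
    return q_bar, total


# Hypothetical regression coefficients and squared standard errors
# obtained from m = 3 imputed data sets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what makes pooled results sensitive to the particular set of imputations drawn, which is one way to read the instability the abstract reports when the imputations are repeated.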

8.
We propose a transition model for analysing data from complex longitudinal studies. Because missing values are practically unavoidable in large longitudinal studies, we also present a two-stage imputation method for handling general patterns of missing values on both the outcome and the covariates by combining multiple imputation with stochastic regression imputation. Our model is a time-varying auto-regression on the past innovations (residuals), and it can be used in cases where general dynamics must be taken into account, and where the model selection is important. The entire estimation process was carried out using available procedures in statistical packages such as SAS and S-PLUS. To illustrate the viability of the proposed model and the two-stage imputation method, we analyse data collected in an epidemiological study that focused on various factors relating to childhood growth. Finally, we present a simulation study to investigate the behaviour of our two-stage imputation procedure.

9.
Liu LC. Statistics in Medicine 2008, 27(30): 6299-6309
In studies where multiple outcome items are repeatedly measured over time, missing data often occur. A longitudinal item response theory model is proposed for analysis of multivariate ordinal outcomes that are repeatedly measured. Under the MAR assumption, this model accommodates missing data at any level (missing item at any time point and/or missing time point). It allows for multiple random subject effects and the estimation of item discrimination parameters for the multiple outcome items. The covariates in the model can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is described utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher-scoring solution, which provides standard errors for all model parameters, is used. A data set from a longitudinal prevention study is used to motivate the application of the proposed model. In this study, multiple ordinal items of health behavior are repeatedly measured over time. Because of a planned missing design, subjects answered only two-thirds of all items at a given point.

10.
Joint modelling of longitudinal and survival data has received much attention in recent years. Most work has concentrated on a single longitudinal variable. This paper considers joint modelling in the presence of multiple longitudinal variables. We explore direct association of time-to-event and multiple longitudinal processes through a frailty model and use a mixed effects model for each of the longitudinal variables. Correlations among the longitudinal variables are induced through correlated random effects. We allow effects of categorical and continuous covariates on both longitudinal and time-to-event responses and explore interactions between the longitudinal variables and other covariates on time-to-event. Estimates of the parameters are obtained by maximizing the joint likelihood for the longitudinal variable processes and the event process. We use a one-step-late EM algorithm to handle the direct dependence of the event process on the modelled longitudinal variables along with the presence of other fixed covariates in both processes. We argue that such a joint analysis with multiple longitudinal variables has an advantage over one with only a single longitudinal variable in revealing the interplay among multiple longitudinal variables and the time-to-event.

11.
Multiple imputation is commonly used to impute missing covariates in the Cox semiparametric regression setting. It replaces each missing value with plausible values, via a Gibbs sampling procedure, specifying an imputation model for each missing variable. This imputation method is implemented in several software packages that offer imputation models steered by the shape of the variable to be imputed, but all these imputation models assume that covariate effects are linear. However, this assumption is often not satisfied in practice, as the covariates can have a nonlinear effect. Such a linear assumption can lead to a misleading conclusion because the imputation model should be constructed to reflect the true distributional relationship between the missing values and the observed values. To estimate nonlinear effects of continuous time‐invariant covariates in the imputation model, we propose a method based on B‐spline functions. To assess the performance of this method, we conducted a simulation study, where we compared the multiple imputation method using a Bayesian splines imputation model with multiple imputation using a Bayesian linear imputation model in a survival analysis setting. We evaluated the proposed method on the motivating data set collected in HIV‐infected patients enrolled in an observational cohort study in Senegal, which contains several incomplete variables. We found that our method performs well in estimating the hazard ratio compared with the linear imputation methods, when data are missing completely at random or missing at random. Copyright © 2013 John Wiley & Sons, Ltd.

12.
Estimating causal effects in psychiatric clinical trials is often complicated by treatment non-compliance and missing outcomes. While new estimators have recently been proposed to address these problems, they do not allow for inclusion of continuous covariates. We propose estimators that adjust for continuous covariates in addition to non-compliance and missing data. Using simulations, we compare mean squared errors for the new estimators with those of previously established estimators. We then illustrate our findings in a study examining the efficacy of clozapine versus haloperidol in the treatment of refractory schizophrenia. For data with continuous or binary outcomes in the presence of non-compliance, non-ignorable missing data, and a covariate effect, the new estimators generally performed better than the previously established estimators. In the clozapine trial, the new estimators gave point and interval estimates similar to established estimators. We recommend the new estimators as they are unbiased even when outcomes are not missing at random and they are more efficient than established estimators in the presence of covariate effects under the widest variety of circumstances.

13.
Timeliness of a public health surveillance system is one of its most important characteristics. The process of predicting the present situation using available incomplete information from surveillance systems is termed nowcasting and is of high public health interest. Generally in Europe, general practitioners’ sentinel networks support the epidemiological surveillance of influenza activity, and each week's epidemiological bulletins are usually issued between Wednesday and Friday of the following week. In this work, we have developed a non‐homogeneous hidden Markov model (HMM) that, on a weekly basis, uses as covariates an early observation of the influenza‐like illness (ILI) incidence rate and the number of ILI cases that tested positive to nowcast the current week's ILI rate and the probability that the influenza activity is in an epidemic state. We use Bayesian inference to find estimates of the model parameters and nowcasted quantities. The results obtained with data provided by the Portuguese influenza surveillance system show the additional value of using a non‐homogeneous HMM instead of a homogeneous one. The use of a non‐homogeneous HMM improves the surveillance system's timeliness by 2 weeks. Copyright © 2012 John Wiley & Sons, Ltd.
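The idea of a non-homogeneous HMM, where the transition probabilities change from week to week with a covariate, can be illustrated with a two-state forward filter. This is a minimal sketch and not the authors' model: it fixes the parameters instead of estimating them by Bayesian inference, uses a single covariate (the count of positive tests), and assumes Gaussian emissions and a logistic link. Every name and number below is hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def filter_epidemic_prob(rates, covariates, init=(0.9, 0.1),
                         means=(20.0, 80.0), sds=(10.0, 25.0),
                         enter=(-3.0, 0.1), stay=(0.0, 0.1)):
    """Return P(epidemic state in week t | data up to week t) for each week.

    State 0 is non-epidemic, state 1 is epidemic. Transition probabilities
    depend on the week's covariate x_t through an assumed logistic link:
        P(0 -> 1) = logistic(enter[0] + enter[1] * x_t)
        P(1 -> 1) = logistic(stay[0] + stay[1] * x_t)
    """
    probs = []
    belief = list(init)
    for t, (y, x) in enumerate(zip(rates, covariates)):
        if t > 0:
            # Prediction step with covariate-dependent transition probabilities.
            p01 = logistic(enter[0] + enter[1] * x)
            p11 = logistic(stay[0] + stay[1] * x)
            prior1 = belief[0] * p01 + belief[1] * p11
            belief = [1.0 - prior1, prior1]
        # Update step: weight by the emission likelihood and renormalise.
        weights = [belief[s] * normal_pdf(y, means[s], sds[s]) for s in (0, 1)]
        total = sum(weights)
        belief = [weights[0] / total, weights[1] / total]
        probs.append(belief[1])
    return probs


weekly_rates = [15.0, 22.0, 60.0, 95.0]  # hypothetical ILI incidence rates
positive_tests = [0, 2, 15, 30]          # hypothetical counts of positive tests
epidemic_probs = filter_epidemic_prob(weekly_rates, positive_tests)
```

Setting the slope terms `enter[1]` and `stay[1]` to zero makes the transition probabilities constant over time, which gives the homogeneous HMM that the abstract uses as its comparison.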

14.
A significant source of missing data in longitudinal epidemiological studies on elderly individuals is death. Subjects in large-scale community-based longitudinal dementia studies are usually evaluated for disease status in study waves, not under continuous surveillance as in traditional cohort studies. Therefore, for the deceased subjects, disease status prior to death cannot be ascertained. Statistical methods assuming deceased subjects to be missing at random may not be realistic in dementia studies and may lead to biased results. We propose a stochastic model approach to simultaneously estimate disease incidence and mortality rates. We set up a Markov chain model consisting of three states (non-diseased, diseased and dead) and estimate the transition hazard parameters using the maximum likelihood approach. Simulation results are presented indicating adequate performance of the proposed approach.
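The three-state structure described above can be sketched as a small Markov chain. The version below is a discrete-time simplification with one transition per study wave, whereas the abstract describes transition hazards estimated by maximum likelihood; the transition probabilities are hypothetical and the helper names are illustrative only.

```python
STATES = ("non-diseased", "diseased", "dead")


def step(dist, matrix):
    """Advance a distribution over states by one study wave."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def distribution_after(waves, matrix, start=(1.0, 0.0, 0.0)):
    """State distribution after a number of waves, starting non-diseased by default."""
    dist = list(start)
    for _ in range(waves):
        dist = step(dist, matrix)
    return dist


# Hypothetical per-wave transition probabilities. Each row sums to 1,
# recovery from disease is not allowed, and death is absorbing.
P = [
    [0.85, 0.05, 0.10],  # from non-diseased
    [0.00, 0.80, 0.20],  # from diseased
    [0.00, 0.00, 1.00],  # from dead
]

after_two = distribution_after(2, P)
```

In this sketch a subject who dies between waves may have passed through the diseased state unobserved, which is the situation the abstract identifies as the source of missing disease status.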

15.
We describe and evaluate a regression tree algorithm for finding subgroups with differential treatment effects in randomized trials with multivariate outcomes. The data may contain missing values in the outcomes and covariates, and the treatment variable is not limited to two levels. Simulation results show that the regression tree models have unbiased variable selection and the estimates of subgroup treatment effects are approximately unbiased. A bootstrap calibration technique is proposed for constructing confidence intervals for the treatment effects. The method is illustrated with data from a longitudinal study comparing two diabetes drugs and a mammography screening trial comparing two treatments and a control. Copyright © 2016 John Wiley & Sons, Ltd.

16.
In behavioral, biomedical, and social‐psychological sciences, it is common to encounter latent variables and heterogeneous data. Mixture structural equation models (SEMs) are very useful methods to analyze these kinds of data. Moreover, the presence of missing data, including both missing responses and missing covariates, is an important issue in practical research. However, limited work has been done on the analysis of mixture SEMs with non‐ignorable missing responses and covariates. The main objective of this paper is to develop a Bayesian approach for analyzing mixture SEMs with an unknown number of components, in which a multinomial logit model is introduced to assess the influence of some covariates on the component probability. Results of our simulation study show that the Bayesian estimates obtained by the proposed method are accurate, and the model selection procedure via a modified DIC is useful in identifying the correct number of components and in selecting an appropriate missing mechanism in the proposed mixture SEMs. A real data set related to a longitudinal study of polydrug use is employed to illustrate the methodology. Copyright © 2010 John Wiley & Sons, Ltd.

17.
Tao Lu. Statistics in Medicine 2017, 36(16): 2614-2629
In AIDS studies, heterogeneous between‐subject and within‐subject variations are often observed on longitudinal endpoints. To accommodate heteroscedasticity in the longitudinal data, statistical methods have been developed to model the mean and variance jointly. Most of these methods assume (conditional) normal distributions for random errors, which is not realistic in practice. In this article, we propose a Bayesian mixed‐effects location scale model with skew‐t distribution and mismeasured covariates for heterogeneous longitudinal data with skewness. The proposed model captures the between‐subject and within‐subject (WS) heterogeneity by modeling the between‐subject and WS variations with covariates as well as a random effect at subject level in the WS variance. Further, the proposed model also takes into account the covariate measurement errors, and the commonly assumed normal distributions for model errors are replaced by a skew‐t distribution to account for skewness. Parameter estimation is carried out in a Bayesian framework. The proposed method is illustrated with a Multicenter AIDS Cohort Study. Simulation studies are performed to assess the performance of the proposed method. Copyright © 2017 John Wiley & Sons, Ltd.

18.
Incomplete multi‐level data arise commonly in many clinical trials and observational studies. Because of multi‐level variations in this type of data, appropriate data analysis should take these variations into account. A random effects model can allow for the multi‐level variations by assuming random effects at each level, but the computation is intensive because high‐dimensional integrations are often involved in fitting models. Marginal methods such as the inverse probability weighted generalized estimating equations can involve simple estimation computation, but it is hard to specify the working correlation matrix for multi‐level data. In this paper, we introduce a latent variable method to deal with incomplete multi‐level data when the missing mechanism is missing at random, which fills the gap between the random effects model and marginal models. Latent variable models are built for both the response and missing data processes to incorporate the variations that arise at each level. Simulation studies demonstrate that this method performs well in various situations. We apply the proposed method to an Alzheimer's disease study. Copyright © 2012 John Wiley & Sons, Ltd.

19.
We propose a semiparametric marginal modeling approach for longitudinal analysis of cohorts with data missing due to death and non‐response to estimate regression parameters interpreted as conditioned on being alive. Our proposed method accommodates outcomes and time‐dependent covariates that are missing not at random with non‐monotone missingness patterns via inverse‐probability weighting. Missing covariates are replaced by consistent estimates derived from a simultaneously solved inverse‐probability‐weighted estimating equation. Thus, we utilize data points with the observed outcomes and missing covariates beyond the estimated weights while avoiding numerical methods to integrate over missing covariates. The approach is applied to a cohort of elderly female hip fracture patients to estimate the prevalence of walking disability over time as a function of body composition, inflammation, and age. Copyright © 2010 John Wiley & Sons, Ltd.

20.
Song XY, Lee SY, Hser YI. Statistics in Medicine 2008, 27(16): 3017-3041
The analysis of longitudinal data to study changes in variables measured repeatedly over time has received considerable attention in many fields. This paper proposes a two-level structural equation model for analyzing multivariate longitudinal responses that are mixed continuous and ordered categorical variables. The first-level model is defined for measures taken at each time point nested within individuals, to investigate characteristics that change with time. The second level is defined for individuals, to assess characteristics that are invariant with time. The proposed model accommodates fixed covariates, nonlinear terms of the latent variables, and missing data. A maximum likelihood (ML) approach is developed for the estimation of parameters and model comparison. Results of a simulation study indicate that the performance of the ML estimation is satisfactory. The proposed methodology is applied to a longitudinal study concerning cocaine use.
