Similar literature
1.
Multiple imputation is commonly used to impute missing covariates in the Cox semiparametric regression setting. It fills each missing value with plausible values, via a Gibbs sampling procedure, specifying an imputation model for each missing variable. This imputation method is implemented in several software packages that offer imputation models steered by the shape of the variable to be imputed, but all these imputation models assume that covariate effects are linear. However, this assumption is not often verified in practice, as the covariates can have a nonlinear effect. Such a linear assumption can lead to a misleading conclusion because the imputation model should be constructed to reflect the true distributional relationship between the missing values and the observed values. To estimate nonlinear effects of continuous time‐invariant covariates in the imputation model, we propose a method based on B‐spline functions. To assess the performance of this method, we conducted a simulation study, where we compared the multiple imputation method using a Bayesian spline imputation model with multiple imputation using a Bayesian linear imputation model in the survival analysis setting. We evaluated the proposed method on the motivating data set collected from HIV‐infected patients enrolled in an observational cohort study in Senegal, which contains several incomplete variables. We found that our method performs well in estimating the hazard ratio compared with the linear imputation methods, when data are missing completely at random or missing at random. Copyright © 2013 John Wiley & Sons, Ltd.

2.
Infant birth weight and gestational age are two important variables in obstetric research. The primary measure of gestational age used in US birth data is based on a mother's recall of her last menstrual period, which has been shown to introduce random or systematic errors. To mitigate some of those errors, Oja et al., Platt et al., and Tentoni et al. estimated the probabilities of gestational ages being misreported under the assumption that the distribution of infant birth weights for a true gestational age is approximately Gaussian. From this assumption, Oja et al. fitted a three‐component mixture model, and Tentoni et al. and Platt et al. fitted two‐component mixture models. We build on their methods and develop a Bayesian mixture model. We then extend our methods using reversible jump Markov chain Monte Carlo to incorporate the uncertainty in the number of components in the model. We conduct simulation studies and apply our methods to singleton births with reported gestational ages of 23–32 weeks using 2001–2008 US birth data. Results show that a three‐component mixture model fits the birth data better for gestational ages reported as 25 weeks or less, and a two‐component mixture model fits better for the higher gestational ages. Under the assumption that our Bayesian mixture models are appropriate for US birth data, our research provides useful statistical tools to identify records with implausible gestational ages, and the techniques can be used as part of a multiple‐imputation procedure for missing and implausible gestational ages. Published 2012. This article is a US Government work and is in the public domain in the USA.

3.
Recently, structural equation models (SEMs) have been applied for analyzing interrelationships among observed and latent variables in biological and medical research. Latent variables in these models are typically assumed to have a normal distribution. This article considers a Bayesian semiparametric SEM with covariates, and mixed continuous and unordered categorical variables, in which the explanatory latent variables in the structural equation are modeled via an appropriate truncated Dirichlet process with a stick‐breaking procedure. Results obtained from a simulation study and an analysis of a real medical data set are presented to illustrate the methodology. Copyright © 2009 John Wiley & Sons, Ltd.

4.
We present a model for meta‐regression in the presence of missing information on some of the study level covariates, obtaining inferences using Bayesian methods. In practice, when confronted with missing covariate data in a meta‐regression, it is common to carry out a complete case or available case analysis. We propose to use the full observed data, modelling the joint density as a factorization of a meta‐regression model and a conditional factorization of the density for the covariates. With the inclusion of several covariates, inter‐relations between these covariates are modelled. Under this joint likelihood‐based approach, it is shown that the lesser assumption of the covariates being Missing At Random is imposed, instead of the more usual Missing Completely At Random (MCAR) assumption. The model is easily programmable in WinBUGS, and we examine, through the analysis of two real data sets, sensitivity and robustness of results to the MCAR assumption. Copyright © 2010 John Wiley & Sons, Ltd.

5.
Missing covariate data are common in observational studies of time to an event, especially when covariates are repeatedly measured over time. Failure to account for the missing data can lead to bias or loss of efficiency, especially when the data are non-ignorably missing. Previous work has focused on the case of fixed covariates rather than those that are repeatedly measured over the follow-up period. Here we therefore present a selection model that allows for proportional hazards regression with time-varying covariates when some covariates may be non-ignorably missing. We develop a fully Bayesian model and obtain posterior estimates of the parameters via the Gibbs sampler in WinBUGS. We illustrate our model with an analysis of post-diagnosis weight change and survival after breast cancer diagnosis in the Long Island Breast Cancer Study Project follow-up study. Our results indicate that post-diagnosis weight gain is associated with lower all-cause and breast cancer-specific survival among women diagnosed with new primary breast cancer. Our sensitivity analysis showed only slight differences between models with different assumptions on the missing data mechanism, yet the complete-case analysis yielded markedly different results.

6.
With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model‐based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k‐component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same group can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.

7.
Several approaches exist for handling missing covariates in the Cox proportional hazards model. Multiple imputation (MI) is relatively easy to implement with the various software available and results in consistent estimates if the imputation model is correct. On the other hand, the fully augmented weighted estimators (FAWEs) recover a substantial proportion of the efficiency and have the doubly robust property. In this paper, we compare the FAWEs and the MI through a comprehensive simulation study. For the MI, we consider multiple imputation by chained equations and focus on two imputation methods: Bayesian linear regression imputation and predictive mean matching. Simulation results show that the imputation methods can be rather sensitive to model misspecification and may have large bias when the censoring time depends on the missing covariates. In contrast, the FAWEs allow the censoring time to depend on the missing covariates and, owing to the doubly robust property, are remarkably robust as long as either the conditional expectations or the selection probability is specified correctly. The comparison suggests that the FAWEs show the potential for being a competitive and attractive tool for tackling the analysis of survival data with missing covariates. Copyright © 2010 John Wiley & Sons, Ltd.

8.
This study develops a two-part hidden Markov model (HMM) for analyzing semicontinuous longitudinal data in the presence of missing covariates. The proposed model manages a semicontinuous variable by splitting it into two random variables: a binary indicator for determining the occurrence of excess zeros at all occasions and a continuous random variable for examining its actual level. For the continuous longitudinal response, an HMM is proposed to describe the relationship between the observation and unobservable finite-state transition processes. The HMM consists of two major components. The first component is a transition model for investigating how potential covariates influence the probabilities of transitioning from one hidden state to another. The second component is a conditional regression model for examining the state-specific effects of covariates on the response. A shared random effect is introduced to each part of the model to accommodate possible unobservable heterogeneity among observation processes and the nonignorability of missing covariates. A Bayesian adaptive least absolute shrinkage and selection operator (lasso) procedure is developed to conduct simultaneous variable selection and estimation. The proposed methodology is applied to a study on the Alzheimer's Disease Neuroimaging Initiative dataset. New insights into the pathology of Alzheimer's disease and its potential risk factors are obtained.

9.
In this paper, we develop an estimation procedure for the parameters of a zero‐inflated over‐dispersed/under‐dispersed count model in the presence of missing responses. In particular, we deal with a zero‐inflated extended negative binomial model in the presence of missing responses. A weighted expectation maximization algorithm is used for the maximum likelihood estimation of the parameters involved. Some simulations are conducted to study the properties of the estimators. Robustness of the procedure is shown when count data follow other over‐dispersed models, such as the log‐normal mixture of the Poisson distribution, or even a zero‐inflated Poisson model. An illustrative example and a discussion leading to some conclusions are given. Copyright © 2016 John Wiley & Sons, Ltd.

10.
Inpatient care is a large share of total health care spending, making analysis of inpatient utilization patterns an important part of understanding what drives health care spending growth. Common features of inpatient utilization measures such as length of stay and spending include zero inflation, overdispersion, and skewness, all of which complicate statistical modeling. Moreover, latent subgroups of patients may have distinct patterns of utilization and relationships between that utilization and observed covariates. In this work, we apply and compare likelihood-based and parametric Bayesian mixtures of negative binomial and zero-inflated negative binomial regression models. In a simulation, we find that the Bayesian approach finds the true number of mixture components more accurately than using information criteria to select among likelihood-based finite mixture models. When we apply the models to data on hospital lengths of stay for patients with lung cancer, we find distinct subgroups of patients with different means and variances of hospital days, health and treatment covariates, and relationships between covariates and length of stay.

11.
Missing outcome data are a common threat to the validity of the results from randomised controlled trials (RCTs), which, if not analysed appropriately, can lead to misleading treatment effect estimates. Studies with missing outcome data also threaten the validity of any meta‐analysis that includes them. A conceptually simple Bayesian framework is proposed, to account for uncertainty due to missing binary outcome data in meta‐analysis. A pattern‐mixture model is fitted, which allows the incorporation of prior information on a parameter describing the missingness mechanism. We describe several alternative parameterisations, with the simplest being a prior on the probability of an event in the missing individuals. We describe a series of structural assumptions that can be made concerning the missingness parameters. We use some artificial data scenarios to demonstrate the ability of the model to produce a bias‐adjusted estimate of treatment effect that accounts for uncertainty. A meta‐analysis of haloperidol versus placebo for schizophrenia is used to illustrate the model. We end with a discussion of elicitation of priors, issues with poor reporting and potential extensions of the framework. Our framework allows one to make the best use of evidence produced from RCTs with missing outcome data in a meta‐analysis, accounts for any uncertainty induced by missing data and fits easily into a wider evidence synthesis framework for medical decision making. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

12.
We propose a Bayesian hierarchical model for the calculation of incidence counts from mortality data by a convolution equation that expresses mortality through its relationship with incidence and the survival probability density. The basic idea is to use mortality data together with an estimate of the survival distribution from cancer incidence to cancer mortality to reconstruct the numbers of individuals who constitute previously incident cases that give rise to the observed pattern of cancer mortality. This model is novel because it takes into account the uncertainty from the survival distribution; thus, a Bayesian‐mixture cure model for survival is introduced. Furthermore, projections are obtained starting from a Bayesian age‐period‐cohort model. The main advantage of the proposed approach is its consideration of the three components of the model: the convolution equation, the survival mixture cure model and the age‐period‐cohort projection within a directed acyclic graph model. Furthermore, the estimates are obtained through the Gibbs sampler. We applied the model to cases of women with stomach cancer using six age classes [15–45], [45–55], [55–65], [65–75], [75–85] and [85–95] and validated it by using data from the Tuscany Cancer Registry. The model proposed and the program implemented are convenient because they allow different cancer diseases to be analysed, since the survival time is modelled by flexible distributions that are able to describe different trends. Copyright © 2014 John Wiley & Sons, Ltd.

13.
In matched case‐crossover studies, it is generally accepted that the covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model. This is because any stratum effect is removed by the conditioning on the fixed number of sets of the case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. However, some matching covariates such as time often play an important role as an effect modifier, leading to incorrect statistical estimation and prediction. Therefore, we propose three approaches to evaluate effect modification by time. The first is a parametric approach, the second is a semiparametric penalized approach, and the third is a semiparametric Bayesian approach. Our parametric approach is a two‐stage method, which uses conditional logistic regression in the first stage and then estimates polynomial regression in the second stage. Our semiparametric penalized and Bayesian approaches are one‐stage approaches developed by using regression splines. Our semiparametric one‐stage approaches allow us not only to detect the parametric relationship between the predictor and binary outcomes, but also to evaluate nonparametric relationships between the predictor and time. We demonstrate the advantage of our semiparametric one‐stage approaches using both a simulation study and an epidemiological example of a 1‐4 bi‐directional case‐crossover study of childhood aseptic meningitis with drinking water turbidity. We also provide statistical inference for the semiparametric Bayesian approach using Bayes Factors. Copyright © 2016 John Wiley & Sons, Ltd.

14.
Problems common to many longitudinal HIV/AIDS, cancer, vaccine, and environmental exposure studies are the presence of a lower limit of quantification of an outcome with skewness and time‐varying covariates with measurement errors. There has been relatively little work published simultaneously dealing with these features of longitudinal data. In particular, left‐censored data falling below a limit of detection may sometimes have a proportion larger than expected under a usually assumed log‐normal distribution. In such cases, alternative models, which can account for a high proportion of censored data, should be considered. In this article, we present an extension of the Tobit model that incorporates a mixture of true undetectable observations and those values from a skew‐normal distribution for an outcome with possible left censoring and skewness, and covariates with substantial measurement error. To quantify the covariate process, we offer a flexible nonparametric mixed‐effects model within the Tobit framework. A Bayesian modeling approach is used to assess the simultaneous impact of left censoring, skewness, and measurement error in covariates on inference. The proposed methods are illustrated using real data from an AIDS clinical study. Copyright © 2013 John Wiley & Sons, Ltd.

15.
In this research article, we propose a class of models for positive and zero responses by means of a zero‐augmented mixed regression model. Under this class, we are particularly interested in studying positive responses whose distribution accommodates skewness. At the same time, responses can be zero, and therefore, we justify the use of a zero‐augmented mixture model. We model the mean of the positive response in a logarithmic scale and the mixture probability in a logit scale, both as a function of fixed and random effects. Moreover, the random effects link the two random components through their joint distribution and incorporate within‐subject correlation because of the repeated measurements and between‐subject heterogeneity. A Markov chain Monte Carlo algorithm is tailored to obtain Bayesian posterior distributions of the unknown quantities of interest, and Bayesian case‐deletion influence diagnostics based on the q‐divergence measure are performed. We apply the proposed method to a dataset from a 24‐hour dietary recall study conducted in the city of São Paulo and present a simulation study to evaluate the performance of the proposed methods. Copyright © 2015 John Wiley & Sons, Ltd.

16.
Pattern‐mixture models provide a general and flexible framework for sensitivity analyses of nonignorable missing data in longitudinal studies. The delta‐adjusted pattern‐mixture models handle missing data in a clinically interpretable manner and have been used as sensitivity analyses addressing the effectiveness hypothesis, while a likelihood‐based approach that assumes data are missing at random is often used as the primary analysis addressing the efficacy hypothesis. We describe a method for power calculations for delta‐adjusted pattern‐mixture model sensitivity analyses in confirmatory clinical trials. To apply the method, we only need to specify the pattern probabilities at postbaseline time points, the expected treatment differences at postbaseline time points, the conditional covariance matrix of postbaseline measurements given the baseline measurement, and the delta‐adjustment method for the pattern‐mixture model. We use an example to illustrate and compare various delta‐adjusted pattern‐mixture models and use simulations to confirm the analytic results. Copyright © 2014 John Wiley & Sons, Ltd.

17.
Individuals may vary in their responses to treatment, and identification of subgroups differentially affected by a treatment is an important issue in medical research. The risk of misleading subgroup analyses has become well known, and some exploratory analyses can be helpful in clarifying how covariates potentially interact with the treatment. Motivated by a real data study of pediatric kidney transplant, we consider a semiparametric Bayesian latent model and examine its utility for an exploratory subgroup effect analysis using secondary data. The proposed method is concerned with a clinical setting where the number of subgroups is much smaller than that of potential predictors and subgroups are only latently associated with observed covariates. The semiparametric model is flexible in capturing the latent structure driven by data rather than dictated by parametric modeling assumptions. Since it is difficult to correctly specify the conditional relationship between the response and a large number of confounders in modeling, we use propensity score matching to improve the model robustness by balancing the covariate distribution. Simulation studies show that the proposed analysis can find the latent subgrouping structure and, with propensity score matching adjustment, yield robust estimates even when the outcome model is misspecified. In the real data analysis, the proposed analysis reports significant subgroup effects on steroid avoidance in kidney transplant patients, whereas standard proportional hazards regression analysis does not.

18.
To provide a comprehensive framework for analysing complex non-normal medical and biological data, we propose a Bayesian approach for a non-linear latent variable model with covariates, and non-ignorable missing data, under the exponential family of distributions. The non-ignorable missing mechanism is defined via a logistic regression model. Based on conjugate prior distributions, full conditional distributions for the implementation of Markov chain Monte Carlo methods in simulating observations from the joint posterior distribution are derived. These observations are used in computing the Bayesian estimates, as well as in implementing a path sampling procedure to evaluate the Bayes factor for model comparison. The proposed methods are illustrated using real data from a study on the non-adherence of hypertension patients.

19.
20.
We present a case study in the analysis of the prognostic effects of anaemia and other covariates on the local recurrence of head and neck cancer in patients who have been treated with radiation therapy. Because it is believed that a large fraction of the patients are cured by the therapy, we use a failure time mixture model for the outcomes, which simultaneously models both the relationship of the covariates to cure and the relationship of the covariates to local recurrence times for subjects who are not cured. A problematic feature of the data is that two covariates of interest have missing values, so that only 75 per cent of the subjects have complete data. We handle the missing-data problem by jointly modelling the covariates and the outcomes, and then fitting the model to all of the data, including the incomplete cases. We compare our approach to two traditional methods for handling missingness, that is, complete-case analysis and the use of an indicator variable for missingness. The comparison with complete-case analysis demonstrates gains in efficiency for joint modelling as well as sensitivity of some results to the method used to handle missing data. The use of an indicator variable yields results that are very similar to those from joint modelling for our data. We also compare the results obtained for the mixture model with results obtained for a standard (non-mixture) survival model. It is seen that the mixture model separates out effects in a way that is not possible with a standard survival model. In particular, conditional on other covariates, we find strong evidence of an association between anaemia and cure, whereas the evidence of an association between anaemia and time to local recurrence for patients who are not cured is weaker.
