Similar articles
 20 similar articles found (search time: 46 ms)
1.
In this paper we address two issues arising in multi-state models with covariates. The first issue deals with how to obtain parsimony in the modeling of the effect of covariates. The standard way of incorporating covariates in multi-state models is by considering the transitions as separate building blocks, and modeling the effect of covariates for each transition separately, usually through a proportional hazards model for the transition hazard. This typically leads to a large number of regression coefficients to be estimated, and there is a real danger of over-fitting, especially when transitions with few events are present. We extend the reduced-rank ideas, proposed earlier in the context of competing risks, to multi-state models, in order to deal with this issue. The second issue addressed in this paper was motivated by the wish to obtain standard errors of the regression coefficients of the reduced-rank model. We propose a model-based resampling technique, based on repeatedly sampling trajectories through the multi-state model. The same ideas are also used for the estimation of prediction probabilities in general multi-state models and associated standard errors. We use data from the European Group for Blood and Marrow Transplantation to illustrate our techniques.

2.
In survival analysis, a competing risk is an event whose occurrence precludes the occurrence of the primary event of interest. Outcomes in medical research are frequently subject to competing risks. In survival analysis, there are 2 key questions that can be addressed using competing risk regression models: first, which covariates affect the rate at which events occur, and second, which covariates affect the probability of an event occurring over time. The cause‐specific hazard model estimates the effect of covariates on the rate at which events occur in subjects who are currently event‐free. Subdistribution hazard ratios obtained from the Fine‐Gray model describe the relative effect of covariates on the subdistribution hazard function. Hence, the covariates in this model can also be interpreted as having an effect on the cumulative incidence function or on the probability of events occurring over time. We conducted a review of the use and interpretation of the Fine‐Gray subdistribution hazard model in articles published in the medical literature in 2015. We found that many authors provided an unclear or incorrect interpretation of the regression coefficients associated with this model. An incorrect and inconsistent interpretation of regression coefficients may lead to confusion when comparing results across different studies. Furthermore, an incorrect interpretation of estimated regression coefficients can result in an incorrect understanding about the magnitude of the association between exposure and the incidence of the outcome. The objective of this article is to clarify how these regression coefficients should be reported and to propose suggestions for interpreting these coefficients.
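
For orientation, the two hazards contrasted in item 2 can be written out as follows. This is a sketch in standard competing-risks notation, which the abstract itself does not give; the symbols T (event time), D (event type), x (covariate vector), and β (coefficient vector) are assumptions introduced here for illustration.

```latex
% Cause-specific hazard for event type k: rate among subjects still event-free
\lambda_k(t) = \lim_{\Delta t \to 0}
  \frac{P(t \le T < t + \Delta t,\ D = k \mid T \ge t)}{\Delta t}

% Cumulative incidence function and the subdistribution hazard derived from it
F_k(t) = P(T \le t,\ D = k), \qquad
\tilde{\lambda}_k(t) = -\frac{d}{dt} \log\{1 - F_k(t)\}

% Fine-Gray model for the event of interest (k = 1)
\tilde{\lambda}_1(t \mid x) = \tilde{\lambda}_{10}(t)\, \exp(\beta^\top x)
\quad\Longrightarrow\quad
1 - F_1(t \mid x) = \{1 - F_{10}(t)\}^{\exp(\beta^\top x)}
```

Under this form, a positive coefficient raises the cumulative incidence at every time point, but exp(β) is a ratio of subdistribution hazards, not a ratio of cumulative incidences or of cause-specific rates.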

3.
We propose a new weighted hurdle regression method for modeling count data, with particular interest in modeling cardiovascular events in patients on dialysis. Cardiovascular disease remains one of the leading causes of hospitalization and death in this population. Our aim is to jointly model the relationship/association between covariates and (i) the probability of cardiovascular events, a binary process, and (ii) the rate of events once the realization is positive, that is, when the ‘hurdle’ is crossed, using a zero‐truncated Poisson distribution. When the observation period or follow‐up time, from the start of dialysis, varies among individuals, the estimated probability of positive cardiovascular events during the study period will be biased. Furthermore, when the model contains covariates, then the estimated relationship between the covariates and the probability of cardiovascular events will also be biased. These challenges are addressed with the proposed weighted hurdle regression method. Estimation for the weighted hurdle regression model is a weighted likelihood approach, where standard maximum likelihood estimation can be utilized. The method is illustrated with data from the United States Renal Data System. Simulation studies show the ability of the proposed method to successfully adjust for differential follow‐up times and incorporate the effects of covariates in the weighting. Copyright © 2014 John Wiley & Sons, Ltd.
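
The two-part structure described in item 3 can be illustrated with a minimal sketch of an unweighted hurdle likelihood with a zero-truncated Poisson count part. This is not the authors' weighted method: the follow-up weights are omitted, there are no covariates, and the names `p_positive` and `lam` are hypothetical parameters chosen here for illustration.

```python
import math


def zero_truncated_poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k | Y >= 1) for a Poisson(lam) count; zero for k < 1."""
    if k < 1:
        return 0.0
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    # Renormalise by the probability of a positive count.
    return poisson / (1.0 - math.exp(-lam))


def hurdle_pmf(k: int, p_positive: float, lam: float) -> float:
    """Hurdle model: a binary part decides zero vs. positive,
    then a zero-truncated Poisson governs the positive counts."""
    if k == 0:
        return 1.0 - p_positive
    return p_positive * zero_truncated_poisson_pmf(k, lam)


def hurdle_loglik(counts, p_positive: float, lam: float) -> float:
    """Unweighted log-likelihood of observed counts (illustrative only)."""
    return sum(math.log(hurdle_pmf(k, p_positive, lam)) for k in counts)


if __name__ == "__main__":
    # Hypothetical event counts for six patients.
    counts = [0, 0, 1, 3, 0, 2]
    print(hurdle_loglik(counts, p_positive=0.5, lam=2.0))
```

Because the zero and positive parts are separate factors, the binary probability and the event rate can be estimated from separate pieces of the likelihood, which is what makes a weighted version of each part straightforward to fit with standard software.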

4.
Wang CY, Huang Y. Statistics in Medicine 2003; 22(16): 2577-2590
We consider regression analysis of a disease outcome in relation to longitudinal data which are observations from a random effects model. The covariate variables of interest are the values of the underlying trajectory at some time points, which may be fixed or subject-specific. Because the underlying random coefficients are unknown, the covariates to the primary model are generally unobserved. In addition, measurements are often not observed at the time points of interest. A motivating example for our model is the effects of age at adiposity rebound and the associated body mass index on the risk of adult obesity. The adiposity rebound is a time point at which the trajectory of a child's body fatness declines to a minimum. This general error-in-timing problem may be applied to an analysis when time-dependent marker variables follow a polynomial model in which the effect of a local maximum or minimum point may be of interest. It can be seen that directly applying estimated covariates, possibly obtained from estimated time points, may lead to bias. Estimation procedures based on expected estimating equations, regression calibration and simulation extrapolation are applied to this problem.

5.
Kim I, Cheong HK, Kim H. Statistics in Medicine 2011; 30(15): 1837-1851
In matched case-crossover studies, it is generally accepted that covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model because any stratum effect is removed by the conditioning on the fixed number of sets of a case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. In addition, the matching covariates may be effect modifiers, and the methods for assessing and characterizing effect modification by matching covariates are quite limited. In this article, we propose a unified approach that can detect both parametric and nonparametric relationships between the predictor and the relative risk of disease or binary outcome, as well as potential effect modifications by matching covariates. Two methods are developed using two semiparametric models: (1) the regression spline varying coefficients model and (2) the regression spline interaction model. Simulation results show that the two approaches are comparable. These methods can be used in any matched case-control study and extend to multilevel effect modification studies. We demonstrate the advantage of our approach using an epidemiological example of a 1-4 bi-directional case-crossover study of childhood aseptic meningitis associated with drinking water turbidity.

6.
Earlier work showed how to perform fixed-effects meta-analysis of studies or trials when each provides results on more than one outcome per patient and these multiple outcomes are correlated. That fixed-effects generalized-least-squares approach analyzes the multiple outcomes jointly within a single model, and it can include covariates, such as duration of therapy or quality of trial, that may explain observed heterogeneity of results among the trials. Sometimes the covariates explain all the heterogeneity, and the fixed-effects regression model is appropriate. However, unexplained heterogeneity may often remain, even after taking into account known or suspected covariates. Because fixed-effects models do not make allowance for this remaining unexplained heterogeneity, the potential exists for bias in estimated coefficients, standard errors and p-values. We propose two random-effects approaches for the regression meta-analysis of multiple correlated outcomes. We compare their use with fixed-effects models and with separate-outcomes models in a meta-analysis of periodontal clinical trials. A simulation study shows the advantages of the random-effects approach. These methods also facilitate meta-analysis of trials that compare more than two treatments. © 1998 John Wiley & Sons, Ltd.

7.
The assessment of the dose-response relationship is important but not straightforward when the therapeutic agent is administered repeatedly with dose-modification in each patient and a continuous response is measured repeatedly. We recently proposed an autoregressive linear mixed effects model for such data in which the current response is regressed on the previous response, fixed effects, and random effects. The model represents profiles approaching each patient's asymptote, takes into account the past dose history, and provides a dose-response relationship of the asymptote as a summary measure. In an autoregressive model, intermittent missing data mean that values of the previous response, which enters the model as a covariate, are missing. We previously provided the marginal (unconditional on the previous response) form of the proposed model to deal with intermittent missing data. Irregular timings of dose-modification or measurement can also be treated as equally spaced data with intermittent missing values by selecting an adequately small unit of time. The likelihood is, however, expressed by matrices whose sizes depend on the number of observations for a patient, and the computational burden is large. In this study, we propose a state space form of the autoregressive linear mixed effects model to calculate the marginal likelihood without using large matrices. The regression coefficients of the fixed effects can be concentrated out of the likelihood in this model in the same way as in a linear mixed effects model. As an illustration of the approach, we analyzed immunologic data from a clinical trial for multiple sclerosis patients and estimated the dose-response curves for each patient and the population mean.

8.
Recently, the number of clinical prediction models sharing the same regression task has increased in the medical literature. However, evidence synthesis methodologies that use the results of these regression models have not been sufficiently studied, particularly in meta‐analysis settings where only regression coefficients are available. One of the difficulties lies in the differences between the categorization schemes of continuous covariates across different studies. In general, categorization methods using cutoff values are study specific across available models, even if they focus on the same covariates of interest. Differences in the categorization of covariates could lead to serious bias in the estimated regression coefficients and thus in subsequent syntheses. To tackle this issue, we developed synthesis methods for linear regression models with different categorization schemes of covariates. A 2‐step approach to aggregate the regression coefficient estimates is proposed. The first step is to estimate the joint distribution of covariates by introducing a latent sampling distribution, which uses one set of individual participant data to estimate the marginal distribution of covariates with categorization. The second step is to use a nonlinear mixed‐effects model with correction terms for the bias due to categorization to estimate the overall regression coefficients. Especially in terms of precision, numerical simulations show that our approach outperforms conventional methods, which only use studies with common covariates or ignore the differences between categorization schemes. The method developed in this study is also applied to a series of WHO epidemiologic studies on white blood cell counts.

9.
A time‐varying latent variable model is proposed to jointly analyze multivariate mixed‐support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which is the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. HMRMFCs are inadequate for applications in which a clustering structure can be identified in the distribution of the covariates, as the clustering is independent from the covariate distribution. Here, hidden Markov regression models with random covariates are introduced by explicitly specifying state‐specific distributions for the covariates, with the aim of improving the recovery of the clusters in the data with respect to a fixed covariates paradigm. The hidden Markov regression models with random covariates class is defined focusing on the exponential family, in a generalized linear model framework. Model identifiability conditions are sketched, an expectation‐maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through simulation experiments and compared with those of HMRMFCs. The method is applied to physical activity data.

10.
Relative survival, a method for assessing prognostic factors for disease-specific mortality in unselected populations, is frequently used in population-based studies. However, most relative survival models assume that the effects of covariates on disease-specific mortality conform with the proportional hazards hypothesis, which may not hold in some long-term studies. To accommodate variation over time of a predictor's effect on disease-specific mortality, we developed a new relative survival regression model using B-splines to model the hazard ratio as a flexible function of time, without having to specify a particular functional form. Our method also allows for testing the hypotheses of hazards proportionality and no association on disease-specific hazard. Accuracy of estimation and inference were evaluated in simulations. The method is illustrated by an analysis of a population-based study of colon cancer.

11.
The Cox proportional hazards model is the most common method to analyse survival data. However, the proportional hazards assumption might not hold. The natural extension of the Cox model is to introduce time-varying effects of the covariates. For some covariates, such as (surgical) treatment, non-proportionality could be expected beforehand. For some other covariates the non-proportionality only becomes apparent if the follow-up is long enough. It is often observed that all covariates show similar decaying effects over time. Such behaviour could be explained by the popular (gamma-) frailty model. However, the (marginal) effects of covariates in frailty models are not easy to interpret. In this paper we propose the reduced-rank model for time-varying effects of covariates. The starting point is a Cox model with p covariates and time-varying effects modelled by q time functions (constant included), leading to a p × q structure matrix that contains the regression coefficients for all covariate by time function interactions. By reducing the rank of this structure matrix a whole range of models is introduced, from the very flexible full-rank model (identical to a Cox model with time-varying effects) to the very rigid rank one model that mimics the structure of a gamma-frailty model, but is easier to interpret. We illustrate these models with an application to ovarian cancer patients.

12.
Correlation is inherent in longitudinal studies due to the repeated measurements on subjects, as well as due to time-dependent covariates in the study. In the National Longitudinal Study of Adolescent to Adult Health (Add Health), data were repeatedly collected on children in grades 7-12 across four waves. Thus, observations obtained on the same adolescent were correlated, while predictors were correlated with current and future outcomes such as obesity status, among other health issues. Previous methods, such as the generalized method of moments (GMM) approach, have been proposed to estimate regression coefficients for time-dependent covariates. However, these approaches combined all valid moment conditions to produce an averaged parameter estimate for each covariate and thus assumed that the effect of each covariate on the response was constant across time. This assumption is not necessarily optimal in applications such as Add Health or health-related data. Thus, we depart from this assumption and instead use the Partitioned GMM approach to estimate multiple coefficients for the data based on different time periods. These extra regression coefficients are obtained using a partitioning of the moment conditions pertaining to each respective relationship. This approach offers a deeper understanding of the effect of each covariate on the response. We conduct simulation studies, as well as analyses of obesity in Add Health, rehospitalization in Medicare data, and depression scores in a clinical study. The Partitioned GMM methods exhibit benefits over previously proposed models with improved insight into the nonconstant relationships realized when analyzing longitudinal data.

13.
We performed a Monte Carlo study to evaluate the effect of the number of events per variable (EPV) analyzed in logistic regression analysis. The simulations were based on data from a cardiac trial of 673 patients in which 252 deaths occurred and seven variables were cogent predictors of mortality; the number of events per predictive variable was (252/7 = 36) for the full sample. For the simulations, at values of EPV = 2, 5, 10, 15, 20, and 25, we randomly generated 500 samples of the 673 patients, chosen with replacement, according to a logistic model derived from the full sample. Simulation results for the regression coefficients for each variable in each group of 500 samples were compared for bias, precision, and significance testing against the results of the model fitted to the original sample.

For EPV values of 10 or greater, no major problems occurred. For EPV values less than 10, however, the regression coefficients were biased in both positive and negative directions; the large sample variance estimates from the logistic model both overestimated and underestimated the sample variance of the regression coefficients; the 90% confidence limits about the estimated values did not have proper coverage; the Wald statistic was conservative under the null hypothesis; and paradoxical associations (significance in the wrong direction) were increased. Although other factors (such as the total number of events, or sample size) may influence the validity of the logistic model, our findings indicate that low EPV can lead to major problems.
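
The events-per-variable arithmetic in item 13 is simple enough to sketch directly. The function names below are hypothetical helpers introduced for illustration, and the default threshold of 10 reflects the abstract's finding rather than a universal rule.

```python
def events_per_variable(n_events: int, n_predictors: int) -> float:
    """EPV = number of outcome events divided by number of candidate predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


def max_predictors(n_events: int, min_epv: int = 10) -> int:
    """Largest number of predictors that keeps EPV at or above min_epv."""
    return n_events // min_epv


if __name__ == "__main__":
    # Figures from the abstract: 252 deaths and 7 predictors.
    print(events_per_variable(252, 7))  # 36.0, matching the 252/7 = 36 quoted above
    print(max_predictors(252))          # 25
```

Note that the numerator is the number of events (here deaths), not the number of patients, so the 673-patient sample size does not enter the calculation.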


14.
Multivariate random length data occur when we observe multiple measurements of a quantitative variable and the variable number of these measurements is also an observed outcome for each experimental unit. For example, for a patient with coronary artery disease, we may observe a number of lesions in that patient's coronary arteries, along with percentage of blockage of each lesion. Barnhart and Sampson first proposed the multiple population model to analyse multivariate random length data without covariates. This paper extends their approach to deal with multiple covariates. We propose a new multiple population regression model with covariates, and discuss the estimation issues. We analyse data from the TYPE II coronary intervention study to illustrate the methodology.

15.
When analyzing longitudinal data, it is essential to account both for the correlation inherent from the repeated measures of the responses as well as the correlation realized on account of the feedback created between the responses at a particular time and the predictors at other times. As such one can analyze these data using generalized estimating equation with the independent working correlation. However, because it is essential to include all the appropriate moment conditions when solving for the regression coefficients, we explore an alternative approach using a generalized method of moments for estimating the coefficients in such data. We develop an approach that makes use of all the valid moment conditions necessary with each time‐dependent and time‐independent covariate. This approach does not assume that feedback is always present over time or that, if present, it occurs to the same degree. Further, we make use of continuously updating generalized method of moments in obtaining estimates. We fit the generalized method of moments logistic regression model with time‐dependent covariates using SAS PROC IML and also in R. We used p‐values adjusted for multiple correlated tests to determine the appropriate moment conditions for determining the regression coefficients. We examined two datasets for illustrative purposes. We looked at re‐hospitalization taken from a Medicare database. We also revisited data regarding the relationship between the body mass index and future morbidity among children in the Philippines. We conducted a simulated study to compare the performances of extended classifications. Copyright © 2014 John Wiley & Sons, Ltd.

16.
The analytical effect of the number of events per variable (EPV) in a proportional hazards regression analysis was evaluated using Monte Carlo simulation techniques for data from a randomized trial containing 673 patients and 252 deaths, in which seven predictor variables had an original significance level of p < 0.10. The 252 deaths and 7 variables correspond to 36 events per variable analyzed in the full data set.

Five hundred simulated analyses were conducted for these seven variables at EPVs of 2, 5, 10, 15, 20, and 25. For each simulation, a random exponential survival time was generated for each of the 673 patients, and the simulated results were compared with their original counterparts. As EPV decreased, the regression coefficients became more biased relative to the true value; the 90% confidence limits about the simulated values did not have a coverage of 90% for the original value; large sample properties did not hold for variance estimates from the proportional hazards model, and the Z statistics used to test the significance of the regression coefficients lost validity under the null hypothesis.

Although a single boundary level for avoiding problems is not easy to choose, the value of EPV = 10 seems most prudent. Below this value for EPV, the results of proportional hazards regression analyses should be interpreted with caution because the statistical model may not be valid.


17.
The least squares estimator of the slope in a simple linear regression model will be biased towards zero when the predictor is measured with random error, i.e. intra-individual variation or technical measurement error. A correction factor can be estimated from a reliability study where one replicate is available on a subset of subjects from the main study. Previous work in this field has assumed that the reliability study constitutes a random subsample from the main study. We propose that a more efficient design is to collect replicates for subjects with extreme values on their first measurement. A variance formula for this estimator of the correction factor is presented. The variance for the corrected estimated regression coefficient for the extreme selection technique is also derived and compared with random subsampling. Results show that variances for corrected regression coefficients can be markedly reduced with extreme selection. The variance gain can be estimated from the main study data. The results are illustrated using Monte Carlo simulations and an application on the relation between insulin sensitivity and fasting insulin using data from the population-based ULSAM study. In conclusion, an investigator faced with the planning of a reliability study may wish to consider an extreme selection design in order to improve precision at a given number of subjects or alternatively decrease the number of subjects at a given precision.

18.
In regression analysis for spatio‐temporal data, identifying clusters of spatial units over time in a regression coefficient could provide insight into the unique relationship between a response and covariates in certain subdomains of space and time windows relative to the background in other parts of the spatial domain and the time period of interest. In this article, we propose a varying coefficient regression method for spatial data repeatedly sampled over time, with heterogeneity in regression coefficients across both space and time. In particular, we extend a varying coefficient regression model for spatial‐only data to spatio‐temporal data with flexible temporal patterns. We consider the detection of a potential cylindrical cluster of regression coefficients based on testing whether the regression coefficient is the same or not over the entire spatial domain for each time point. For multiple clusters, we develop a sequential identification approach. We assess the power and identification of known clusters via a simulation study. Our proposed methodology is illustrated by the analysis of a cancer mortality dataset in the Southeast of the U.S.

19.
Many meta-analyses use a random-effects model to account for heterogeneity among study results, beyond the variation associated with fixed effects. A random-effects regression approach for the synthesis of 2 × 2 tables allows the inclusion of covariates that may explain heterogeneity. A simulation study found that the random-effects regression method performs well in the context of a meta-analysis of the efficacy of a vaccine for the prevention of tuberculosis, where certain factors are thought to modify vaccine efficacy. A smoothed estimator of the within-study variances produced less bias in the estimated regression coefficients. The method provided very good power for detecting a non-zero intercept term (representing overall treatment efficacy) but low power for detecting a weak covariate in a meta-analysis of 10 studies. We illustrate the model by exploring the relationship between vaccine efficacy and one factor thought to modify efficacy. The model also applies to the meta-analysis of continuous outcomes when covariates are present.

20.
Clinicians and health service researchers are frequently interested in predicting patient-specific probabilities of adverse events (e.g. death, disease recurrence, post-operative complications, hospital readmission). There is an increasing interest in the use of classification and regression trees (CART) for predicting outcomes in clinical studies. We compared the predictive accuracy of logistic regression with that of regression trees for predicting mortality after hospitalization with an acute myocardial infarction (AMI). We also examined the predictive ability of two other types of data-driven models: generalized additive models (GAMs) and multivariate adaptive regression splines (MARS). We used data on 9484 patients admitted to hospital with an AMI in Ontario. We used repeated split-sample validation: the data were randomly divided into derivation and validation samples. Predictive models were estimated using the derivation sample and the predictive accuracy of the resultant model was assessed using the area under the receiver operating characteristic (ROC) curve in the validation sample. This process was repeated 1000 times: the initial data set was randomly divided into derivation and validation samples 1000 times, and the predictive accuracy of each method was assessed each time. The mean ROC curve area for the regression tree models in the 1000 derivation samples was 0.762, while the mean ROC curve area of a simple logistic regression model was 0.845. The mean ROC curve areas for the other methods ranged from a low of 0.831 to a high of 0.851. Our study shows that regression trees do not perform as well as logistic regression for predicting mortality following AMI. However, the logistic regression model had performance comparable to that of more flexible, data-driven models such as GAMs and MARS.
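
The two building blocks of the validation scheme in item 20, a random derivation/validation split and the area under the ROC curve, can be sketched as follows. The helper names are hypothetical, the 50/50 split fraction is an assumption (the abstract does not state the proportion), and no model fitting is shown.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen event (label 1) has a higher score than a randomly chosen
    non-event (label 0); ties count as one half."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def split_sample(n, derivation_fraction=0.5, rng=None):
    """Randomly partition indices 0..n-1 into derivation and validation sets."""
    rng = rng or random.Random()
    indices = list(range(n))
    rng.shuffle(indices)
    cut = int(n * derivation_fraction)
    return indices[:cut], indices[cut:]


if __name__ == "__main__":
    # Hypothetical predicted risks and observed outcomes for four patients.
    print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
    derivation, validation = split_sample(10, rng=random.Random(1))
    print(len(derivation), len(validation))
```

Repeating the split many times, fitting each model on the derivation indices, and averaging `auc` over the validation indices gives the kind of mean ROC curve area reported in the abstract.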
