Similar literature
 20 similar documents found
1.
A natural way of modelling relative survival through regression analysis is to assume an additive form between the expected population hazard and the excess hazard due to the presence of an additional cause of mortality. Within this context, the existing approaches in the parametric, semiparametric and non-parametric setting are compared and discussed. We study the additive excess hazards models, where the excess hazard is of additive form. This makes it possible to assess the importance of time-varying effects for regression models in the relative survival framework. We show how recent developments can be used to make inferential statements about the non-parametric version of the model. This makes it possible to test the key hypothesis that an excess risk effect is time varying in contrast to being constant over time. In case some covariate effects are constant, we show how the semiparametric additive risk model can be considered in the excess risk setting, providing a better and more useful summary of the data. Estimators have explicit form and inference based on a resampling scheme is presented for both the non-parametric and semiparametric models. We also describe a new suggestion for goodness of fit of relative survival models, which consists of statistical and graphical tests based on cumulative martingale residuals. This is illustrated on the semiparametric model with proportional excess hazards. We analyze data from the TRACE study using different approaches and show the need for more flexible models in relative survival.
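The additive structure described in this abstract can be written out as a short sketch. The notation below ($\lambda_P$, $\lambda_E$, $X$, $Z$, $\beta$, $\gamma$) is assumed here for illustration and is not taken from the paper itself; the population hazard $\lambda_P$ would in practice also depend on demographic variables such as age and sex.

```latex
\begin{align*}
\lambda(t \mid X, Z) &= \lambda_P(t) + \lambda_E(t \mid X, Z)
  && \text{(observed hazard = expected population hazard + excess hazard)} \\
\lambda_E(t \mid X) &= X^{\top}\beta(t)
  && \text{(non-parametric additive excess hazard)} \\
\lambda_E(t \mid X, Z) &= X^{\top}\beta(t) + Z^{\top}\gamma
  && \text{(semiparametric: effects of $Z$ constant over time)}
\end{align*}
```

Under this notation, the hypothesis that the effect of the $j$-th covariate is constant over time reads $H_0\colon \beta_j(t) \equiv \gamma_j$ for some constant $\gamma_j$.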

2.
Clinicians and health service researchers are frequently interested in predicting patient-specific probabilities of adverse events (e.g. death, disease recurrence, post-operative complications, hospital readmission). There is an increasing interest in the use of classification and regression trees (CART) for predicting outcomes in clinical studies. We compared the predictive accuracy of logistic regression with that of regression trees for predicting mortality after hospitalization with an acute myocardial infarction (AMI). We also examined the predictive ability of two other types of data-driven models: generalized additive models (GAMs) and multivariate adaptive regression splines (MARS). We used data on 9484 patients admitted to hospital with an AMI in Ontario. We used repeated split-sample validation: the data were randomly divided into derivation and validation samples. Predictive models were estimated using the derivation sample and the predictive accuracy of the resultant model was assessed using the area under the receiver operating characteristic (ROC) curve in the validation sample. This process was repeated 1000 times: the initial data set was randomly divided into derivation and validation samples 1000 times, and the predictive accuracy of each method was assessed each time. The mean ROC curve area for the regression tree models in the 1000 derivation samples was 0.762, while the mean ROC curve area of a simple logistic regression model was 0.845. The mean ROC curve areas for the other methods ranged from a low of 0.831 to a high of 0.851. Our study shows that regression trees do not perform as well as logistic regression for predicting mortality following AMI. However, the logistic regression model had performance comparable to that of more flexible, data-driven models such as GAMs and MARS.
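The repeated split-sample validation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the `fit` callback (which takes derivation data and returns a scoring function), and the default split fraction are all assumptions made for the example.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Counts, over all (event, non-event) pairs, how often the event has the
    higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def repeated_split_auc(x, y, fit, n_repeats=1000, train_frac=0.5, seed=0):
    """Mean validation-sample AUC over repeated random splits.

    `fit(x_train, y_train)` is a hypothetical callback that must return a
    function mapping one observation to a risk score.
    """
    rng = random.Random(seed)
    idx = list(range(len(y)))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))
        train, valid = idx[:cut], idx[cut:]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        scores = [predict(x[i]) for i in valid]
        aucs.append(auc(scores, [y[i] for i in valid]))
    return sum(aucs) / len(aucs)


# Hand-checkable AUC: 3 of the 4 (event, non-event) pairs are ordered correctly.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # → 0.75
```

Averaging the validation AUC over many random splits reduces the dependence of the accuracy estimate on any single partition of the data.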

3.
Statistical methods for identifying harmful chemicals in a correlated mixture often assume linearity in exposure-response relationships. Nonmonotonic relationships are increasingly recognized (e.g., for endocrine-disrupting chemicals); however, the impact of nonmonotonicity on exposure selection has not been evaluated. In a simulation study, we assessed the performance of Bayesian kernel machine regression (BKMR), Bayesian additive regression trees (BART), Bayesian structured additive regression with spike-slab priors (BSTARSS), generalized additive models with double penalty (GAMDP) and thin plate shrinkage smoothers (GAMTS), multivariate adaptive regression splines (MARS), and lasso penalized regression. We simulated realistic exposure data based on pregnancy exposure to 17 phthalates and phenols in the US National Health and Nutrition Examination Survey using a multivariate copula. We simulated data sets of size N = 250 and compared methods across 32 scenarios, varying by model size and sparsity, signal-to-noise ratio, correlation structure, and exposure-response relationship shapes. We compared methods in terms of their sensitivity, specificity, and estimation accuracy. In most scenarios, BKMR, BSTARSS, GAMDP, and GAMTS achieved moderate to high sensitivity (0.52-0.98) and specificity (0.21-0.99). BART and MARS achieved high specificity (≥0.90), but low sensitivity in low signal-to-noise ratio scenarios (0.20-0.51). Lasso was highly sensitive (0.71-0.99), except for quadratic relationships (≤0.27). Penalized regression methods that assume linearity, such as lasso, may not be suitable for studies of environmental chemicals hypothesized to have nonmonotonic relationships with outcomes. Instead, BKMR, BSTARSS, GAMDP, and GAMTS are attractive methods for flexibly estimating the shapes of exposure-response relationships and selecting among correlated exposures.

4.
Ghosh D. Statistics in Medicine, 2006, 25(11): 1872-1884
There has been some recent work in the statistical literature for modelling the relationship between tumour biology properties and tumour progression in screening trials. While non-parametric methods have been proposed for estimation of the tumour size distribution at which metastatic transition occurs, their asymptotic properties have not been studied. In addition, no testing or regression methods are available so that potential confounders and prognostic factors can be adjusted for. In this article, we develop a unified approach to the non-parametric and semi-parametric analysis of tumour size-metastasis data. A connection between the models considered by previous authors and survival data structures is discussed. Based on this relationship, we develop non-parametric testing procedures and semi-parametric regression methodology for modelling the effect of size of tumour on the probability at which metastatic transitions occur in two situations. Asymptotic properties of these estimators are provided. The proposed methodology is applied to data from a screening study in lung cancer.

5.
QSAR models for analogs of antiplasmodial artemisinin compounds were established, based on atomic net charges, by using multivariate adaptive regression splines (MARS) in comparison with some other methods such as multiple linear regression, alternating conditional expectations and projection pursuit regression. The established models were then evaluated by an ANOVA decomposition procedure so that the effects of each predictor (additive or interaction) could be viewed graphically, facilitating the interpretation of the underlying relationship. It was found that the QSARs derived from the MARS method are the most satisfactory predictive models, and that the artemisinin pharmacophore identification is in agreement with previous experimental findings.

6.
In many biomedical studies, interest is often attached to calculating effect measures in the presence of interactions between two continuous exposures. Traditional approaches based on parametric regression are limited by the degree of arbitrariness involved in transforming these exposures into categorical variables or imposing a parametric form on the regression function. In this paper, we present: (a) a flexible non-parametric method for estimating effect measures through generalized additive models including interactions; and (b) bootstrap techniques for (i) testing the significance of interaction terms, and (ii) constructing confidence intervals for effect measures. The validity of our methodology is supported by simulations, and illustrated using data from a study of possible risk factors for post-operative infection. This application revealed a hitherto unreported effect: for patients with high plasma glucose levels, increased risk is associated, not only with low, but also with high percentages of lymphocytes.

7.
Tree-structured survival analysis (TSSA) is a popular alternative to the Cox proportional hazards regression in medical research of survival data. Several methods for constructing a tree of different survival profiles have been developed, including TSSA based on log-rank statistics, martingale residuals, Lp Wasserstein metrics between Kaplan-Meier survival curves, and a method based on a weighted average of the within-node impurity of the death indicator and the within-node loss function of follow-up times. Lu and others used variance of restricted mean lifetimes as an index of degree of separation (DOS) to measure the efficiency in separations of survival profiles by a classification method. Like tree-based regression analysis that uses variance as a criterion for node partition and pruning, the variance of restricted mean lifetimes between different groups can be an alternative index to log-rank test statistics in construction of survival trees. In this article, the authors explore the use of DOS in TSSA. They propose an algorithm similar to the least square regression tree for survival analysis based on the variance of the restricted mean lifetimes. They apply the proposed method to prospective cohort data from the Study of Osteoporotic Fracture that motivated the research and then compare their classification rule to those rules based on the conventional TSSA mentioned above. A limited simulation study suggests that the proposed algorithm is a competitive alternative to the log-rank or martingale residual-based TSSA approaches.
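The restricted mean lifetime that this abstract builds on is the area under the survival curve up to a chosen horizon. The sketch below estimates it from a Kaplan-Meier curve; it illustrates only that building block, not the authors' DOS index or tree algorithm, and the function names and the horizon argument `tau` are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival) steps.

    `events[i]` is 1 if the i-th subject died at `times[i]`, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def restricted_mean_lifetime(times, events, tau):
    """Area under the Kaplan-Meier step function on the interval [0, tau]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Four deaths at times 1..4: the curve steps 1, 0.75, 0.5, 0.25 over [0, 4],
# so the restricted mean lifetime at tau = 4 is 2.5 (up to rounding).
print(restricted_mean_lifetime([1, 2, 3, 4], [1, 1, 1, 1], tau=4))
```

A splitting rule in the spirit of the abstract would compare this quantity between candidate child nodes, much as a least-squares regression tree compares node means.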

8.
A regression method that utilizes an additive model is proposed for the estimation of attributable risk in case-control studies carried out in defined populations. In contrast to previous multivariate procedures for the estimation of attributable risk, which have utilized logistic regression techniques to adjust for confounding factors, the model assumes an additive relation between the covariates included in the regression equation. As an empirical example, additive and logistic models were fitted to matched case-control data from a population-based study of childhood astrocytoma brain tumors. Although both models fitted the data well, the additive model provided a more satisfactory estimate of the risk attributable to multiple exposures, in the absence of significant additive interaction. In contrast to the results from the logistic model, the adjusted estimates of the risk attributable to each factor included in the additive model summed to the overall estimate for all of the factors considered jointly. Thus, the additive approach provides a useful alternative to existing procedures for the multivariate estimation of attributable risk when the additive model is determined to be appropriate on the basis of goodness-of-fit.

9.
In the biology of complex disorders, such as atherothrombosis, interactions among genetic factors may play an important role, and theoretical considerations suggest that gene-gene interactions are quite common in such diseases. We used a nested case-control sample from the Physicians' Health Study, a randomized trial assessing the effects of aspirin and beta-carotene on cardiovascular disease and cancer among 22071 US male physicians, to examine these relationships for ischemic stroke. Data were available on 92 polymorphisms from 56 candidate genes related to inflammation, thrombosis and lipid metabolism, assessed in 319 incident cases of ischemic stroke and 2090 disease-free controls. We used classification and regression trees (CART) and multivariate adaptive regression spline (MARS) models to explore the presence of genetic interactions in these data. These models offer advantages over typical logistic regression methods in that they may uncover interactions among genes that do not exhibit strong marginal effects. Final models were selected using either the Bayes Information Criterion or cross-validation. Model fit was assessed using 10-fold cross-validation of the entire selection process. Both the CART and two-way MARS-logit models identified an interaction between two polymorphisms linked to inflammation, the P-selectin (val640leu) and interleukin-4 (C(582) T) genes. Internal validation of these models, however, suggested that effects of these polymorphisms are additive. Although further external validation of these models is necessary, these methods may be valuable in exploring and identifying potential gene-gene as well as gene-environment interactions in association studies.

10.
BACKGROUND: The traditional method of analysing continuous or ordinal risk factors by categorization or linear models may be improved. METHODS: We propose an approach based on transformation and fractional polynomials which yields simple regression models with interpretable curves. We suggest a way of presenting the results from such models which involves tabulating the risks estimated from the model at convenient values of the risk factor. We discuss how to incorporate several continuous risk and confounding variables within a single model. The approach is exemplified with data from the Whitehall I study of British Civil Servants. We discuss the approach in relation to categorization and non-parametric regression models. RESULTS: We show that non-linear risk models fit the data better than linear models. We discuss the difficulties introduced by categorization and the advantages of the new approach. CONCLUSIONS: Our approach based on fractional polynomials should be considered as an important alternative to the traditional approaches for the analysis of continuous variables in epidemiological studies.
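As a rough illustration of the fractional polynomial idea in this abstract, the sketch below selects a first-degree fractional polynomial: it tries each power from the conventional set {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, with power 0 read as the logarithm, and keeps the one with the smallest residual sum of squares. Ordinary least squares on a continuous outcome is a simplifying assumption made here (risk models of the kind the abstract discusses would normally use logistic or Cox regression), and all function names are hypothetical.

```python
import math

# Conventional first-degree fractional polynomial powers; 0 denotes log(x).
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Fractional polynomial transform of a strictly positive value."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if p == 0 else x ** p


def fit_line(z, y):
    """Simple least-squares fit y = b0 + b1 * z; returns (b0, b1, rss)."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * z_bar
    rss = sum((yi - b0 - b1 * zi) ** 2 for zi, yi in zip(z, y))
    return b0, b1, rss


def best_fp1(x, y):
    """Return (power, b0, b1) for the best-fitting model y = b0 + b1 * x^(power)."""
    fits = []
    for p in POWERS:
        z = [fp_transform(xi, p) for xi in x]
        b0, b1, rss = fit_line(z, y)
        fits.append((rss, p, b0, b1))
    rss, p, b0, b1 = min(fits)
    return p, b0, b1


# Data generated from y = 2 + 3 * log(x), so the log transform (power 0) should win.
xs = [1, 2, 3, 4, 5, 6]
ys = [2 + 3 * math.log(v) for v in xs]
print(best_fp1(xs, ys))
```

The fitted curve can then be tabulated at convenient values of the risk factor, which is the style of presentation the abstract suggests.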

11.
The objective of this study was to model the age-time-dependent incidence of hepatitis B while estimating the impact of vaccination. While stochastic models/time-series have been used before to model hepatitis B cases in the absence of knowledge on the number of susceptibles, this paper proposes a method that fits into the generalized additive model framework. Generalized additive models with penalized regression splines are used to exploit the underlying continuity of both age and time in a flexible non-parametric way. Based on a unique case notification dataset, we have shown that the implemented immunization programme in Bulgaria resulted in a significant decrease in incidence of 82% (79-84%) for infants in their first year of life. Moreover, we have shown that, conditional on an assumed baseline susceptibility percentage, a smooth force-of-infection profile can be obtained from which two local maxima were observed at ages 9 and 24 years.

12.
13.
Multi‐state models of chronic disease are becoming increasingly important in medical research to describe the progression of complicated diseases. However, studies seldom observe health outcomes over long time periods. Therefore, current clinical research focuses on the secondary data analysis of the published literature to estimate a single transition probability within the entire model. Unfortunately, there are many difficulties when using secondary data, especially since the states and transitions of published studies may not be consistent with the proposed multi‐state model. Early approaches to reconciling published studies with the theoretical framework of a multi‐state model have been limited to data available as cumulative counts of progression. This paper presents an approach that allows the use of published regression data in a multi‐state model when the published study may have ignored intermediary states in the multi‐state model. Colloquially, we call this approach the Lemonade Method since when study data give you lemons, make lemonade. The approach uses maximum likelihood estimation. An example is provided for the progression of heart disease in people with diabetes. Copyright © 2009 John Wiley & Sons, Ltd.

14.
Zhang Z, Sun J, Sun L. Statistics in Medicine, 2005, 24(9): 1399-1407
Current status data arise when each study subject is observed only once and the survival time of interest is known only to be either less or greater than the observation time. Such data often occur in, for example, cross-sectional studies, demographical investigations and tumorigenicity experiments and several semi-parametric and non-parametric methods for their analysis have been proposed. However, most of these methods deal only with the situation where observation time is independent of the underlying survival time completely or given covariates. This paper discusses regression analysis of current status data when the observation time may be related to the underlying survival time and inference procedures are presented for estimation of regression parameters under the additive hazards regression model. The procedures can be easily implemented and are applied to two motivating examples.

15.
Flexible regression models with cubic splines   Total citations: 31, self-citations: 0, citations by others: 31
We describe the use of cubic splines in regression models to represent the relationship between the response variable and a vector of covariates. This simple method can help prevent the problems that result from inappropriate linearity assumptions. We compare restricted cubic spline regression to non-parametric procedures for characterizing the relationship between age and survival in the Stanford Heart Transplant data. We also provide an illustrative example in cancer therapeutics.
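For readers unfamiliar with restricted cubic splines, the sketch below builds one common form of the basis: cubic between knots and constrained to be linear beyond the outermost knots. It uses the unnormalized truncated-power construction; the function name and the example knots are assumptions made for illustration, and the abstract itself does not specify a parameterization.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis at x for strictly increasing knots.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. Each nonlinear term is
    zero below its own knot and linear in x above the last knot.
    """
    k = len(knots)
    if k < 3 or list(knots) != sorted(set(knots)):
        raise ValueError("need at least 3 strictly increasing knots")
    t_last, t_prev = knots[-1], knots[-2]
    gap = t_last - t_prev

    def cube_plus(u):
        # Truncated cubic: u^3 for u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    basis = [float(x)]
    for t_j in knots[:-2]:
        term = (
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / gap
            + cube_plus(x - t_last) * (t_prev - t_j) / gap
        )
        basis.append(term)
    return basis


# With 4 knots the basis has 3 columns: the linear term plus 2 nonlinear terms.
print(rcs_basis(2.5, [1, 2, 3, 4]))
```

Regressing the response on these columns (plus an intercept) gives a smooth curve with k - 1 slope parameters, which is why the approach is a simple guard against inappropriate linearity assumptions.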

16.
Epidemiologists have used the term ‘tracking’ to connote an individual's maintenance of relative rank of some longitudinally measured characteristic over a given time span. To assess the extent to which an attribute tracks we have first to summarize individual growth curves, and second to quantify the notion of maintenance of relative rank, both in the face of random error. A sequence of papers appearing in 1981 provided differing methodologies for appraising tracking. Here we take a different approach to tracking by using regression trees for longitudinal data. The above two concerns are simultaneously addressed in that the procedure identifies subgroups, defined in terms of covariates, within which the collection of growth curves is homogeneous. After reviewing the existing approaches to tracking we describe the tree-structured methodology, and present an illustrative example pertaining to lung function growth in children.

17.
This paper gives further developments of a non-parametric linear regression model in survival analysis. Three subjects are studied. First, martingale residuals, originally developed for the Cox model, are introduced for our linear model. Their theory is developed and they are shown to be useful for judging goodness of fit. The second focus of the paper is on the use of bootstrap replications to judge which features of the cumulative regression plots are likely to reflect real phenomena and not merely random variation. In particular, this is applied to judging whether the effect of a covariate disappears over time, a problem for which no formal test exists. The third subject is density type, or kernel, estimation of the regression functions themselves. This might give more direct information than the cumulative plots. The approaches are illustrated by data from a clinical trial of carcinoma of the oropharynx, and by survival times of grafts in renal patients.

18.
Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, exceeding that of their competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one‐sample and two‐sample scenarios, in comparison with long‐standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd.

19.
Recurrent event data are commonly encountered in health-related longitudinal studies. In this paper time-to-events models for recurrent event data are studied with non-informative and informative censorings. In the statistical literature, the risk set methods have been confirmed to serve as an appropriate and efficient approach for analysing recurrent event data when censoring is non-informative. This approach produces biased results, however, when censoring is informative for the time-to-events outcome data. We compare the risk set methods with alternative non-parametric approaches which are robust subject to informative censoring. In particular, non-parametric procedures for the estimation of the cumulative occurrence rate function (CORF) and the occurrence rate function (ORF) are discussed in detail. Simulation and an analysis of data from the AIDS Link to Intravenous Experiences Cohort Study are presented.

20.
In linear mixed models the influence of covariates is restricted to a strictly parametric form. With the rise of semi- and non-parametric regression, the mixed model has also been expanded to allow for additive predictors. The common approach uses the representation of additive models as mixed models. An alternative approach that is proposed in the present paper is likelihood based boosting. Boosting originates in the machine learning community where it has been proposed as a technique to improve classification procedures by combining estimates with reweighted observations. Likelihood based boosting is a general method which may be seen as an extension of L2 boost. In additive mixed models the advantage of boosting techniques in the form of componentwise boosting is that it is suitable for high dimensional settings where many explanatory variables are present. It allows additive models to be fitted for many covariates with implicit selection of relevant variables and automatic selection of smoothing parameters. Moreover, boosting techniques may be used to incorporate the subject-specific variation of smooth influence functions by specifying 'random slopes' on smooth effects. This results in flexible semiparametric mixed models which are appropriate in cases where a simple random intercept is unable to capture the variation of effects across subjects.
