Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Tutz G, Binder H. Statistics in Medicine 2004, 23(15): 2445-2461
Discrete survival models have been extended in several ways. More flexible models are obtained by including time-varying coefficients and covariates which determine the hazard rate in an additive but not further specified form. In this paper, a general model is considered which comprises both types of covariate effects. An additional extension is the incorporation of smooth interaction between time and covariates. Thus, in the linear predictor, smooth effects of covariates which may vary across time are allowed. It is shown how simple duration models produce artefacts which may be avoided by flexible models. For the general model, which includes parametric terms, time-varying coefficients in parametric terms, and time-varying smooth effects, estimation procedures are derived which are based on the regularized expansion of smooth effects in basis functions. The approach is used to model the sojourn time in a psychiatric hospital. It is demonstrated how initial conditions which have non-linear influence are damped over time.

2.
Leng C, Ma S. Statistics in Medicine 2007, 26(20): 3753-3770
As a flexible alternative to the Cox model, the additive risk model assumes that the hazard function is the sum of the baseline hazard and a regression function of covariates. For right censored survival data when variable selection is needed along with model estimation, we propose a path consistent model selector using a modified Lasso approach, under the additive risk model assumption. We show that the proposed estimator possesses the oracle variable selection and estimation property. Applications of the proposed approach to three right censored survival data sets show that the proposed modified Lasso yields parsimonious models with satisfactory estimation and prediction results.
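For orientation, the model class named in this abstract can be written down in one line. The sketch below assumes the common linear form of the regression function; the abstract itself says only "a regression function of covariates", so the symbols $\lambda_0$, $\beta$, and $Z$ are illustrative notation, not taken from the paper.

```latex
% Additive risk model, assuming a linear regression function:
% hazard = baseline hazard + linear covariate effect
\lambda(t \mid Z) = \lambda_0(t) + \beta^{\top} Z
```

In contrast, the Cox model multiplies the baseline hazard by $\exp(\beta^{\top} Z)$, so under the additive form the covariates shift the hazard rather than scale it.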

3.
In many statistical regression and prediction problems, it is reasonable to assume monotone relationships between certain predictor variables and the outcome. Genomic effects on phenotypes are, for instance, often assumed to be monotone. However, in some settings, it may be reasonable to assume a partially linear model, where some of the covariates can be assumed to have a linear effect. One example is a prediction model using both high-dimensional gene expression data, and low-dimensional clinical data, or when combining continuous and categorical covariates. We study methods for fitting the partially linear monotone model, where some covariates are assumed to have a linear effect on the response, and some are assumed to have a monotone (potentially nonlinear) effect. Most existing methods in the literature for fitting such models are subject to the limitation that they have to be provided the monotonicity directions a priori for the different monotone effects. We here present methods for fitting partially linear monotone models which perform both automatic variable selection, and monotonicity direction discovery. The proposed methods perform comparably to, or better than, existing methods, in terms of estimation, prediction, and variable selection performance, in simulation experiments in both classical and high-dimensional data settings.

4.
Yang Y, Kang J, Mao K, Zhang J. Statistics in Medicine 2007, 26(20): 3782-3800
In this article we develop flexible regression models in two respects: to evaluate the influence of the covariates on mixed Poisson and continuous responses, and to evaluate how the correlation between the Poisson response and the continuous response changes over time. We propose an approach for regression models of mixed continuous and Poisson responses when the variance is heterogeneous and the correlation changes over time. Our general approach is first to jointly build a marginal model and to check, via a likelihood ratio test, whether the variance and correlation change over time. If they do, we apply a suitable data transformation to properly evaluate the influence of the covariates on the mixed responses. The proposed methods are applied to the interstitial cystitis data base (ICDB) cohort study, and we find that the positive correlations significantly change over time, which suggests heterogeneous variances should not be ignored in modelling and inference.

5.
In most epidemiological investigations, the study units are people, the outcome variable (or the response) is a health‐related event, and the explanatory variables are usually environmental and/or socio‐demographic factors. The fundamental task in such investigations is to quantify the association between the explanatory variables (covariates/exposures) and the outcome variable through a suitable regression model. The accuracy of such quantification depends on how precisely the relevant covariates are measured. In many instances, we cannot measure some of the covariates accurately. Rather, we can measure noisy (mismeasured) versions of them. In statistical terminology, mismeasurement in continuous covariates is known as measurement errors or errors‐in‐variables. Regression analyses based on mismeasured covariates lead to biased inference about the true underlying response–covariate associations. In this paper, we suggest a flexible parametric approach for avoiding this bias when estimating the response–covariate relationship through a logistic regression model. More specifically, we consider the flexible generalized skew‐normal and the flexible generalized skew‐t distributions for modeling the unobserved true exposure. For inference and computational purposes, we use Bayesian Markov chain Monte Carlo techniques. We investigate the performance of the proposed flexible parametric approach in comparison with a common flexible parametric approach through extensive simulation studies. We also compare the proposed method with the competing flexible parametric method on a real‐life data set. Though emphasis is put on the logistic regression model, the proposed method is unified and is applicable to the other generalized linear models, and to other types of non‐linear regression models as well. Copyright © 2009 John Wiley & Sons, Ltd.

6.
In studies using ecological momentary assessment (EMA), or other intensive longitudinal data collection methods, interest frequently centers on changes in the variances, both within‐subjects and between‐subjects. For this, Hedeker et al. (Biometrics 2008; 64: 627–634) developed an extended two‐level mixed‐effects model that treats observations as being nested within subjects and allows covariates to influence both the within‐subjects and between‐subjects variance, beyond their influence on means. However, in EMA studies, subjects often provide many responses within and across days. To account for the possible systematic day‐to‐day variation, we developed a more flexible three‐level mixed‐effects location scale model that treats observations as nested within days within subjects, and allows covariates to influence the variance at the subject, day, and observation level (over and above their usual effects on means) using a log‐linear representation throughout. We provide details of a maximum likelihood solution and demonstrate how SAS PROC NLMIXED can be used to achieve maximum likelihood estimates in an alternative parameterization of our proposed three‐level model. The accuracy of this approach using NLMIXED was verified by a series of simulation studies. Data from an adolescent mood study using EMA were analyzed to demonstrate this approach. The analyses clearly show the benefit of the proposed three‐level model over the existing two‐level approach. The proposed model has useful applications in many studies with three‐level structures where interest centers on the joint modeling of the mean and variance structure. Copyright © 2012 John Wiley & Sons, Ltd.

7.
Motivated by high‐throughput profiling studies in biomedical research, variable selection methods have been a focus for biostatisticians. In this paper, we consider semiparametric varying‐coefficient accelerated failure time models for right censored survival data with high‐dimensional covariates. Instead of adopting the traditional regularization approaches, we offer a novel sparse boosting (SparseL2Boosting) algorithm to conduct model‐based prediction and variable selection. One main advantage of this new method is that we do not need to perform the time‐consuming selection of tuning parameters. Extensive simulations are conducted to examine the performance of our sparse boosting feature selection techniques. We further illustrate our methods using a lung cancer data analysis.

8.
In developing regression models, data analysts are often faced with many predictor variables that may influence an outcome variable. After more than half a century of research, the 'best' way of selecting a multivariable model is still unresolved. It is generally agreed that subject matter knowledge, when available, should guide model building. However, such knowledge is often limited, and data-dependent model building is required. We limit the scope of the modelling exercise to selecting important predictors and choosing interpretable and transportable functions for continuous predictors. Assuming linear functions, stepwise selection and all-subset strategies are discussed; the key tuning parameters are the nominal P-value for testing a variable for inclusion and the penalty for model complexity, respectively. We argue that stepwise procedures perform better than a literature-based assessment would suggest. Concerning selection of functional form for continuous predictors, the principal competitors are fractional polynomial functions and various types of spline techniques. We note that a rigorous selection strategy known as multivariable fractional polynomials (MFP) has been developed. No spline-based procedure for simultaneously selecting variables and functional forms has found wide acceptance. Results of FP and spline modelling are compared in two data sets. It is shown that spline modelling, while extremely flexible, can generate fitted curves with uninterpretable 'wiggles', particularly when automatic methods for choosing the smoothness are employed. We give general recommendations to practitioners for carrying out variable and function selection. While acknowledging that further research is needed, we argue why MFP is our preferred approach for multivariable model building with continuous covariates.

9.
The use of the Aalen additive approach is proposed to model cost data. Using a Monte Carlo simulation over a wide set of scenarios, we showed that the Aalen model performs well and can be a reasonable alternative to the standard Gamma regression models. In addition, with reference to the COSTAMI trial data, we highlighted the ability of the Aalen model to offer additional information about the relationships between costs and specific covariates, as compared with standard regression techniques.

10.
We develop methodology for causal inference in observational studies when using propensity score subclassification on data constructed with probabilistic record linkage techniques. We focus on scenarios where covariates and binary treatment assignments are in one file and outcomes are in another file, and the goal is to estimate an additive treatment effect by merging the files. We assume that the files can be linked using variables common to both files, eg, names or birth dates, but that links are subject to errors, eg, due to reporting errors in the linking variables. We develop methodology for cases where such reporting errors are independent of the other variables on the files. We describe conceptually how linkage errors can affect causal estimates in subclassification contexts. We also present and evaluate several algorithms for deciding which record pairs to use in estimation of causal effects. Using simulation studies, we demonstrate that case selection procedures can result in improved accuracy in estimates of treatment effects from linked data compared to using only cases known to be true links.

11.
Extensive baseline covariate information is routinely collected on participants in randomized clinical trials, and it is well recognized that a proper covariate‐adjusted analysis can improve the efficiency of inference on the treatment effect. However, such covariate adjustment has engendered considerable controversy, as post hoc selection of covariates may involve subjectivity and may lead to biased inference, whereas prior specification of the adjustment may exclude important variables from consideration. Accordingly, how to select covariates objectively to gain maximal efficiency is of broad interest. We propose and study the use of modern variable selection methods for this purpose in the context of a semiparametric framework, under which variable selection in modeling the relationship between outcome and covariates is separated from estimation of the treatment effect, circumventing the potential for selection bias associated with standard analysis of covariance methods. We demonstrate that such objective variable selection techniques combined with this framework can identify key variables and lead to unbiased and efficient inference on the treatment effect. A critical issue in finite samples is validity of estimators of uncertainty, such as standard errors and confidence intervals for the treatment effect. We propose an approach to estimation of sampling variation of estimated treatment effect and show its superior performance relative to that of existing methods. Copyright © 2012 John Wiley & Sons, Ltd.

12.
Goodness-of-fit tests for ordinal response regression models   Total citations: 1 (self-citations: 0, citations by others: 1)
It is well documented that the commonly used Pearson chi-square and deviance statistics are not adequate for assessing goodness-of-fit in logistic regression models when continuous covariates are modelled. In recent years, several methods have been proposed which address this shortcoming in the binary logistic regression setting or assess model fit differently. However, these techniques have typically not been extended to the ordinal response setting and few techniques exist to assess model fit in that case. We present the modified Pearson chi-square and deviance tests that are appropriate for assessing goodness-of-fit in ordinal response models when both categorical and continuous covariates are present. The methods have good power to detect omitted interaction terms and reasonable power to detect failure of the proportional odds assumption or modelling the wrong functional form of a continuous covariate. These tests also provide immediate information as to where a model may not fit well. In addition, the methods are simple to understand and implement, and are non-specific. That is, they do not require prespecification of a type of lack-of-fit to detect.

13.
We study the problem of estimation and inference on the average treatment effect in a smoking cessation trial where an outcome and some auxiliary information were measured longitudinally, and both were subject to missing values. Dynamic generalized linear mixed effects models linking the outcome, the auxiliary information, and the covariates are proposed. The maximum likelihood approach is applied to the estimation and inference on the model parameters. The average treatment effect is estimated by the G‐computation approach, and the sensitivity of the treatment effect estimate to the nonignorable missing data mechanisms is investigated through the local sensitivity analysis approach. The proposed approach can handle missing data that form arbitrary missing patterns over time. We applied the proposed method to the analysis of the smoking cessation trial. Copyright © 2009 John Wiley & Sons, Ltd.

14.
PURPOSE: This paper introduces an approach that includes non-quantitative factors for the selection and assessment of multivariate complex models in health. METHODS: A goodness-of-fit based methodology combined with a fuzzy multi-criteria decision-making approach is proposed for model selection. Models were obtained using the Path Analysis (PA) methodology in order to explain the interrelationship between health determinants and the post-neonatal component of infant mortality in 59 municipalities of Brazil in the year 1991. Socioeconomic and demographic factors were used as exogenous variables, and environmental, health service and agglomeration as endogenous variables. Five PA models were developed and accepted by statistical criteria of goodness-of-fit. These models were then submitted to a group of experts, seeking to characterize their preferences, according to predefined criteria that tried to evaluate model relevance and plausibility. Fuzzy set techniques were used to rank the alternative models according to the number of times a model was superior to ("dominated") the others. RESULTS: The best-ranked model explained above 90% of the endogenous variables variation, and showed the favorable influences of income and education levels on post-neonatal mortality. It also showed the unfavorable effect on mortality of fast population growth, through precarious dwelling conditions and decreased access to sanitation. CONCLUSIONS: It was possible to aggregate expert opinions in model evaluation. The proposed procedure for model selection allowed the inclusion of subjective information in a clear and systematic manner.

15.
Estimates of additive interaction from case-control data are often obtained by logistic regression; such models can also be used to adjust for covariates. This approach to estimating additive interaction has come under some criticism because of possible misspecification of the logistic model: If the underlying model is linear, the logistic model will be misspecified. The authors propose an inverse probability of treatment weighting approach to causal effects and additive interaction in case-control studies. Under the assumption of no unmeasured confounding, the approach amounts to fitting a marginal structural linear odds model. The approach allows for the estimation of measures of additive interaction between dichotomous exposures, such as the relative excess risk due to interaction, using case-control data without having to rely on modeling assumptions for the outcome conditional on the exposures and covariates. Rather than using conditional models for the outcome, models are instead specified for the exposures conditional on the covariates. The approach is illustrated by assessing additive interaction between genetic and environmental factors using data from a case-control study.
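The relative excess risk due to interaction (RERI) mentioned in this abstract has a standard textbook definition for two dichotomous exposures. The Python sketch below shows only that arithmetic, not the authors' weighting estimator; the function name `reri`, its parameter names, and the example values are hypothetical. In case-control data, odds ratios are commonly used in place of relative risks.

```python
def reri(rr_10: float, rr_01: float, rr_11: float) -> float:
    """Relative excess risk due to interaction for two binary exposures.

    rr_10: relative risk with only the first exposure present
    rr_01: relative risk with only the second exposure present
    rr_11: relative risk with both exposures present
    All three are relative to the doubly unexposed group (RR = 1).
    """
    # RERI = RR11 - RR10 - RR01 + 1; zero means no additive interaction.
    return rr_11 - rr_10 - rr_01 + 1.0


if __name__ == "__main__":
    # Illustrative (made-up) relative risks, not data from the study.
    print(reri(rr_10=2.0, rr_01=3.0, rr_11=7.0))
```

A positive value indicates that the joint effect exceeds the sum of the separate effects on the additive scale, and a negative value indicates that it falls short.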

16.
In this paper, we consider fitting semiparametric additive hazards models for case‐cohort studies using a multiple imputation approach. In a case‐cohort study, main exposure variables are measured only on some selected subjects, but other covariates are often available for the whole cohort. We consider this as a special case of a missing covariate by design. We propose to employ a popular incomplete data method, multiple imputation, for estimation of the regression parameters in additive hazards models. For imputation models, an imputation modeling procedure based on a rejection sampling is developed. A simple imputation modeling that can naturally be applied to a general missing‐at‐random situation is also considered and compared with the rejection sampling method via extensive simulation studies. In addition, a misspecification aspect in imputation modeling is investigated. The proposed procedures are illustrated using a cancer data example. Copyright © 2015 John Wiley & Sons, Ltd.

17.
Longitudinal studies are often concerned with estimating the recurrence rate of a non-fatal event. In many cases, only the total number of events occurring during successive time intervals is known. We compared a mixed Poisson-gamma regression method proposed by Thall and a quasi-likelihood method proposed by Zeger and Liang for the analysis of such data, in the case where the mean was correctly specified, using simulation techniques with large samples. Both methods produced similar standard errors in most situations, except in the case of time-dependent covariates with non-Poisson-gamma data, where the standard errors were seriously underestimated by the Thall method. A simple method for discriminating between the variance forms of the two methods is described. The findings are applied to the analyses of clinical trials of non-melanoma skin cancer and familial polyposis. This study extends the findings of Breslow concerning variance misspecification in overdispersed Poisson and quasi-likelihood models to the longitudinal setting.

18.
A regression method that utilizes an additive model is proposed for the estimation of attributable risk in case-control studies carried out in defined populations. In contrast to previous multivariate procedures for the estimation of attributable risk, which have utilized logistic regression techniques to adjust for confounding factors, the model assumes an additive relation between the covariates included in the regression equation. As an empirical example, additive and logistic models were fitted to matched case-control data from a population-based study of childhood astrocytoma brain tumors. Although both models fitted the data well, the additive model provided a more satisfactory estimate of the risk attributable to multiple exposures, in the absence of significant additive interaction. In contrast to the results from the logistic model, the adjusted estimates of the risk attributable to each factor included in the additive model summed to the overall estimate for all of the factors considered jointly. Thus, the additive approach provides a useful alternative to existing procedures for the multivariate estimation of attributable risk when the additive model is determined to be appropriate on the basis of goodness-of-fit.

19.
In behavioral, biomedical, and social‐psychological sciences, it is common to encounter latent variables and heterogeneous data. Mixture structural equation models (SEMs) are very useful methods to analyze these kinds of data. Moreover, the presence of missing data, including both missing responses and missing covariates, is an important issue in practical research. However, limited work has been done on the analysis of mixture SEMs with non‐ignorable missing responses and covariates. The main objective of this paper is to develop a Bayesian approach for analyzing mixture SEMs with an unknown number of components, in which a multinomial logit model is introduced to assess the influence of some covariates on the component probability. Results of our simulation study show that the Bayesian estimates obtained by the proposed method are accurate, and the model selection procedure via a modified DIC is useful in identifying the correct number of components and in selecting an appropriate missing mechanism in the proposed mixture SEMs. A real data set related to a longitudinal study of polydrug use is employed to illustrate the methodology. Copyright © 2010 John Wiley & Sons, Ltd.

20.
We present a novel method for variable selection in regression models when covariates are measured with error. The iterative algorithm we propose, Measurement Error Boosting (MEBoost), follows a path defined by estimating equations that correct for covariate measurement error. We illustrate the use of MEBoost in practice by analyzing data from the Box Lunch Study, a clinical trial in nutrition where several variables are based on self-report and, hence, measured with error. There, we are interested in performing model selection from a large data set to select variables that are related to the number of times a subject binge ate in the last 28 days. Furthermore, through a simulation study, we evaluated our method and compared its performance to the recently proposed Convex Conditioned Lasso and to the "naive" Lasso, which does not correct for measurement error. Increasing the degree of measurement error increased prediction error and decreased the probability of accurate covariate selection, but this loss of accuracy occurred to a lesser degree when using MEBoost. Through simulations, we also make a case for the consistency of the model selected.
