Similar literature
 20 similar articles found (search time: 31 ms)
1.
Health status and outcomes are frequently measured on an ordinal scale. For high-throughput genomic datasets, the common approach to analyzing ordinal response data has been to break the problem into one or more dichotomous response analyses. This dichotomous response approach does not make use of all available data and therefore leads to loss of power and increases the number of type I errors. Herein we describe an innovative frequentist approach that combines two statistical techniques, L1 penalization and continuation ratio models, for modeling an ordinal response using gene expression microarray data. We conducted a simulation study to assess the performance of two computational approaches and two model selection criteria for fitting frequentist L1 penalized continuation ratio models. Moreover, we empirically compared the approaches using three application datasets, each of which seeks to classify an ordinal class using microarray gene expression data as the predictor variables. We conclude that the L1 penalized constrained continuation ratio model is a useful approach for modeling an ordinal response for datasets where the number of covariates (p) exceeds the sample size (n), and that the decision of whether to use the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) for selecting the final model should depend upon the similarities between the pathologies underlying the disease states to be classified.
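A continuation ratio model can be fitted with ordinary (or penalized) binary logistic regression software once the ordinal response has been restructured into stacked binary rows. The sketch below shows only that restructuring step. It assumes the forward formulation, which models P(Y = j | Y >= j), and levels coded 0 to K-1; the abstract does not say which formulation or coding the authors used, and the function name `expand_continuation_ratio` is hypothetical.

```python
from typing import List, Tuple


def expand_continuation_ratio(y: List[int], n_levels: int) -> List[Tuple[int, int, int]]:
    """Expand ordinal responses into stacked binary rows.

    Assumes the forward continuation ratio, P(Y = j | Y >= j), with levels
    coded 0 .. n_levels - 1 (an illustrative assumption, not the paper's code).
    Each returned triple is (observation index, level j, binary outcome).
    """
    rows: List[Tuple[int, int, int]] = []
    for i, yi in enumerate(y):
        if not 0 <= yi < n_levels:
            raise ValueError(f"response {yi} is outside 0..{n_levels - 1}")
        # An observation is "at risk" at level j only while Y >= j.
        # The top level needs no row: it is implied by failing every earlier level.
        for level in range(min(yi, n_levels - 2) + 1):
            rows.append((i, level, 1 if yi == level else 0))
    return rows


if __name__ == "__main__":
    for row in expand_continuation_ratio([0, 1, 2], n_levels=3):
        print(row)
```

Each stacked row would then be joined to that observation's covariates, with the level entering as a factor, before a binary logistic fit; that fitting step is left out here.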

2.
Lunt M 《Statistics in medicine》2005,24(9):1357-1369
There are a number of regression models which are widely used to predict ordinal outcomes. The commonly used models assume that all predictor variables have a similar effect at all levels of the outcome variable. If this is not the case, for example if some variables predict susceptibility to a disease and others predict the severity of the disease, then a more complex model is required. One possibility is the multinomial logistic regression model, which assumes that the predictor variables have different effects at all levels of the outcome variable. An alternative is to use the stereotype family of regression models. A one-dimensional stereotype model makes the assumption that the effect of each predictor is the same at all outcome levels. However, it is possible to fit stereotype models with more than one dimension, up to a maximum of min(k-1, p) where k is the number of outcome categories and p is the number of predictor variables. A stereotype model of this maximum dimension is equivalent to a multinomial logistic regression model, in that it will produce the same predicted values and log-likelihood. If there are sufficient outcome levels and/or predictor variables, there may be a number of stereotype models of differing dimension. The method is illustrated with an example of prediction of damage to joints in rheumatoid arthritis.

3.
Quality of life has been increasingly emphasized in public health research in recent years. Typically, the results of quality of life are measured by means of ordinal scales. In these situations, specific statistical methods are necessary because procedures such as either dichotomization or misinformation on the distribution of the outcome variable may complicate the inferential process. Ordinal logistic regression models are appropriate in many of these situations. This article presents a review of the proportional odds model, partial proportional odds model, continuation ratio model, and stereotype model. The fit, statistical inference, and comparisons between models are illustrated with data from a study on quality of life in 273 patients with schizophrenia. All tested models showed good fit, but the proportional odds or partial proportional odds models proved to be the best choice due to the nature of the data and ease of interpretation of the results. Ordinal logistic models perform differently depending on categorization of outcome, adequacy in relation to assumptions, goodness-of-fit, and parsimony.

4.
We examine goodness-of-fit tests for the proportional odds logistic regression model, the most commonly used regression model for an ordinal response variable. We derive a test statistic based on the Hosmer-Lemeshow test for binary logistic regression. Using a simulation study, we investigate the distribution and power properties of this test and compare these with those of three other goodness-of-fit tests. The new test has lower power than the existing tests; however, it was able to detect a greater number of the different types of lack of fit considered in this study. Moreover, the test allows for the results to be summarized in a contingency table of observed and estimated frequencies, which is a useful supplementary tool to assess model fit. We illustrate the ability of the tests to detect lack of fit using a study of aftercare decisions for psychiatrically hospitalized adolescents. The test proposed in this paper is similar to a recently developed goodness-of-fit test for multinomial logistic regression. A unified approach for testing goodness of fit is now available for binary, multinomial, and ordinal logistic regression models. Copyright © 2012 John Wiley & Sons, Ltd.
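For orientation, the binary Hosmer-Lemeshow statistic that this ordinal test builds on can be sketched as follows. This is the standard binary form only, not the ordinal extension the abstract proposes: observations are sorted by fitted probability, split into groups, and observed and expected event counts are compared per group. The function name, the equal-size grouping rule, and the default of ten groups are illustrative assumptions.

```python
from typing import Sequence


def hosmer_lemeshow(y: Sequence[int], p: Sequence[float], groups: int = 10) -> float:
    """Hosmer-Lemeshow chi-square statistic for binary outcomes.

    y holds 0/1 outcomes and p the fitted event probabilities.
    Groups are formed from the sorted fitted probabilities (roughly equal sizes).
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    order = sorted(range(len(y)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # more groups than observations
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        # n_g * pbar_g * (1 - pbar_g), written in terms of the expected count
        denom = expected * (1.0 - expected / len(idx))
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    print(hosmer_lemeshow([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75], groups=2))
```

In the binary case the statistic is conventionally referred to a chi-square distribution with (groups - 2) degrees of freedom; the reference distribution for the ordinal version is derived in the paper itself and is not reproduced here.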

5.
Classical methods for fitting a varying intercept logistic regression model to stratified data are based on the conditional likelihood principle to eliminate the stratum-specific nuisance parameters. When the outcome variable has multiple ordered categories, a natural choice for the outcome model is a stratified proportional odds or cumulative logit model. However, classical conditioning techniques do not apply to the general K-category cumulative logit model (K>2) with varying stratum-specific intercepts as there is no reduction due to sufficiency; the nuisance parameters remain in the conditional likelihood. We propose a methodology to fit a stratified proportional odds model by amalgamating conditional likelihoods obtained from all possible binary collapsings of the ordinal scale. The method allows for categorical and continuous covariates in a general regression framework. We provide a robust sandwich estimate of the variance of the proposed estimator. For binary exposures, we show equivalence of our approach to the estimators already proposed in the literature. The proposed recipe can be implemented very easily in standard software. We illustrate the methods via three real data examples related to biomedical research. Simulation results comparing the proposed method with a random effects model on the stratification parameters are also furnished.

6.
Ordinal regression models for epidemiologic data   Total citations: 7, self-citations: 0, citations by others: 7
Health status is often measured in epidemiologic studies on an ordinal scale, but data of this type are generally reduced for analysis to a single dichotomy. Several statistical models have been developed to make full use of information in ordinal response data, but have not been much used in analyzing epidemiologic studies. The authors discuss two of these statistical models: the cumulative odds model and the continuation ratio model. They may be interpreted in terms of odds ratios, can account for confounding variables, have clear and testable assumptions, and have parameters that may be estimated and hypotheses that may be tested using available statistical packages. However, calculations of asymptotic relative efficiency and results of simulations showed that simple logistic regression applied to dichotomized responses can in some realistic situations have more than 75% of the efficiency of ordinal regression models, but only if the ordinal scale is collapsed into a dichotomy close to the optimal point. The application of the proposed models to data from a study of chest x-rays of workers exposed to mineral fibers confirmed that they are easy to use and interpret, but gave results quite similar to those obtained using simple logistic regression after dichotomizing outcome in the conventional way.
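The cumulative odds (proportional odds) model can be illustrated by computing category probabilities from a set of cut-points and a linear predictor. The sketch below assumes the parameterization logit P(Y <= j) = theta_j - eta, where eta is the linear predictor; sign conventions differ between software packages, and the function names are hypothetical.

```python
import math
from typing import List


def logistic(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(cutpoints: List[float], eta: float) -> List[float]:
    """Category probabilities under a cumulative (proportional) odds model.

    Assumes logit P(Y <= j) = cutpoints[j] - eta with increasing cutpoints.
    Returns len(cutpoints) + 1 probabilities, one per ordered category.
    """
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = cum
    return probs


if __name__ == "__main__":
    print(proportional_odds_probs([-1.0, 1.0], eta=0.0))
    print(proportional_odds_probs([-1.0, 1.0], eta=1.0))
```

Under this parameterization a one-unit increase in eta divides the odds of Y <= j by e at every cut-point j, which is the "proportional odds" assumption the abstract describes as clear and testable.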

7.
In multilocus association analysis, since some markers may not be associated with a trait, it seems attractive to use penalized regression with the capability of automatic variable selection. On the other hand, in spite of a rapidly growing body of literature on penalized regression, most focus on variable selection and outcome prediction, for which penalized methods are generally more effective than their nonpenalized counterparts. However, for statistical inference, i.e. hypothesis testing and interval estimation, it is less clear how penalized methods would perform, or even how to best apply them, largely due to lack of studies on this topic. In our motivating data for a cohort of kidney transplant recipients, it is of primary interest to assess whether a group of genetic variants are associated with a binary clinical outcome, acute rejection at 6 months. In this article, we study some technical issues and alternative implementations of hypothesis testing in Lasso penalized logistic regression, and compare their performance with each other and with several existing global tests, some of which are specifically designed as variance component tests for high-dimensional data. The most interesting, and perhaps surprising, conclusion of this study is that, for low to moderately high-dimensional data, statistical tests based on Lasso penalized regression are not necessarily more powerful than some existing global tests. In addition, in penalized regression, rather than building a test based on a single selected "best" model, combining multiple tests, each of which is built on a candidate model, might be more promising.

8.
In matched case-crossover studies, it is generally accepted that the covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model. This is because any stratum effect is removed by the conditioning on the fixed number of sets of the case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. However, some matching covariates such as time often play an important role as an effect modifier, leading to incorrect statistical estimation and prediction. Therefore, we propose three approaches to evaluate effect modification by time. The first is a parametric approach, the second is a semiparametric penalized approach, and the third is a semiparametric Bayesian approach. Our parametric approach is a two-stage method, which uses conditional logistic regression in the first stage and then estimates polynomial regression in the second stage. Our semiparametric penalized and Bayesian approaches are one-stage approaches developed by using regression splines. Our semiparametric one-stage approach allows us to not only detect the parametric relationship between the predictor and binary outcomes, but also evaluate nonparametric relationships between the predictor and time. We demonstrate the advantage of our semiparametric one-stage approaches using both a simulation study and an epidemiological example of a 1-4 bi-directional case-crossover study of childhood aseptic meningitis with drinking water turbidity. We also provide statistical inference for the semiparametric Bayesian approach using Bayes Factors. Copyright © 2016 John Wiley & Sons, Ltd.

9.
Penalized regression methods offer an attractive alternative to single marker testing in genetic association analysis. Penalized regression methods shrink down to zero the coefficients of markers that have little apparent effect on the trait of interest, resulting in a parsimonious subset of what we hope are true pertinent predictors. Here we explore the performance of penalization in selecting SNPs as predictors in genetic association studies. The strength of the penalty can be chosen either to select a good predictive model (via methods such as computationally expensive cross validation), through a maximum likelihood-based model selection criterion (such as the BIC), or to select a model that controls for type I error, as done here. We have investigated the performance of several penalized logistic regression approaches, simulating data under a variety of disease locus effect size and linkage disequilibrium patterns. We compared several penalties, including the elastic net, ridge, Lasso, MCP and the normal-exponential-γ shrinkage prior implemented in the hyperlasso software, to standard single locus analysis and simple forward stepwise regression. We examined how markers enter the model as penalties and P-value thresholds are varied, and report the sensitivity and specificity of each of the methods. Results show that penalized methods outperform single marker analysis, with the main difference being that penalized methods allow the simultaneous inclusion of a number of markers, and generally do not allow correlated variables to enter the model, producing a sparse model in which most of the identified explanatory markers are accounted for.

10.
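Objective: To compare the variable selection ability of three penalized logistic regression methods (L1 regularization, L2 regularization, and the elastic net) on SNP data. Methods: Simulated SNP data were generated under different conditions according to preset parameters, and the variable selection ability of the three penalized logistic regressions was evaluated on three measures: correct rate, error rate, and correct index. Results: For the correct rate, L2-regularized penalized logistic regression > elastic net penalized logistic regression > L1-regularized penalized logistic regression; for the error rate, L2-regularized penalized logistic regression > elastic net penalized logistic regression > L1-regularized penalized logistic regression; for the correct index, elastic net penalized logistic regression > L1-regularized penalized logistic regression > L2-regularized penalized logistic regression. Conclusion: Overall, the elastic net shows the best selection ability. By combining the ideas of L1 and L2 regularization, it keeps the model sparse in high-dimensional data analysis, which eases interpretation of the results, and it also solves the problem that correlated predictors cannot enter the model together.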
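For reference, the three penalties compared above can be written as special cases of a single objective. The form below is one common parameterization of elastic net penalized logistic regression, given here as an assumption; the abstract does not state which parameterization or software the authors used. Here y_i is the binary outcome, x_i the vector of SNP predictors, lambda >= 0 the penalty strength, and alpha in [0, 1] the mixing weight.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{n}\sum_{i=1}^{n}
  \Bigl[ y_i\,(\beta_0 + x_i^{\top}\beta)
         - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
  + \lambda \Bigl[ \alpha\,\lVert\beta\rVert_1
         + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2} \Bigr]
\right\}
```

Setting alpha = 1 gives the L1 (Lasso) penalty and alpha = 0 gives the L2 (ridge) penalty, with intermediate values giving the elastic net. The L1 term is what sets coefficients exactly to zero, while the L2 term lets groups of correlated predictors keep nonzero coefficients together.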

11.
12.
We provide a simple and practical, yet flexible, penalized estimation method for a Cox proportional hazards model with current status data. We approximate the baseline cumulative hazard function by monotone B-splines and use a hybrid approach based on the Fisher-scoring algorithm and the isotonic regression to compute the penalized estimates. We show that the penalized estimator of the nonparametric component achieves the optimal rate of convergence under some smooth conditions and that the estimators of the regression parameters are asymptotically normal and efficient. Moreover, a simple variance estimation method is considered for inference on the regression parameters. We perform 2 extensive Monte Carlo studies to evaluate the finite-sample performance of the penalized approach and compare it with the 3 competing R packages: C1.coxph, intcox, and ICsurv. A goodness-of-fit test and model diagnostics are also discussed. The methodology is illustrated with 2 real applications.

13.
Goodness-of-fit tests for ordinal response regression models   Total citations: 1, self-citations: 0, citations by others: 1
It is well documented that the commonly used Pearson chi-square and deviance statistics are not adequate for assessing goodness-of-fit in logistic regression models when continuous covariates are modelled. In recent years, several methods have been proposed which address this shortcoming in the binary logistic regression setting or assess model fit differently. However, these techniques have typically not been extended to the ordinal response setting and few techniques exist to assess model fit in that case. We present the modified Pearson chi-square and deviance tests that are appropriate for assessing goodness-of-fit in ordinal response models when both categorical and continuous covariates are present. The methods have good power to detect omitted interaction terms and reasonable power to detect failure of the proportional odds assumption or modelling the wrong functional form of a continuous covariate. These tests also provide immediate information as to where a model may not fit well. In addition, the methods are simple to understand and implement, and are non-specific. That is, they do not require prespecification of a type of lack-of-fit to detect.

14.
The propensity adjustment is used to reduce bias in treatment effectiveness estimates from observational data. We show here that a mixed-effects implementation of the propensity adjustment can reduce bias in longitudinal studies of non-equivalent comparison groups. The strategy examined here involves two stages. Initially, a mixed-effects ordinal logistic regression model of propensity for treatment intensity includes variables that differentiate subjects who receive various doses of time-varying treatments. Second, a mixed-effects linear regression model compares the effectiveness of those ordinal doses on a continuous outcome over time. Here, a simulation study compares bias reduction that is achieved by implementing this propensity adjustment through various forms of stratification. The simulations demonstrate that bias decreased monotonically as the number of quantiles used for stratification increased from two to five. This was particularly pronounced with stronger effects of the confounding variables. The quartile and quintile strategies typically removed in excess of 80-90 per cent of the bias detected in unadjusted models; whereas a median-split approach removed from 20 to 45 per cent of bias. The approach is illustrated in an evaluation of the effectiveness of somatic treatments for major depression in a longitudinal, observational study of affective disorders.

15.
In many medical studies, researchers widely use composite or long ordinal scores, that is, scores that have a large number of categories and a natural ordering often resulting from the sum of a number of short ordinal scores, to assess function or quality of life. Typically, we analyse these using unjustified assumptions of normality for the outcome measure, which are unlikely to be even approximately true. Scores of this type are better analysed using methods reserved for more conventional (short) ordinal scores, such as the proportional-odds model. We can avoid the need for a large number of cut-point parameters that define the divisions between the score categories for long ordinal scores in the proportional-odds model by the inclusion of orthogonal polynomial contrasts. We introduce the repeated measures proportional-odds logistic regression model and describe for long ordinal outcomes modifications to the generalized estimating equation methodology used for parameter estimation. We introduce data from a trial assessing two surgical interventions, briefly describe and re-analyse these using the new model and compare inferences from the new analysis with previously published results for the primary outcome measure (hip function at 12 months postoperatively). We use a simulation study to illustrate how this model also has more general application for conventional short ordinal scores, to select amongst competing models of varying complexity for the cut-point parameters. Copyright © 2013 John Wiley & Sons, Ltd.

16.
Ordinal data appear in a wide variety of scientific fields. These data are often analyzed using ordinal logistic regression models that assume proportional odds. When this assumption is not met, it may be possible to capture the lack of proportionality using a constrained structural relationship between the odds and the cut-points of the ordinal values. We consider a trend odds version of this constrained model, wherein the odds parameter increases or decreases in a monotonic manner across the cut-points. We demonstrate algebraically and graphically how this model is related to latent logistic, normal, and exponential distributions. In particular, we find that scale changes in these potential latent distributions are consistent with the trend odds assumption, with the logistic and exponential distributions having odds that increase in a linear or nearly linear fashion. We show how to fit this model using SAS Proc NLMIXED and perform simulations under proportional odds and trend odds processes. We find that the added complexity of the trend odds model gives improved power over the proportional odds model when there are moderate to severe departures from proportionality. A hypothetical data set is used to illustrate the interpretation of the trend odds model, and we apply this model to a swine influenza example wherein the proportional odds assumption appears to be violated. Copyright © 2012 John Wiley & Sons, Ltd.

17.
Analyzing data with clumping at zero. An example demonstration   Total citations: 1, self-citations: 0, citations by others: 1
This article demonstrates the use of two approaches to analyzing the relationship of multiple covariates to an outcome which has a high proportion of zero values. One approach is to categorize the continuous outcome (including the zero category) and then fit a proportional odds model. Another approach is to use logistic regression to model the probability of a zero response and ordinary least squares linear regression to model the non-zero continuous responses. The use of these two approaches was demonstrated using outcomes data on hours of care received from the Springfield Elder Project. A crude linear model including both zero and non-zero values was also used for comparison. We conclude that the choice of approaches for analysis depends on the data. If the proportional odds assumption is valid, then the proportional odds model appears to be the method of choice; otherwise, the combination of logistic regression and a linear model is preferable.

18.
Scoring systems are used in nearly all fields of medicine for evaluation of the state of a disease. The prediction performance of scoring systems with respect to an ordinal outcome scale is investigated, based on grouped continuous logistic models as well as on an extension of the stereotype logistic regression model. The latter is a canonical approach, which allows assessment of properties of outcome categories such as partial and total ordering, distinguishability and allocatability. The approach is applied to a data set of patients with injuries of the head.

19.
Multivariate outcomes measured longitudinally over time are common in medicine, public health, psychology and sociology. The typical (saturated) longitudinal multivariate regression model has a separate set of regression coefficients for each outcome. However, multivariate outcomes are often quite similar and many outcomes can be expected to respond similarly to changes in covariate values. Given a set of outcomes likely to share common covariate effects, we propose the clustered outcome common predictor effect model and offer a two-step iterative algorithm to fit the model using available software for univariate longitudinal data. Outcomes that share predictor effects need not be chosen a priori; we propose model selection tools to let the data select outcome clusters. We apply the proposed methods to psychometric data from adolescent children of HIV+ parents. Copyright © 2009 John Wiley & Sons, Ltd.

20.
Regression with an ordered categorical response   Total citations: 1, self-citations: 0, citations by others: 1
A survey on Mseleni joint disease in South Africa involved the scoring of pelvic X-rays of women to measure osteoporosis. The scores were ordinal by construction and ranged from 0 to 12. It is standard practice to use ordinary regression techniques with an ordinal response that has that many categories. We give evidence for these data that the constraints on the response result in a misleading regression analysis. McCullagh's proportional-odds model is designed specifically for the regression analysis of ordinal data. We demonstrate the technique on these data, and show how it fills the gap between ordinary regression and logistic regression (for discrete data with two categories). In addition, we demonstrate non-parametric versions of these models that do not make any linearity assumptions about the regression function.
