Quality of life has been increasingly emphasized in public health research in recent years. Typically, quality of life is measured on ordinal scales. In these situations, specific statistical methods are necessary because procedures such as dichotomization, or misspecification of the distribution of the outcome variable, may complicate the inferential process. Ordinal logistic regression models are appropriate in many of these situations. This article presents a review of the proportional odds model, partial proportional odds model, continuation ratio model, and stereotype model. The fit, statistical inference, and comparisons between models are illustrated with data from a study on quality of life in 273 patients with schizophrenia. All tested models showed good fit, but the proportional odds or partial proportional odds models proved to be the best choice due to the nature of the data and ease of interpretation of the results. Ordinal logistic models perform differently depending on categorization of outcome, adequacy in relation to assumptions, goodness-of-fit, and parsimony.
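
As an illustration of the first model named above, the following is a minimal sketch of how a proportional odds model turns a linear predictor into category probabilities. It assumes the common parameterization logit P(Y <= j | x) = alpha_j - beta'x; the sign convention differs between software packages, and the function name `proportional_odds_probs` and its arguments are hypothetical, not taken from the article.

```python
import math


def proportional_odds_probs(cutpoints, beta, x):
    """Category probabilities under a proportional odds model.

    Assumed parameterization (sign conventions vary by package):
        logit P(Y <= j | x) = cutpoints[j] - sum(beta * x)
    `cutpoints` must be increasing; there is one more category than cutpoints.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities P(Y <= j); the last category always reaches 1.
    cumulative = [1.0 / (1.0 + math.exp(-(a - eta))) for a in cutpoints] + [1.0]
    # Successive differences of the cumulative curve give P(Y = j).
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


# Three ordered categories, one covariate with a positive coefficient.
print(proportional_odds_probs([-1.0, 1.0], [0.8], [0.0]))
print(proportional_odds_probs([-1.0, 1.0], [0.8], [1.5]))
```

Because the same `beta` shifts every cumulative logit, a larger covariate value moves probability mass toward the higher categories under this sign convention; a partial proportional odds model would relax this by letting some coefficients depend on the cutpoint.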

Ordinal regression models for epidemiologic data (total citations: 7; self-citations: 0; citations by others: 7)
Health status is often measured in epidemiologic studies on an ordinal scale, but data of this type are generally reduced for analysis to a single dichotomy. Several statistical models have been developed to make full use of information in ordinal response data, but have not been much used in analyzing epidemiologic studies. The authors discuss two of these statistical models: the cumulative odds model and the continuation ratio model. They may be interpreted in terms of odds ratios, can account for confounding variables, have clear and testable assumptions, and have parameters that may be estimated and hypotheses that may be tested using available statistical packages. However, calculations of asymptotic relative efficiency and results of simulations showed that simple logistic regression applied to dichotomized responses can in some realistic situations have more than 75% of the efficiency of ordinal regression models, but only if the ordinal scale is collapsed into a dichotomy close to the optimal point. The application of the proposed models to data from a study of chest x-rays of workers exposed to mineral fibers confirmed that they are easy to use and interpret, but gave results quite similar to those obtained using simple logistic regression after dichotomizing outcome in the conventional way.

BACKGROUND AND OBJECTIVES: The performance of a prediction model is usually worse in external validation data compared to the development data. We aimed to determine at which effective sample sizes (i.e., number of events) relevant differences in model performance can be detected with adequate power. METHODS: We used a logistic regression model to predict the probability that residual masses of patients treated for metastatic testicular cancer contained only benign tissue. We performed standard power calculations and Monte Carlo simulations to estimate the numbers of events that are required to detect several types of model invalidity with 80% power at the 5% significance level. RESULTS: A validation sample with 111 events was required to detect that a model predicted probabilities that were too high, when predictions were on average 1.5 times too high on the odds scale. A decrease in discriminative ability of the model, indicated by a decrease in the c-statistic from 0.83 to 0.73, required 81 to 106 events, depending on the specific scenario. CONCLUSION: We suggest a minimum of 100 events and 100 nonevents for external validation samples. Specific hypotheses may, however, require substantially higher effective sample sizes to obtain adequate power.

In an observational study focussed on association between a health outcome and numerous explanatory variables, the question of interactions can be problematic. Commonly, logistic regression of the outcome on the explanatory variables might be employed. Such modelling often includes an attempt to select some pairwise product interaction terms, from amongst the many such possible pairs. For several reasons, however, this can be unsatisfying. Here we consider a different approach based on a parsimonious extension of a logistic regression model without interaction terms. This extension permits an overall synergism or antagonism in how the explanatory variables combine to associate with the outcome, without any attempt to identify specific variables which give rise to interactive behaviour. We call this diffuse interaction. We elucidate some simple properties of the diffuse interaction model, and give an example of its application to epidemiological data. We also consider asymptotic behaviour in a restricted case of the model, to gain some insight into how well this kind of interaction can be detected from data.

Armstrong and Sloan have reviewed two types of ordinal logistic models for epidemiologic data: the cumulative-odds model and the continuation-ratio model. I review here certain aspects of these models not emphasized previously, and describe a third type, the stereotype model, which in certain situations offers greater flexibility coupled with interpretational advantages. I illustrate the models in an analysis of pneumoconiosis among coal miners.

Validation techniques for logistic regression models (total citations: 4; self-citations: 0; citations by others: 4)
This paper presents a comprehensive approach to the validation of logistic prediction models. It reviews measures of overall goodness-of-fit, and indices of calibration and refinement. Using a model-based approach developed by Cox, we adapt logistic regression diagnostic techniques for use in model validation. This allows identification of problematic predictor variables in the prediction model as well as influential observations in the validation data that adversely affect the fit of the model. In appropriate situations, recommendations are made for correction of models that provide poor fit.

The focus of this paper is the application of statistical models to the study of socioeconomic conditioning factors in perinatal Chagas' disease conducted in Rosario, Argentina. A case-control design (154 cases, 158 controls) was applied to investigate socioeconomic and cultural differences among pregnant women at Hospital Roque Sáenz Peña according to their infection status. Logistic regression models were used to evaluate the importance of antecedents linked to the infection and socioeconomic and cultural factors for infection status. For pregnant women, the importance of antecedents linked to the infection was confirmed and the women's level of schooling stood out as the predominant socioeconomic condition associated with infection. Log-linear models were used to explore the associations between certain explanatory variables. This approach highlighted the most relevant associations between such factors and Chagas' disease and provided a better understanding of the framework of relationships among them.

A mixed-effects multinomial logistic regression model is described for analysis of clustered or longitudinal nominal or ordinal response data. The model is parameterized to allow flexibility in the choice of contrasts used to represent comparisons across the response categories. Estimation is achieved using a maximum marginal likelihood (MML) solution that uses quadrature to numerically integrate over the distribution of random effects. An analysis of a psychiatric data set, in which homeless adults with serious mental illness are repeatedly classified in terms of their living arrangement, is used to illustrate features of the model.

Su X. Statistics in Medicine, 2007, 26(10): 2154-2169
A tree procedure is proposed to check the adequacy of a fitted logistic regression model. The proposed method not only makes natural assessment for the logistic model, but also provides clues to amend its lack-of-fit. The resulting tree-augmented logistic model facilitates a refined model with meaningful interpretation. We demonstrate its use via simulation studies and an application to the Pima Indians diabetes data.

For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination.
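
The equivalent two-sample construction described above can be sketched for the logistic case as follows. This is one reading of the abstract, not the authors' code: it assumes the "parameters" are the log odds in the two groups, that "overall expected number of events unchanged" means the two group probabilities average to the overall response probability, and that the "usual formula" is the standard normal-approximation sample size for comparing two proportions. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def equivalent_two_sample(slope, sd_x, p_bar):
    """Group probabilities whose log odds differ by 2 * slope * sd_x
    and whose average equals the overall response probability p_bar."""
    delta = 2.0 * slope * sd_x
    lo, hi = -30.0, 30.0
    # Bisection on the midpoint of the two log odds; the average
    # probability is increasing in that midpoint.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        avg = 0.5 * (expit(mid - delta / 2) + expit(mid + delta / 2))
        if avg < p_bar:
            lo = mid
        else:
            hi = mid
    centre = 0.5 * (lo + hi)
    return expit(centre - delta / 2), expit(centre + delta / 2)


def total_sample_size(slope, sd_x, p_bar, alpha=0.05, power=0.80):
    """Total n from the standard two-proportion formula (assumed choice).
    Requires slope != 0, otherwise the two groups do not differ."""
    p1, p2 = equivalent_two_sample(slope, sd_x, p_bar)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    n_per_group = numerator / (p1 - p2) ** 2
    return 2 * math.ceil(n_per_group)


p1, p2 = equivalent_two_sample(slope=0.5, sd_x=1.0, p_bar=0.3)
print(p1, p2, total_sample_size(slope=0.5, sd_x=1.0, p_bar=0.3))
```

The bisection step is only there because the constraint on the average probability has no closed-form solution on the log-odds scale; any one-dimensional root finder would serve the same purpose.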

We compare parameter estimates from the proportional hazards model, the cumulative logistic model and a new modified logistic model (referred to as the person-time logistic model), with the use of simulated data sets and with the following quantities varied: disease incidence, risk factor strength, length of follow-up, the proportion censored, non-proportional hazards, and sample size. Parameter estimates from the person-time logistic regression model closely approximated those from the Cox model when the survival time distribution was close to exponential, but could differ substantially in other situations. We found parameter estimates from the cumulative logistic model similar to those from the Cox and person-time logistic models when the disease was rare, the risk factor moderate, and censoring rates similar across the covariates. We also compare the models with analysis of a real data set that involves the relationship of age, race, sex, blood pressure, and smoking to subsequent mortality. In this example, the length of follow-up among survivors varied from 5 to 14 years and the Cox and person-time logistic approaches gave nearly identical results. The cumulative logistic results had somewhat larger p-values but were substantively similar for all but one coefficient (the age-race interaction). The latter difference reflects differential censoring rates by age, race and sex.

Correlation is inherent in longitudinal studies due to the repeated measurements on subjects, as well as due to time-dependent covariates in the study. In the National Longitudinal Study of Adolescent to Adult Health (Add Health), data were repeatedly collected on children in grades 7-12 across four waves. Thus, observations obtained on the same adolescent were correlated, while predictors were correlated with current and future outcomes such as obesity status, among other health issues. Previous methods, such as the generalized method of moments (GMM) approach have been proposed to estimate regression coefficients for time-dependent covariates. However, these approaches combined all valid moment conditions to produce an averaged parameter estimate for each covariate and thus assumed that the effect of each covariate on the response was constant across time. This assumption is not necessarily optimal in applications such as Add Health or health-related data. Thus, we depart from this assumption and instead use the Partitioned GMM approach to estimate multiple coefficients for the data based on different time periods. These extra regression coefficients are obtained using a partitioning of the moment conditions pertaining to each respective relationship. This approach offers a deeper understanding and appreciation into the effect of each covariate on the response. We conduct simulation studies, as well as analyses of obesity in Add Health, rehospitalization in Medicare data, and depression scores in a clinical study. The Partitioned GMM methods exhibit benefits over previously proposed models with improved insight into the nonconstant relationships realized when analyzing longitudinal data.

We examine the properties of several tests for goodness-of-fit for multinomial logistic regression. One test is based on a strategy of sorting the observations according to the complement of the estimated probability for the reference outcome category and then grouping the subjects into g equal-sized groups. A g × c contingency table, where c is the number of values of the outcome variable, is constructed. The test statistic, denoted as Cg, is obtained by calculating the Pearson χ² statistic where the estimated expected frequencies are the sum of the model-based estimated logistic probabilities. Simulations compare the properties of Cg with those of the ungrouped Pearson χ² test (X²) and its normalized test (z). The null distribution of Cg is well approximated by the χ² distribution with (g − 2) × (c − 1) degrees of freedom. The sampling distribution of X² is compared with a χ² distribution with n × (c − 1) degrees of freedom but shows erratic behavior. With a few exceptions, the sampling distribution of z adheres reasonably well to the standard normal distribution. Power simulations show that Cg has low power for a sample of 100 observations, but satisfactory power for a sample of 400. The tests are illustrated using data from a study of cytological criteria for the diagnosis of breast tumors.
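
The construction of Cg described above (sort, group, tabulate, Pearson statistic) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function name `cg_statistic`, the input layout, and the way groups are cut by index when n is not divisible by g are all assumptions, and the fitted probabilities are taken as given.

```python
def cg_statistic(probs, outcomes, g=10, ref=0):
    """Grouped Pearson goodness-of-fit statistic for a multinomial logistic fit.

    probs    : list of rows, each holding the c estimated category probabilities
    outcomes : observed category index (0 .. c-1) for each observation
    g        : number of groups
    ref      : index of the reference outcome category
    Returns (statistic, degrees_of_freedom) with df = (g - 2) * (c - 1).
    """
    n = len(probs)
    c = len(probs[0])
    # Sort by the complement of the estimated reference-category probability.
    order = sorted(range(n), key=lambda i: 1.0 - probs[i][ref])
    statistic = 0.0
    for k in range(g):
        # Index-based cut into g (near-)equal-sized groups; an assumed rule.
        members = order[k * n // g:(k + 1) * n // g]
        for j in range(c):
            observed = sum(1 for i in members if outcomes[i] == j)
            # Expected count = sum of model-based probabilities in the cell.
            expected = sum(probs[i][j] for i in members)
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic, (g - 2) * (c - 1)


# Toy input: six observations, two categories, three groups of two.
toy_probs = [[0.5, 0.5]] * 6
print(cg_statistic(toy_probs, [0, 1, 0, 1, 0, 1], g=3))
print(cg_statistic(toy_probs, [0, 0, 0, 0, 0, 0], g=3))
```

A p-value would follow by referring the statistic to a χ² distribution with the returned degrees of freedom, for which a statistics library would normally be used.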

Logistic regression models are widely used in medicine for predicting patient outcome (prognosis) and constructing diagnostic tests (diagnosis). Multivariable logistic models yield an (approximately) continuous risk score, a transformation of which gives the estimated event probability for an individual. A key aspect of model performance is discrimination, that is, the model's ability to distinguish between patients who have (or will have) an event of interest and those who do not (or will not). Graphical aids are important in understanding a logistic model. The receiver‐operating characteristic (ROC) curve is familiar, but not necessarily easy to interpret. We advocate a simple graphic that provides further insight into discrimination, namely a histogram or dot plot of the risk score in the outcome groups. The most popular performance measure for the logistic model is the c‐index, numerically equivalent to the area under the ROC curve. We discuss the comparative merits of the c‐index and the (standardized) mean difference in risk score between the outcome groups. The latter statistic, sometimes known generically as the effect size, has been computed in slightly different ways by several different authors, including Glass, Cohen and Hedges. An alternative measure is the overlap between the distributions in the outcome groups, defined as the area under the minimum of the two density functions. The larger the overlap, the weaker the discrimination. Under certain assumptions about the distribution of the risk score, the c‐index, effect size and overlap are functionally related. We illustrate the ideas with simulated and real data sets. Copyright © 2010 John Wiley & Sons, Ltd.
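
The three discrimination measures discussed above can be sketched in a few lines. The abstract does not state which distributional assumptions link them; the sketch below assumes the risk score is normal with equal variance in the two outcome groups, in which case c = Φ(d/√2) and overlap = 2Φ(−|d|/2), where d is the standardized mean difference. The effect size is computed in the pooled-variance (Cohen) form, and all function names are hypothetical.

```python
import math
from statistics import mean, variance


def c_index(scores_event, scores_nonevent):
    """Share of (event, non-event) pairs in which the event case has the
    higher risk score; ties count one half. Equals the area under the ROC curve."""
    concordant = 0.0
    for e in scores_event:
        for n in scores_nonevent:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(scores_event) * len(scores_nonevent))


def effect_size(scores_event, scores_nonevent):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n0 = len(scores_event), len(scores_nonevent)
    pooled_var = ((n1 - 1) * variance(scores_event)
                  + (n0 - 1) * variance(scores_nonevent)) / (n1 + n0 - 2)
    return (mean(scores_event) - mean(scores_nonevent)) / math.sqrt(pooled_var)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def c_from_effect_size(d):
    """c-index implied by d under the assumed equal-variance normal model."""
    return normal_cdf(d / math.sqrt(2.0))


def overlap_from_effect_size(d):
    """Area under the minimum of the two densities, same assumed model."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)


events, nonevents = [3.0, 4.0, 5.0], [1.0, 2.0, 3.0]
d = effect_size(events, nonevents)
print(c_index(events, nonevents), d)
print(c_from_effect_size(d), overlap_from_effect_size(d))
```

With real data the empirical c-index and the value implied by d will differ to the extent that the normal, equal-variance assumption fails, which is one reason to look at the histogram or dot plot of the risk score in each group.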

Although a wide variety of change-point models are available for continuous outcomes, few models are available for dichotomous outcomes. This paper introduces transition methods for logistic regression models in which the dose-response relationship follows two different straight lines, which may intersect or may present a jump at an unknown change-point. In these models, the logit includes a differentiable transition function that provides parametric control of the sharpness of the transition at the change-point, allowing for abrupt changes or more gradual transitions between the two different linear trends, as well as for estimation of the location of the change-point. Linear-linear logistic models are particular cases of the proposed transition models. We present a modified iteratively reweighted least squares algorithm to estimate model parameters, and we provide inference procedures including a test for the existence of the change-point. These transition models are explored in a simulation study, and they are used to evaluate the existence of a change-point in the association between plasma glucose after an oral glucose tolerance test and mortality using data from the Mortality Follow-up of the Second National Health and Nutrition Examination Survey.
