Similar articles
Found 20 similar articles (search time: 0 ms)
1.
The most commonly used models for categorical repeated measurement data are log-linear models. Not only are they easy to fit with standard software but they include such useful models as Markov chains and graphical models. However, these are conditional models and one often also requires the marginal probabilities of responses, for example, at each time point in a longitudinal study. Here a simple method of matrix manipulation is used to derive the maximum likelihood estimates of the marginal probabilities from any such conditional categorical repeated measures model. The technique is applied to the classical Muscatine data set, taking into account the dependence of missingness on previous observed values, as well as serial dependence and a random effect.
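As a rough illustration of how marginal probabilities follow from a conditional model by matrix manipulation, the sketch below propagates an initial distribution through a first-order Markov chain. The two-state chain, the function names, and all numbers are hypothetical and are not taken from the article or the Muscatine data; the article's own model also handles missingness and a random effect, which this sketch ignores.

```python
def step(dist, transition):
    """Advance one time point: row vector `dist` times the transition matrix."""
    n_states = len(transition[0])
    return [
        sum(dist[i] * transition[i][j] for i in range(len(dist)))
        for j in range(n_states)
    ]


def marginals(initial, transition, n_times):
    """Marginal response probabilities at each of `n_times` time points."""
    out = [list(initial)]
    for _ in range(n_times - 1):
        out.append(step(out[-1], transition))
    return out


# Hypothetical values: P(state at time 1) and P(state at t+1 | state at t).
initial = [0.8, 0.2]
transition = [
    [0.9, 0.1],  # conditional probabilities given state 0
    [0.3, 0.7],  # conditional probabilities given state 1
]
margins = marginals(initial, transition, 3)
```

Each row of `margins` is the marginal distribution at one time point, obtained only from the conditional (transition) probabilities and the initial distribution.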

2.
Multiple imputation (MI) has become popular for analyses with missing data in medical research. The standard implementation of MI is based on the assumption of data being missing at random (MAR). However, for missing data generated by missing not at random mechanisms, MI performed assuming MAR might not be satisfactory. For an incomplete variable in a given data set, its corresponding population marginal distribution might also be available in an external data source. We show how this information can be readily utilised in the imputation model to calibrate inference to the population by incorporating an appropriately calculated offset termed the “calibrated-δ adjustment.” We describe the derivation of this offset from the population distribution of the incomplete variable and show how, in applications, it can be used to closely (and often exactly) match the post-imputation distribution to the population level. Through analytic and simulation studies, we show that our proposed calibrated-δ adjustment MI method can give the same inference as standard MI when data are MAR, and can produce more accurate inference under two general missing not at random missingness mechanisms. The method is used to impute missing ethnicity data in a type 2 diabetes prevalence case study using UK primary care electronic health records, where it results in scientifically relevant changes in inference for non-White ethnic groups compared with standard MI. Calibrated-δ adjustment MI represents a pragmatic approach for utilising available population-level information in a sensitivity analysis to explore potential departures from the MAR assumption.

3.
For binary or categorical response models, most goodness‐of‐fit statistics are based on the notion of partitioning the subjects into groups or regions and comparing the observed and predicted responses in these regions by a suitable chi‐squared distribution. Existing strategies create this partition based on the predicted response probabilities, or propensity scores, from the fitted model. In this paper, we follow a retrospective approach, borrowing the notion of balancing scores used in causal inference to inspect the conditional distribution of the predictors, given the propensity scores, in each category of the response to assess model adequacy. We can use this diagnostic under both prospective and retrospective sampling designs, and it may ascertain general forms of misspecification. We first present simple graphical and numerical summaries that can be used in a binary logistic model. We then generalize the tools to propose model diagnostics for the proportional odds model. We illustrate the methods with simulation studies and two data examples: (i) a case‐control study of the association between cumulative lead exposure and Parkinson's disease in the Boston, Massachusetts, area and (ii) a cohort study of biomarkers possibly associated with diabetes, from the VA Normative Aging Study. Copyright © 2013 John Wiley & Sons, Ltd.

4.
5.
Models for the ordered multiple categorical (OMC) response variable have already been extensively established and widely applied, but few studies have investigated linear regression problems with OMC predictors, especially in high-dimensional situations. In such settings, the pseudocategories of the discrete variable and other irrelevant explanatory variables need to be automatically selected. This paper introduces a dummy-variable transformation method for such OMC predictors and proposes an L1 penalty regression method based on the transformation. Model selection consistency of the proposed method is derived under some common assumptions for the high-dimensional situation. Both simulation studies and real data analysis show good performance of this method, suggesting wide applicability in relevant regression analysis.

6.
This paper considers an index of hospital quality performance defined as the ratio of the observed number of deaths to the number predicted by a fitted logistic regression model. We study tests and confidence intervals under two different scenarios depending on the availability of an estimate of the covariance matrix of the coefficients from the fitted logistic regression model. We propose parametric as well as bootstrap-based confidence intervals. We apply the methods to an analysis of the performance of 27 intensive care units.
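The observed-to-predicted ratio itself is simple to compute. The sketch below pairs it with a naive percentile bootstrap that resamples patients and treats the predicted probabilities as fixed; this is only one possible bootstrap scheme and is an assumption here, not necessarily the interval proposed in the article. All data values and function names are hypothetical.

```python
import random


def oe_ratio(deaths, predicted):
    """Observed deaths divided by the sum of model-predicted death probabilities."""
    return sum(deaths) / sum(predicted)


def bootstrap_ci(deaths, predicted, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling patients with replacement.

    Assumption: predicted probabilities are treated as known constants, so
    uncertainty in the fitted regression coefficients is ignored.
    """
    rng = random.Random(seed)
    n = len(deaths)
    ratios = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ratios.append(
            sum(deaths[i] for i in idx) / sum(predicted[i] for i in idx)
        )
    ratios.sort()
    alpha = 1.0 - level
    lo = ratios[int((alpha / 2) * n_boot)]
    hi = ratios[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Hypothetical unit with ten patients: 1 = died, 0 = survived.
deaths = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
predicted = [0.6, 0.1, 0.2, 0.5, 0.1, 0.3, 0.2, 0.4, 0.1, 0.5]

ratio = oe_ratio(deaths, predicted)
lower, upper = bootstrap_ci(deaths, predicted)
```

A ratio above 1 indicates more deaths than the model predicts for that unit's case mix, and a ratio below 1 indicates fewer.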

7.
In constructing predictive models, investigators frequently assess the incremental value of a predictive marker by comparing the ROC curve generated from the predictive model including the new marker with the ROC curve from the model excluding the new marker. Many commentators have noticed empirically that a test of the two ROC areas often produces a non‐significant result when a corresponding Wald test from the underlying regression model is significant. A recent article showed using simulations that the widely used ROC area test produces exceptionally conservative test size and extremely low power. In this article, we demonstrate that both the test statistic and its estimated variance are seriously biased when predictions from nested regression models are used as data inputs for the test, and we examine in detail the reasons for these problems. Although it is possible to create a test reference distribution by resampling that removes these biases, Wald or likelihood ratio tests remain the preferred approach for testing the incremental contribution of a new marker. Copyright © 2012 John Wiley & Sons, Ltd.

8.
This paper proposes a two-process log-linear model for analysis of polychotomous response data generated on study subjects assessed at successive discrete intervals. Response type at each discrete time may be either a transient response or a cause-specific failure. We view outcome of transient response as fundamentally different from outcome of failure, and, in a competing risk framework, we motivate a separate model for each: one to describe the process for transitions to transient response states and the other to describe the process for transitions to absorbing failure states. We maximize the likelihood for each model separately with use of existing software for iterative proportional fitting.

9.
This paper proposes a latent variable regression model for bivariate ordered categorical data and develops the necessary numerical procedure for parameter estimation. The proposed model is an extension of the standard bivariate probit model for dichotomous data to ordered categorical data with more than two categories for each margin. In addition, the proposed model allows for different covariates for the margins, which is characteristic of data from typical ophthalmological studies. It utilizes the stochastic ordering implicit in the data and the correlation coefficient of the bivariate normal distribution in expressing intra-subject dependency. Illustration of the proposed model uses data from the Wisconsin Epidemiologic Study of Diabetic Retinopathy for identifying risk factors for diabetic retinopathy among younger-onset diabetics. The proposed regression model also applies to other clinical or epidemiological studies that involve paired organs.

10.
We develop regression methods for inference on conditional quantiles of time-to-transition in multistate processes. Special cases include survival, recurrent event, semicompeting, and competing risk data. We use an ad hoc representation of the underlying stochastic process, in conjunction with methods for censored quantile regression. In a simulation study, we demonstrate that the proposed approach has a superior finite sample performance over simple methods for censored quantile regression, which naively assume independence between states, and over methods for competing risks, even when the latter are applied to competing risk data settings. We apply our approach to data on hospital-acquired infections in cirrhotic patients, showing a quantile-dependent effect of catheterization on time to infection.

11.
In epidemiology, the risk of disease in terms of a set of covariates is often modelled by logistic regression. The resulting linear predictor can be used to define the extent of risk between extremes, and to calculate an attributable risk for the covariates taken together. As is well known, straightforward use of the linear predictor, on the sample from which it was derived, to estimate the relative and attributable risk yields biased estimates, often seriously so. Use of the jack-knife technique is extended to produce asymptotically unbiased estimates of relative and attributable risks. The asymptotic variances associated with these estimates are derived by using the formulae of conditional variances. They are applied to the results of a case-control study of stomach cancer.

12.
Coupled metal speciation-fate models are an improvement over stand-alone fate-transport models for accurately assessing metal fate and transport. These coupled models estimate fate-controlling partition coefficients using geochemical speciation/complexation models. Commercially available geochemical models are practical options for a two-step, loose coupling with fate-transport models. These models differ in their partitioning estimates because of differences in assumptions, databases, and so on. The present study examines the effects of differences in estimates from geochemical models on estimates of cationic metal fate using two geochemical models: the Windermere humic aqueous model (WHAM) and the minicomputer equilibrium+ model (MINEQL+). The results from each geochemical model were used as input to the fate module of TRANSPEC (a general, coupled metal transport and speciation model). The two versions of the TRANSPEC model were then used to assess the fate of five cationic metals (Cd, Cu, Ni, Pb, and Zn) in Ross Lake (Flin Flon, MB, Canada; alkaline, eutrophic, mine impacted), Kelly Lake (Sudbury, ON, Canada; circumneutral, mesotrophic, mine influenced), and Lake Tantaré (Quebec City, QC, Canada; acidic, oligotrophic, pristine). For relatively soluble metals (Cd, Ni, and Zn), the WHAM and MINEQL+ estimates of speciation/complexation were similar for Ross and Kelly lakes but differed for Lake Tantaré. These differences, however, did not result in significant differences in overall fate estimates. Marked differences were observed between the WHAM and MINEQL+ estimates of partition coefficient, Kd, for more particle-reactive Cu and Pb that translated into the greatest impact on fate in mesotrophic Kelly Lake, in which particle movement is important for fate.

13.
Individual chemical logistic regression models were developed for 37 chemicals of potential concern in contaminated sediments to predict the probability of toxicity, based on the standard 10-d survival test for the marine amphipods Ampelisca abdita and Rhepoxynius abronius. These models were derived from a large database of matching sediment chemistry and toxicity data, which includes contaminant gradients from a variety of habitats in coastal North America. Chemical concentrations corresponding to a 20, 50, and 80% probability of observing sediment toxicity (T20, T50, and T80 values) were calculated to illustrate the potential for deriving application-specific sediment effect concentrations and to provide probability ranges for evaluating the reliability of the models. The individual chemical regression models were combined into a single model, using either the maximum (P(Max) model) or average (P(Avg) model) probability predicted from the chemicals analyzed in a sample, to estimate the probability of toxicity for a sample. The average predicted probability of toxicity (from the P(Max) model) within probability quartiles closely matched the incidence of toxicity within the same ranges, demonstrating the overall reliability of the P(Max) model for the database that was used to derive the model. The magnitude of the toxic effect (decreased survival) in the amphipod test increased as the predicted probability of toxicity increased. Users have a number of options for applying the logistic models, including estimating the probability of observing acute toxicity to estuarine and marine amphipods in 10-d toxicity tests at any given chemical concentration or estimating the chemical concentrations that correspond to specific probabilities of observing sediment toxicity.
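The two uses described above, predicting a probability from a concentration and inverting the model to obtain T20, T50, and T80 values, can be sketched as follows. The abstract does not give the model form, so this assumes a single-chemical logistic model on log10 concentration; the coefficients, concentrations, and function names are hypothetical.

```python
import math


def p_toxic(conc, b0, b1):
    """Predicted probability of toxicity at concentration `conc`.

    Assumed form: logit(p) = b0 + b1 * log10(conc).
    """
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * math.log10(conc))))


def conc_at_probability(p, b0, b1):
    """Invert the model: the concentration at which the predicted probability is p."""
    logit = math.log(p / (1.0 - p))
    return 10 ** ((logit - b0) / b1)


def p_max(probs):
    """Sample-level probability as the maximum over the analyzed chemicals."""
    return max(probs)


def p_avg(probs):
    """Sample-level probability as the average over the analyzed chemicals."""
    return sum(probs) / len(probs)


# Hypothetical coefficients for one chemical.
b0, b1 = -2.0, 1.5
t20, t50, t80 = (conc_at_probability(p, b0, b1) for p in (0.2, 0.5, 0.8))

# Hypothetical sample with three analyzed chemicals: (concentration, b0, b1).
sample = [(10.0, -2.0, 1.5), (3.0, -1.0, 2.0), (50.0, -4.0, 1.8)]
probs = [p_toxic(c, a, b) for c, a, b in sample]
sample_p_max, sample_p_avg = p_max(probs), p_avg(probs)
```

With a positive slope the three effect concentrations are ordered T20 < T50 < T80, and the maximum-based sample probability is never below the average-based one.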

14.
A survey of models for repeated ordered categorical response data   (total citations: 1; self-citations: 0; citations by others: 1)
We survey models for analysing repeated observations on an ordered categorical response variable. The models presented are univariate models that permit correlation among repeated measurements. The models describe simultaneously the dependence of marginal response distributions on values of explanatory variables and on the occasion of response. We present models for three transformations of the response distribution: cumulative logits, adjacent-category logits, and the mean for scores assigned to response categories. We discuss three methods for fitting the models: maximum likelihood, weighted least squares, and semi-parametric. Weighted least squares is easily implemented with SAS, as illustrated with a study designed to compare a drug with a placebo for the treatment of insomnia.

15.
Cure models have been applied to analyze clinical trials with cures and age‐at‐onset studies with nonsusceptibility. Lu and Ying (On semiparametric transformation cure model. Biometrika 2004; 91:331‐343. DOI: 10.1093/biomet/91.2.331) developed a general class of semiparametric transformation cure models, which assumes that the failure times of uncured subjects, after an unknown monotone transformation, follow a regression model with homoscedastic residuals. However, it cannot deal with frequently encountered heteroscedasticity, which may result from dispersed ranges of failure time span among uncured subjects' strata. To tackle the phenomenon, this article presents semiparametric heteroscedastic transformation cure models. The cure status and the failure time of an uncured subject are fitted by a logistic regression model and a heteroscedastic transformation model, respectively. Unlike the approach of Lu and Ying, we derive score equations from the full likelihood for estimating the regression parameters in the proposed model. A martingale difference function similar to the one in their proposal is used to estimate the infinite‐dimensional transformation function. Our proposed estimating approach is intuitively applicable and can be conveniently extended to other complicated models when the maximization of the likelihood may be too tedious to be implemented. We conduct simulation studies to validate large‐sample properties of the proposed estimators and to compare with the approach of Lu and Ying via the relative efficiency. The estimating method and the two relevant goodness‐of‐fit graphical procedures are illustrated by using breast cancer data and melanoma data. Copyright © 2016 John Wiley & Sons, Ltd.

16.
When the difference between treatments in a clinical trial is estimated by a difference in means, then it is well known that randomization ensures unbiassed estimation, even if no account is taken of important baseline covariates. However, when the treatment effect is assessed by other summaries, for example by an odds ratio if the outcome is binary, then bias can arise if some covariates are omitted, regardless of the use of randomization for treatment allocation or the size of the trial. We present accurate closed‐form approximations for this asymptotic bias when important normally distributed covariates are omitted from a logistic regression. We compare this approximation with ones in the literature and derive more convenient forms for some of these existing results. The expressions give insight into the form of the bias, which simulations show is usable for distributions other than the normal. The key result applies even when there are additional binary covariates in the model. Copyright © 2015 John Wiley & Sons, Ltd.
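The effect described above can be reproduced exactly in a small population-level calculation. The sketch below uses a balanced binary covariate that is independent of treatment, which is a simplifying assumption for illustration; the article's approximations concern normally distributed covariates. The coefficients and function names are hypothetical.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_odds_ratio(intercept, beta_treat, gamma_cov, p_cov=0.5):
    """Treatment odds ratio after averaging over an omitted binary covariate.

    True model: logit P(Y=1) = intercept + beta_treat*treat + gamma_cov*z,
    with z ~ Bernoulli(p_cov) independent of treatment (as under randomization).
    """
    def marginal_risk(treat):
        return sum(
            weight * expit(intercept + beta_treat * treat + gamma_cov * z)
            for z, weight in ((0, 1.0 - p_cov), (1, p_cov))
        )

    p0, p1 = marginal_risk(0), marginal_risk(1)
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


# Hypothetical coefficients: conditional log odds ratio 1, covariate effect 2.
conditional_or = math.exp(1.0)
marginal_or = marginal_odds_ratio(0.0, 1.0, 2.0)
```

Here `marginal_or` is closer to 1 than `conditional_or` even though the covariate is perfectly balanced across arms, so the discrepancy does not shrink with trial size. Setting the covariate effect to zero makes the two odds ratios coincide.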

17.
18.
We provide a Bayesian analysis of data categorized into two levels of age (younger than 50 years, at least 50 years) and three levels of bone mineral density (normal, osteopenia, osteoporosis) for white females at least 20 years old in the third National Health and Nutrition Examination Survey. For the sample, the age of each individual is known, but some individuals did not have their BMD measured. We use two types of models: In the ignorable non-response models the propensity to respond does not depend on BMD and age of an individual, while in the non-ignorable non-response models it does. These are the baseline models which are used to derive all models for testing. Our non-ignorable non-response models are 'close' to the ignorable non-response models, thereby reducing the effects of the assumptions about non-respondents that cannot be tested in non-response models. We have data from 35 counties, small areas, and therefore our models are hierarchical, a feature that allows a 'borrowing of strength' across the counties, and they provide a substantial reduction in variation. The non-ignorable non-response models are generalizations of the ignorable non-response models, and therefore, the non-ignorable non-response models allow broader inference. The joint posterior density of the parameters for each model is complex, and therefore, we fit each model using Markov chain Monte Carlo methods to obtain samples which are used to make inference about BMD and age. For each county we can estimate the proportion of individuals in each BMD and age cell of the categorical table, and we can assess the relation between BMD and age using the Bayes factor. A sensitivity analysis shows that there are differences (typically small) in inference that permits different levels of association between BMD and age. A simulation study shows that there is not much difference between the baseline ignorable and non-ignorable non-response models.

19.
Flexible regression models with cubic splines   (total citations: 31; self-citations: 0; citations by others: 31)
We describe the use of cubic splines in regression models to represent the relationship between the response variable and a vector of covariates. This simple method can help prevent the problems that result from inappropriate linearity assumptions. We compare restricted cubic spline regression to non-parametric procedures for characterizing the relationship between age and survival in the Stanford Heart Transplant data. We also provide an illustrative example in cancer therapeutics.
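For readers unfamiliar with restricted cubic splines, the sketch below builds the basis columns for one covariate using a common truncated-power parameterization, in which the fitted curve is constrained to be linear beyond the outer knots. The abstract does not state the parameterization used in the article, so this form, the knot locations, and the function names are assumptions.

```python
def cube_plus(u):
    """Truncated cubic: u**3 when u is positive, otherwise 0."""
    return u ** 3 if u > 0 else 0.0


def rcs_basis(x, knots):
    """Restricted cubic spline basis for a scalar x.

    With k knots the basis has k - 1 columns: x itself plus k - 2 nonlinear
    terms, each constructed so that the cubic and quadratic parts cancel
    beyond the last knot.
    """
    t_last, t_prev = knots[-1], knots[-2]
    span = t_last - t_prev
    terms = [x]
    for tj in knots[:-2]:
        terms.append(
            cube_plus(x - tj)
            - cube_plus(x - t_prev) * (t_last - tj) / span
            + cube_plus(x - t_last) * (t_prev - tj) / span
        )
    return terms


# Hypothetical knots for an age covariate.
knots = [20, 40, 60, 80]
row = rcs_basis(50, knots)

# Second difference of a nonlinear term beyond the last knot (zero if linear).
tail = [rcs_basis(x, knots)[1] for x in (90, 100, 110)]
second_difference = tail[0] - 2 * tail[1] + tail[2]
```

The columns returned by `rcs_basis` would be entered into an ordinary regression model in place of the single linear term for the covariate.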

20.
An introduction to multilevel regression models   (total citations: 1; self-citations: 0; citations by others: 1)
Data in health research are frequently structured hierarchically. For example, data may consist of patients nested within physicians, who in turn may be nested in hospitals or geographic regions. Fitting regression models that ignore the hierarchical structure of the data can lead to false inferences being drawn from the data. Implementing a statistical analysis that takes into account the hierarchical structure of the data requires special methodologies. In this paper, we introduce the concept of hierarchically structured data, and present an introduction to hierarchical regression models. We then compare the performance of a traditional regression model with that of a hierarchical regression model on a dataset relating test utilization at the annual health exam with patient and physician characteristics. In comparing the resultant models, we see that false inferences can be drawn by ignoring the structure of the data.
