Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Longitudinal studies are helpful in understanding how subtle associations between factors of interest change over time. Our goal is to apply statistical methods that are appropriate for analyzing longitudinal data to a repeated measures epidemiological study, as a tutorial in the appropriate use and interpretation of random effects models. To motivate their use, we study the association of alcohol consumption with markers of HIV disease progression in an observational cohort. To make valid inferences, the association among measurements correlated within a subject must be taken into account. We describe a linear mixed effects regression framework that accounts for the clustering of longitudinal data and that can be fit using standard statistical software. We apply the linear mixed effects model to a previously published dataset of HIV infected individuals with a history of alcohol problems who are receiving HAART (n = 197). The researchers were interested in determining the effect of alcohol use on HIV disease progression over time. Fitting a linear mixed effects multiple regression model with a random intercept and random slope for each subject accounts for the association of observations within subjects and yields parameters interpretable as in ordinary multiple regression. A significant interaction between alcohol use and adherence to HAART is found: subjects who use alcohol and are not fully adherent to their HIV medications had higher log RNA (ribonucleic acid) viral load levels than fully adherent non-drinkers, fully adherent alcohol users, and non-drinkers who were not fully adherent. Longitudinal studies are increasingly common in epidemiological research. Software routines that account for correlation between repeated measures using linear mixed effects methods are now generally available and straightforward to use. These models allow the relaxation of assumptions needed for approaches such as repeated measures ANOVA, and should be routinely incorporated into the analysis of cohort studies.
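
For orientation, the kind of model this abstract describes can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $y_{ij}$ is the outcome for subject $i$ at occasion $j$, $t_{ij}$ is time, $\boldsymbol{x}_{ij}$ collects the other covariates, and $(b_{0i}, b_{1i})$ are the subject-specific random intercept and random slope.

```latex
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij}
         + \boldsymbol{x}_{ij}^{\top}\boldsymbol{\gamma} + \varepsilon_{ij},
\qquad
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim N(\mathbf{0}, \mathbf{G}),
\qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
```

Because every observation from subject $i$ shares $b_{0i}$ and $b_{1i}$, measurements within a subject are correlated, while the fixed effects $\beta_0$, $\beta_1$, and $\boldsymbol{\gamma}$ keep their usual multiple regression interpretation.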

2.
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data‐driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed‐effects and mixed‐effects linear and nonlinear models for cross‐sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

3.
Different measures of the proportion of variation in a dependent variable explained by covariates are reported by different standard programs for logistic regression. We review twelve measures that have been suggested or might be useful to measure explained variation in logistic regression models. The definitions and properties of these measures are discussed and their performance is compared in an empirical study. Two of the measures (squared Pearson correlation between the binary outcome and the predictor, and the proportional reduction of squared Pearson residuals by the use of covariates) give almost identical results, agree very well with the multiple $R^2$ of the general linear model, have an intuitively clear interpretation and perform satisfactorily in our study. For all measures the explained variation for the given sample and also the one expected in future samples can be obtained easily. For small samples an adjustment analogous to $R^2_{\mathrm{adj}}$ in the general linear model is suggested. We discuss some aspects of application and recommend the routine use of a suitable measure of explained variation for logistic models.
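
As a rough illustration of the first measure named above, the sketch below computes the squared Pearson correlation between a binary outcome and fitted probabilities. It also computes a sum-of-squares analogue, 1 - SSE/SST, which is an assumption about how a "proportional reduction" measure might be defined and may not match the paper's exact formula. The outcome and probability values are hypothetical.

```python
def squared_pearson_r(y, p_hat):
    """Squared Pearson correlation between outcomes y and fitted probabilities p_hat."""
    n = len(y)
    mean_y = sum(y) / n
    mean_p = sum(p_hat) / n
    s_yp = sum((a - mean_y) * (b - mean_p) for a, b in zip(y, p_hat))
    s_yy = sum((a - mean_y) ** 2 for a in y)
    s_pp = sum((b - mean_p) ** 2 for b in p_hat)
    return s_yp ** 2 / (s_yy * s_pp)


def sum_of_squares_r2(y, p_hat):
    """1 - SSE/SST: reduction in squared error relative to predicting the mean of y."""
    mean_y = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, p_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - sse / sst


# Hypothetical binary outcomes and fitted probabilities from some logistic model.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p_hat = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9]

r2_corr = squared_pearson_r(y, p_hat)
r2_ss = sum_of_squares_r2(y, p_hat)
```

Both functions take the fitted probabilities as given, so they can be applied to the output of any logistic regression routine.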

4.
When studying the association between an exposure and an outcome, it is common to use regression models to adjust for measured confounders. The most common models in epidemiologic research are logistic regression and Cox regression, which estimate conditional (on the confounders) odds ratios and hazard ratios. When the model has been fitted, one can use regression standardization to estimate marginal measures of association. If the measured confounders are sufficient for confounding control, then the marginal association measures can be interpreted as population causal effects. In this paper we describe a new R package, stdReg, that carries out regression standardization with generalized linear models (e.g. logistic regression) and Cox regression models. We illustrate the package with several examples, using real data that are publicly available.
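
The idea of regression standardization can be sketched in a few lines. This is not the stdReg API (stdReg is an R package). It is a minimal Python illustration that assumes a logistic model has already been fitted, with hypothetical coefficients and a single hypothetical confounder (age). The exposure is set to the same level for everyone and the predicted risks are averaged over the observed confounder values.

```python
from math import exp


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-z))


# Hypothetical fitted model: logit P(Y = 1) = intercept + x * exposure + z * confounder.
COEF = {"intercept": -2.0, "x": 0.7, "z": 0.05}


def predicted_risk(x, z, coef=COEF):
    """Conditional risk for exposure x and confounder value z."""
    return expit(coef["intercept"] + coef["x"] * x + coef["z"] * z)


def standardized_risk(x_level, confounders, coef=COEF):
    """Average predicted risk when everyone's exposure is set to x_level."""
    return sum(predicted_risk(x_level, z, coef) for z in confounders) / len(confounders)


ages = [25, 32, 41, 47, 53, 60, 68]  # hypothetical observed confounder values

risk1 = standardized_risk(1, ages)
risk0 = standardized_risk(0, ages)
risk_difference = risk1 - risk0  # marginal risk difference
risk_ratio = risk1 / risk0       # marginal risk ratio
```

The two standardized risks are marginal quantities, so their difference or ratio is a population-level contrast, unlike the conditional odds ratio `exp(COEF["x"])`. A real analysis would also need standard errors, which this sketch omits.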

5.
In cluster‐randomised trials, the problem of non‐independence within clusters is well known, and appropriate statistical analysis documented. Clusters typically seen in cluster trials are large in size and few in number, whereas datasets of preterm infants incorporate clusters of size two (twins), size three (triplets) and so on, with the majority of infants being in ‘clusters’ of size one. In such situations, it is unclear whether adjustment for clustering is needed or even possible. In this paper, we compared analyses allowing for clustering (linear mixed model) with analyses ignoring clustering (linear regression). Through simulations based on two real datasets, we explored estimation bias in predictors of a continuous outcome in different size datasets typical of preterm samples, with varying percentages of twins. Overall, the biases for estimated coefficients were similar for linear regression and mixed models, but the standard errors were consistently much less well estimated when using a linear model. Non‐convergence was rare but was observed in approximately 5% of mixed models for samples below 200 and percentage of twins 2% or less. We conclude that in datasets with small clusters, mixed models should be the method of choice irrespective of the percentage of twins. If the mixed model does not converge, a linear regression can be fitted, but standard error will be underestimated, and so type I error may be inflated. Copyright © 2012 John Wiley & Sons, Ltd.

6.
7.
Longitudinal studies aimed at assessing the impact of interventions on disease risk factors often confront several statistical problems. These problems include 1) dependent variables measured by ordered categories, 2) numerous potentially relevant patterns of transition between outcome levels, 3) mixed units of analysis (e.g., assignment by social unit while theorizing in terms of individuals), 4) incomplete randomization, and 5) correlated estimates for successive occasions of longitudinal measurement. Longitudinal data on use of cigarettes, alcohol, and marijuana among adolescents (n = 1,244, complete data) from the Midwestern Prevention Project are used to demonstrate solutions to each of these problems: 1) a proportional odds regression model, 2) conditional logistic models of transitions with interactions between baseline level and intervention effect, 3) a logistic model estimated with linear regression methods on measures aggregated by social unit, 4) conditional and unconditional models of effect magnitude, and 5) a repeated measures logistic regression technique. Panel data fit to the various models yielded the following conclusions concerning intervention effects in the Midwestern Prevention Project: reduction in the prevalence of cigarette users in treatment schools compared with control schools (8% vs. 18% smoked in the last week at one year follow-up), mixed evidence of an effect on marijuana use, and no evidence of an effect on alcohol use.
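
To make the first of these solutions concrete, the sketch below shows how a proportional odds model turns cutpoints and a linear predictor into probabilities for ordered categories. It assumes the parameterization P(Y <= k) = expit(cutpoint_k - linear_predictor), which is one common convention and may differ from the one used in the paper. The cutpoints and the intervention coefficient are hypothetical.

```python
from math import exp


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-z))


def category_probabilities(cutpoints, linear_predictor):
    """Return P(Y = k) for k = 0..K, assuming P(Y <= k) = expit(cutpoint_k - linear_predictor).

    cutpoints must be strictly increasing; K = len(cutpoints).
    """
    cumulative = [expit(c - linear_predictor) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)  # difference of adjacent cumulative probabilities
        previous = value
    return probabilities


cutpoints = [-0.5, 1.0]    # hypothetical thresholds for 3 ordered use levels
beta_intervention = -0.6   # hypothetical coefficient (log odds ratio scale)

control = category_probabilities(cutpoints, 0.0)
treated = category_probabilities(cutpoints, beta_intervention)
```

Under this convention a negative coefficient shifts probability toward the lowest category, and the same coefficient applies at every cutpoint, which is the proportional odds assumption.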

8.
9.
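Objective: To study the application of generalized linear mixed models to the factors influencing the onset of coal workers' pneumoconiosis, and to provide a new method for research on the factors influencing the onset of coal workers' pneumoconiosis and similar diseases. Methods: Data were collected on all dust-exposed miners at the 8 coal mines belonging to a coal mining group. Chi-square tests and logistic regression were run in SAS to study the factors influencing pneumoconiosis onset, and the results were compared with those of a generalized linear mixed model fitted with the SAS NLMIXED procedure. Results: Univariate analysis, logistic regression, and the generalized linear mixed model all showed that length of service, duration of dust exposure, and job type were factors influencing pneumoconiosis onset; workplace, entered as a random effect, was not statistically significant. Conclusion: Because the generalized linear mixed model takes into account differences in workers' workplaces and random effects, its results, although consistent with those of the other methods, are more convincing, and compared with the other methods it is a better approach for studying the factors influencing pneumoconiosis onset.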

10.
In health services research, it is common to encounter semicontinuous data characterized by a point mass at zero followed by a continuous distribution with positive support. These are often analyzed using two-part mixtures that separately model the probability of use to account for the portion of the sample with zero values. Commonly, but not always, the second component models the continuous values conditional on them being positive. Prior work examining whether such two-part models are needed to appropriately draw inference from semicontinuous data compared to standard one-part regression models has found mixed results. However, prior studies have generally used only measures of model fit on a single dataset, leaving a definitive conclusion uncertain. This paper provides a detailed evaluation using simulations of the appropriateness of standard one-part generalized linear models (GLMs) compared to a recently developed marginalized two-part (MTP) model. The MTP model, unlike the one-part GLMs, explicitly accounts for the point mass at zero, yet takes the same form for the marginal mean as the commonly used GLM with log link, making the covariate effects directly comparable. We simulate data scenarios with varying sample sizes and percentages of zeros. One-part GLMs resulted in increased bias, lower than nominal coverage of confidence intervals, and inflated type I error rates, rendering them inappropriate for use with semicontinuous data. Even when distributional assumptions were violated, estimates of covariate effects and type I error rates under the MTP model remained robust.
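
The structure described above can be written out as a short sketch. The notation is assumed here for illustration: $\pi_i$ is the probability of a positive value, $\nu_i$ is the marginal mean, and $\mathbf{z}_i$ and $\mathbf{x}_i$ are covariate vectors. The logit link for the zero part is also an assumption, since the abstract specifies only the log link for the marginal mean.

```latex
\begin{aligned}
\Pr(Y_i > 0) &= \pi_i, &
\operatorname{logit}(\pi_i) &= \mathbf{z}_i^{\top}\boldsymbol{\alpha}, \\
\nu_i = \mathrm{E}[Y_i] &= \pi_i \,\mathrm{E}[Y_i \mid Y_i > 0], &
\log(\nu_i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta}
\end{aligned}
```

Because $\boldsymbol{\beta}$ acts on the marginal mean $\nu_i$ rather than on the mean among positive values, it can be compared directly with the coefficients of a one-part GLM with log link.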

11.
Twin studies have long been recognized for their value in learning about the aetiology of disease and specifically for their potential for separating genetic effects from environmental effects. The recent upsurge of interest in life-course epidemiology and the study of developmental influences on later health has provided a new impetus to study twins as a source of unique insights. Twins are of special interest because they provide naturally matched pairs where the confounding effects of a large number of potentially causal factors (such as maternal nutrition or gestation length) may be removed by comparisons between twins who share them. The traditional tool of epidemiological 'risk factor analysis' is the regression model, but it is not straightforward to transfer standard regression methods to twin data, because the analysis needs to reflect the paired structure of the data, which induces correlation between twins. This paper reviews the use of more specialized regression methods for twin data, based on generalized least squares or linear mixed models, and explains the relationship between these methods and the commonly used approach of analysing within-twin-pair difference values. Methods and issues of interpretation are illustrated using an example from a recent study of the association between birth weight and cord blood erythropoietin. We focus on the analysis of continuous outcome measures but review additional complexities that arise with binary outcomes. We recommend the use of a general model that includes separate regression coefficients for within-twin-pair and between-pair effects, and provide guidelines for the interpretation of estimates obtained under this model.
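
The link between the within-pair coefficient and the analysis of within-pair differences can be illustrated with a small sketch. It uses ordinary least squares rather than the generalized least squares or mixed model fits discussed in the paper, and the data are hypothetical values constructed so that the within-pair slope is 2 and the between-pair slope is 3.

```python
def within_pair_slope(pairs):
    """Regress within-pair outcome differences on within-pair exposure differences (through the origin)."""
    numerator = sum((x1 - x2) * (y1 - y2) for (x1, y1), (x2, y2) in pairs)
    denominator = sum((x1 - x2) ** 2 for (x1, _), (x2, _) in pairs)
    return numerator / denominator


def between_pair_slope(pairs):
    """Least-squares slope of pair-mean outcome on pair-mean exposure."""
    mean_x = [(x1 + x2) / 2 for (x1, _), (x2, _) in pairs]
    mean_y = [(y1 + y2) / 2 for (_, y1), (_, y2) in pairs]
    grand_x = sum(mean_x) / len(pairs)
    grand_y = sum(mean_y) / len(pairs)
    numerator = sum((a - grand_x) * (b - grand_y) for a, b in zip(mean_x, mean_y))
    denominator = sum((a - grand_x) ** 2 for a in mean_x)
    return numerator / denominator


# Hypothetical twin pairs as ((x_twin1, y_twin1), (x_twin2, y_twin2)), built so that
# y = 2 * (x - pair mean of x) + 3 * (pair mean of x).
pairs = [
    ((1.0, 4.0), (3.0, 8.0)),
    ((2.0, 8.0), (6.0, 16.0)),
    ((5.0, 14.5), (4.0, 12.5)),
]

beta_within = within_pair_slope(pairs)
beta_between = between_pair_slope(pairs)
```

With complete pairs, the within-pair slope from the differences equals the coefficient on the deviation from the pair mean in a regression that also includes the pair mean. A pooled regression that ignores pairing mixes the two slopes.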

12.
There is growing international evidence that supportive built environments encourage active travel such as walking. An unsettled question is the role of geographic regions for analyzing the relationship between the built environment and active travel. This paper examines the geographic region question by assessing walking trip models that use two different regions: walking activity spaces and self-defined neighborhoods. We also use two types of built environment metrics, perceived and audit data, and two types of study design, cross-sectional and longitudinal, to assess these regions. We find that the built environment associations with walking are dependent on the type of metric and the type of model. Audit measures summarized within walking activity spaces better explain walking trips compared to audit measures within self-defined neighborhoods. Perceived measures summarized within self-defined neighborhoods have mixed results. Finally, results differ based on study design. This suggests that results may not be comparable among different regions, metrics and designs; researchers need to consider carefully these choices when assessing active travel correlates.

13.
The general linear mixed model provides a useful approach for analysing a wide variety of data structures which practising statisticians often encounter. Two such data structures which can be problematic to analyse are unbalanced repeated measures data and longitudinal data. Owing to recent advances in methods and software, the mixed model analysis is now readily available to data analysts. The model is similar in many respects to ordinary multiple regression, but because it allows correlation between the observations, it requires additional work to specify models and to assess goodness-of-fit. The extra complexity involved is compensated for by the additional flexibility it provides in model fitting. The purpose of this tutorial is to provide readers with a sufficient introduction to the theory to understand the method and a more extensive discussion of model fitting and checking in order to provide guidelines for its use. We provide two detailed case studies, one a clinical trial with repeated measures and dropouts, and one an epidemiological survey with longitudinal follow-up. © 1997 John Wiley & Sons, Ltd.

14.
Comparative trials that report binary outcome data are commonly pooled in systematic reviews and meta‐analyses. This type of data can be presented as a series of 2‐by‐2 tables. The pooled odds ratio is often presented as the outcome of primary interest in the resulting meta‐analysis. We examine the use of 7 models for random‐effects meta‐analyses that have been proposed for this purpose. The first of these models is the conventional one that uses normal within‐study approximations and a 2‐stage approach. The other models are generalised linear mixed models that perform the analysis in 1 stage and have the potential to provide more accurate inference. We explore the implications of using these 7 models in the context of a Cochrane Review, and we also perform a simulation study. We conclude that generalised linear mixed models can result in better statistical inference than the conventional 2‐stage approach but also that this type of model presents issues and difficulties. These challenges include more demanding numerical methods and determining the best way to model study specific baseline risks. One possible approach for analysts is to specify a primary model prior to performing the systematic review but also to present the results using other models in a sensitivity analysis. Only one of the models that we investigate is found to perform poorly so that any of the other models could be considered for either the primary or the sensitivity analysis.
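
A minimal sketch of the conventional 2-stage approach may help fix ideas. Stage 1 reduces each 2-by-2 table to a log odds ratio with an approximate variance. Stage 2 pools these under a random-effects model. The DerSimonian-Laird moment estimator of the between-study variance is used here as an assumption, since the abstract does not name the estimator. The tables are hypothetical, and no continuity correction is applied, so tables with zero cells are not handled.

```python
from math import exp, log, sqrt


def log_odds_ratio(a, b, c, d):
    """Stage 1: log odds ratio and its approximate variance for one 2-by-2 table.

    a, b = events and non-events in the treatment arm; c, d = the same in the control arm.
    """
    return log((a * d) / (b * c)), 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d


def dersimonian_laird(estimates, variances):
    """Stage 2: random-effects pooling (needs at least 2 studies)."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / total_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    k = len(estimates)
    scale = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - (k - 1)) / scale)  # between-study variance, truncated at zero
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    standard_error = sqrt(1.0 / sum(re_weights))
    return pooled, standard_error, tau2


# Hypothetical trials: (events_trt, non_events_trt, events_ctl, non_events_ctl).
tables = [(12, 88, 20, 80), (5, 45, 9, 41), (30, 170, 33, 167), (8, 92, 19, 81)]

stage1 = [log_odds_ratio(*table) for table in tables]
pooled_log_or, se, tau2 = dersimonian_laird([e for e, _ in stage1], [v for _, v in stage1])
pooled_or = exp(pooled_log_or)
ci_95 = (exp(pooled_log_or - 1.96 * se), exp(pooled_log_or + 1.96 * se))
```

The normal approximation in stage 1 is what the 1-stage generalised linear mixed models avoid, by modelling the binomial counts directly.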

15.
This paper is concerned with regression models for correlated mixed discrete and continuous outcomes constructed using copulas. Our approach entails specifying marginal regression models for the outcomes, and combining them via a copula to form a joint model. Specifically, we propose marginal regression models (e.g. generalized linear models) to link the outcomes' marginal means to covariates. To account for associations between outcomes, we adopt the Gaussian copula to indirectly specify their joint distributions. Our approach has two advantages over current methods: one, regression parameters in models for both outcomes are marginally meaningful, and two, the association is 'margin-free', in the sense that it is characterized by the copula alone. By assuming a latent variable framework to describe discrete outcomes, the copula used still uniquely determines the joint distribution. In addition, association measures between outcomes can be interpreted in the usual way. We report results of simulations concerning the bias and efficiency of two likelihood-based estimation methods for the model. Finally, we illustrate the model using data on burn injuries.

16.
The linear mixed effects model with normal errors is a popular model for the analysis of repeated measures and longitudinal data. The generalized linear model is useful for data that have non-normal errors but where the errors are uncorrelated. A descendant of these two models generates a model for correlated data with non-normal errors, called the generalized linear mixed model (GLMM). Frequentist attempts to fit these models generally rely on approximate results and inference relies on asymptotic assumptions. Recent advances in computing technology have made Bayesian approaches to this class of models computationally feasible. Markov chain Monte Carlo methods can be used to obtain ‘exact’ inference for these models, as demonstrated by Zeger and Karim. In the linear or generalized linear mixed model, the random effects are typically taken to have a fully parametric distribution, such as the normal distribution. In this paper, we extend the GLMM by allowing the random effects to have a non-parametric prior distribution. We do this using a Dirichlet process prior for the general distribution of the random effects. The approach easily extends to more general population models. We perform computations for the models using the Gibbs sampler. © 1998 John Wiley & Sons, Ltd.

17.
Contraception, 2008, 77(6): 425-431
Background: Women using injectable progestin contraceptives (IPCs) have lower bone mineral density than nonusers. We assessed whether bone loss is completely reversible after cessation of IPC use, whether different IPCs have different effects and whether effects vary by age at first use. Study Design: In a cross-sectional study in Cape Town, South Africa, 3487 premenopausal black and mixed race women aged 18–44 years were interviewed for information on contraceptive history and risk factors for decreased bone mineral density, and ultrasound measurements of the left calcaneus were taken. Adjusted means of the ultrasound measures for categories of IPC use were obtained using multivariable linear regression. Results: Current users of IPCs had the lowest ultrasound measures, while the measures of women who had ceased IPC use at least 2–3 years previously were similar to or greater than those of never users of IPCs. The effects of depot medroxyprogesterone acetate and norethisterone enanthate were similar. The calcaneus measures were unrelated to age at which use began after control for confounding factors. Conclusion: The data suggest that bone loss during IPC use is reversible and that this loss of bone is completely recovered several years after cessation of use.

18.
Patient reported outcome and observer evaluative studies in clinical trials and post‐hoc analyses often use instruments that measure responses on ordinal‐rating or Likert scales. We propose a flexible distributional approach by modeling the change scores from the baseline to the end of the study using independent beta distributions. The two shape parameters of the fitted beta distributions are estimated by matching‐moments. Covariates and the interaction terms are included in multivariate beta‐regression analyses under generalized linear mixed models. These methods are illustrated on the treatment satisfaction data in an overactive bladder drug study with four treatment arms. Monte‐Carlo simulations were conducted to compare the Type 1 errors and statistical powers using a beta likelihood ratio test of the proposed method against its fully nonparametric or parametric alternatives. Copyright © 2010 John Wiley & Sons, Ltd.
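
Estimating the two beta shape parameters by matching moments can be sketched as below. The sketch assumes the change scores have already been rescaled to the open interval (0, 1). The abstract does not say how the rescaling is done, so that step is left out, and the example scores are hypothetical.

```python
from statistics import mean, variance


def beta_from_moments(m, v):
    """Shape parameters (alpha, beta) of a beta distribution with mean m and variance v."""
    if not 0.0 < m < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("variance must lie strictly between 0 and m * (1 - m)")
    common = m * (1.0 - m) / v - 1.0  # equals alpha + beta
    return m * common, (1.0 - m) * common


def fit_beta(scores):
    """Match the sample mean and sample variance of scores in (0, 1)."""
    return beta_from_moments(mean(scores), variance(scores))


# Hypothetical change scores for one treatment arm, already rescaled to (0, 1).
scores = [0.55, 0.62, 0.48, 0.71, 0.66, 0.59, 0.52, 0.77]
alpha_hat, beta_hat = fit_beta(scores)
```

The formulas invert the beta mean, alpha / (alpha + beta), and variance, alpha * beta / ((alpha + beta)^2 * (alpha + beta + 1)). The variance check guards against samples that no beta distribution can match.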

19.
Repeated measures data are frequently incomplete, unbalanced and correlated. There has been a great deal of recent interest in mixed effects models for analysing such data. In this paper, we develop bivariate response mixed effects models that are a generalization of linear mixed effects models for a single response variable. We describe their estimation procedures using a Markov chain Monte Carlo method, the Gibbs sampler. We illustrate the methods with analyses of intravenous vitamin D3 administration for secondary hyperparathyroidism in hemodialysis patients. In these data there were two response variables on each individual (PTH and calcium level). This study also suffered from attrition, like many longitudinal studies. While, considering the study design, it was reasonable to assume the drop-out mechanism for the calcium (Ca) level to be ‘missing at random’, the drop-out mechanism for the PTH level was likely to be non-ignorable. We found that the posterior treatment effects for the PTH level by the single response model were underestimated compared with those obtained by the bivariate response model, while there were little differences in the posterior features for the Ca level under both models. © 1997 John Wiley & Sons Ltd.

20.
The terms multivariate and multivariable are often used interchangeably in the public health literature. However, these terms actually represent 2 very distinct types of analyses. We define the 2 types of analysis and assess the prevalence of use of the statistical term multivariate in a 1-year span of articles published in the American Journal of Public Health. Our goal is to make a clear distinction and to identify the nuances that make these types of analyses so distinct from one another.

Most regression models are described in terms of the way the outcome variable is modeled: in linear regression the outcome is continuous, logistic regression has a dichotomous outcome, and survival analysis involves a time to event outcome. Statistically speaking, multivariate analysis refers to statistical models that have 2 or more dependent or outcome variables,[1] and multivariable analysis refers to statistical models in which there are multiple independent or predictor variables.[2]

A multivariable model can be thought of as a model in which multiple variables are found on the right side of the model equation. This type of statistical model can be used to attempt to assess the relationship between a number of variables; one can assess independent relationships while adjusting for potential confounders.

A simple linear regression model has a continuous outcome and one predictor, whereas a multiple or multivariable linear regression model has a continuous outcome and multiple predictors (continuous or categorical). A simple linear regression model would have the form

$$y = \alpha + \beta x + \varepsilon$$

By contrast, a multivariable or multiple linear regression model would take the form

$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$$

where y is a continuous dependent variable, x is a single predictor in the simple regression model, and x1, x2, …, xk are the predictors in the multivariable model.

As is the case with linear models, logistic and proportional hazards regression models can be simple or multivariable. Each of these model structures has a single outcome variable and 1 or more independent or predictor variables.

Multivariate, by contrast, refers to the modeling of data that are often derived from longitudinal studies, wherein an outcome is measured for the same individual at multiple time points (repeated measures), or the modeling of nested/clustered data, wherein there are multiple individuals in each cluster. A multivariate linear regression model would have the form

$$Y = X\beta + \varepsilon$$

where the relationships between multiple dependent variables (i.e., Ys), which are measures of multiple outcomes, and a single set of predictor variables (i.e., Xs) are assessed.

We took a systematic approach to assessing the prevalence of use of the statistical term multivariate. That is, we used PubMed and the keyword “multivariate” to review articles published in the American Journal of Public Health over a 1-year span (December 2010–November 2011). We identified 30 articles in which the authors indicated the use of a “multivariate” statistical method. Each of the articles was individually reviewed to assess the type of analysis defined as multivariate.

In 5 (17%) of the 30 articles, multivariate models (as we have defined them here) were used; 4 (13%) of these models were derived from longitudinal data and 1 from nested data. The remaining 25 (83%) articles involved multivariable analyses; logistic regression (21 of 30, or 70%) was the most prominent type of analysis used, followed by linear regression (3 of 30, or 10%). Interestingly, in 2 of the 30 articles (7%), the terms multivariate and multivariable were used interchangeably. This further underscores the need to establish consistency in use of the 2 statistical terms.

Although some may argue that the interchangeable use of multivariate and multivariable is simply semantics, we believe that differentiating between the 2 terms is important for the field of public health. In general, models used in public health research should be described as simple or multivariable, to indicate the number of predictors, and as linear, logistic, multivariate, or proportional hazards, to indicate the type of outcome (e.g., continuous, dichotomous, repeated measures, time to event).

Our review revealed that there is a need for more accurate application and reporting of multivariable methods. This issue is not unique to public health research and has been identified as affecting other areas of research as well (e.g., medicine, psychology, political science).[3] However, we hope to see a clearer distinction in the use of the terms multivariate and multivariable to describe statistical analyses in future public health literature. This is an important distinction not only to avoid confusion among readers but to more accurately inform the next generation of public health researchers who are seeking to ground their work in the published literature.
