1.

Double-robust estimation of an exposure-outcome odds ratio adjusting for confounding in cohort and case-control studies

Tchetgen Tchetgen EJ Rotnitzky A 《Statistics in medicine》2011,30(4):335-347

Modern epidemiologic studies often aim to evaluate the causal effect of a point exposure on the risk of a disease from cohort or case-control observational data. Because confounding bias is of serious concern in such non-experimental studies, investigators routinely adjust for a large number of potential confounders in a logistic regression analysis of the effect of exposure on disease outcome. Unfortunately, when confounders are not correctly modeled, standard logistic regression is likely biased in its estimate of the effect of exposure, potentially leading to erroneous conclusions. We partially resolve this serious limitation of standard logistic regression analysis with a new iterative approach that we call ProRetroSpective estimation, which carefully combines standard logistic regression with a logistic regression analysis in which exposure is the dependent variable and the outcome and confounders are the independent variables. As a result, we obtain a correct estimate of the exposure-outcome odds ratio if either the standard logistic regression of the outcome given exposure and confounding factors is correct, or the regression model of exposure given the outcome and confounding factors is correct, but not necessarily both; that is, the estimator is double-robust. In fact, it also has certain advantageous efficiency properties. The approach is general in that it applies to both cohort and case-control studies whether the design of the study is matched or unmatched on a subset of covariates. Finally, an application illustrates the methods using data from the National Cancer Institute's Black/White Cancer Survival Study.

2.

Doubly robust conditional logistic regression

Johan Zetterqvist Karel Vermeulen Stijn Vansteelandt Arvid Sjölander 《Statistics in medicine》2019,38(23):4749-4760

Epidemiologic research often aims to estimate the association between a binary exposure and a binary outcome, while adjusting for a set of covariates (eg, confounders). When data are clustered, as in, for instance, matched case-control studies and co-twin-control studies, it is common to use conditional logistic regression. In this model, all cluster-constant covariates are absorbed into a cluster-specific intercept, whereas cluster-varying covariates are adjusted for by explicitly adding these as explanatory variables to the model. In this paper, we propose a doubly robust estimator of the exposure-outcome odds ratio in conditional logistic regression models. This estimator protects against bias in the odds ratio estimator due to misspecification of the part of the model that contains the cluster-varying covariates. The doubly robust estimator uses two conditional logistic regression models for the odds ratio, one prospective and one retrospective, and is consistent for the exposure-outcome odds ratio if at least one of these models is correctly specified, not necessarily both. We demonstrate the properties of the proposed method by simulations and by re-analyzing a publicly available dataset from a matched case-control study on induced abortion and infertility.

3.

Evaluating haplotype effects in case‐control studies via penalized‐likelihood approaches: prospective or retrospective analysis?

Koehler ML Bondell HD Tzeng JY 《Genetic epidemiology》2010,34(8):892-911

Penalized likelihood methods have become increasingly popular in recent years for evaluating haplotype-phenotype association in case-control studies. Although a retrospective likelihood is dictated by the sampling scheme, these penalized methods are typically built on prospective likelihoods due to their modeling simplicity and computational feasibility. It has been well documented that for unpenalized methods, prospective analyses of case-control data can be valid but less efficient than their retrospective counterparts when testing for association, and result in substantial bias when estimating the haplotype effects. For penalized methods, which combine effect estimation and testing in one step, the impact of using a prospective likelihood is not clear. In this work, we examine the consequences of ignoring the sampling scheme for haplotype-based penalized likelihood methods. Our results suggest that the impact of prospective analyses depends on (1) the underlying genetic mode and (2) the genetic model adopted in the analysis. When the correct genetic model is used, the difference between the two analyses is negligible for additive and slight for dominant haplotype effects. For recessive haplotype effects, the more appropriate retrospective likelihood clearly outperforms the prospective likelihood. If an additive model is incorrectly used, as the true underlying genetic mode is unknown a priori, both retrospective and prospective penalized methods suffer from a sizeable power loss and increase in bias. The impact of using the incorrect genetic model is much bigger on retrospective analyses than prospective analyses, and results in comparable performances for both methods. An application of these methods to the Genetic Analysis Workshop 15 rheumatoid arthritis data is provided.

4.

Adjusting risk factors in spontaneous abortion by multiple logistic regression

V. Dominguez E. Calls P. Ortega P. Astasio J. Valero De Bernabè J. Rey Calero 《European journal of epidemiology》1991,7(2):171-174

A cross-sectional case-control study was performed to identify some obstetric and gynaecologic factors that can influence spontaneous abortion. Statistical and epidemiologic analyses were done by multiple logistic regression to adjust the OR through the coefficient. A dichotomized outcome variable, representing spontaneous abortion, and different independent variables, representing distinct medical factors, were designed. The analysis was carried out with a personal computer and an appropriate statistical package. The variables representing age over 35 and previous spontaneous abortions were shown to be risk factors, adjusted for the rest of the variables. The variables representing parity and late menarcheal age lost significance when they were adjusted with multiple logistic regression.
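
As a rough illustration of "adjusting the OR through the coefficient", the sketch below converts a logistic regression coefficient and its standard error into an adjusted odds ratio with a Wald confidence interval. The function name `odds_ratio_ci` and the example numbers are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(coef, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    coef: estimated log odds ratio from the fitted model
    se:   its standard error
    z:    normal quantile for the interval (1.96 gives about 95%)
    """
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


# Hypothetical coefficient and standard error, for illustration only.
or_hat, lower, upper = odds_ratio_ci(0.8, 0.3)
print(f"adjusted OR = {or_hat:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A variable "loses significance" after adjustment when the interval computed from its adjusted coefficient includes 1.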

5.

Statistical methods in epidemiology: a comparison of statistical methods to analyze dose-response and trend analysis in epidemiologic studies

Boucher KM Slattery ML Berry TD Quesenberry C Anderson K 《Journal of clinical epidemiology》1998,51(12):1223-1233

Statistical methods for accurately describing associations between exposures and disease are constantly being explored. Spline regression has been suggested as an alternative to using categorized variables in studies of disease etiology, as it uses all data points to estimate the shape of the association between a given exposure and disease outcome. It has been proposed that this method is especially beneficial when associations are concentrated in a small range of the overall distribution of the exposure. In this study, we use data from a large case-control study of colon cancer to evaluate associations obtained from logistic regression models that use spline regression for main exposure and confounder effects with those that use categorized variables for main exposure. Our results show that for variables for which the association appears to be linear, such as body size and dietary intake of calcium, fiber, and cholesterol, associations are similar when estimates are generated from spline or categorized variable models. For other variables, such as total energy intake, for which associations appear to be strongest in the upper end of the distribution, estimates of association appear to be conservative when using categorized variables. The data also suggest that selection of cut points for the categorized variables may have an impact on the associations observed. Spline regression appears to be useful to estimate the shape of the association between a given exposure and disease and may provide guidance as to the appropriateness of using categorized variables. The risk estimates from spline regression appear to be similar to those from traditional categorical methods. When effects are large or rapidly changing, spline models may more appropriately describe the association.

6.

Dose‐response analyses using restricted cubic spline functions in public health research

Loic Desquilbet François Mariotti 《Statistics in medicine》2010,29(9):1037-1057

Taking into account a continuous exposure in regression models by using categorization, when non‐linear dose‐response associations are expected, has been widely criticized. As one alternative, restricted cubic spline (RCS) functions are powerful tools (i) to characterize a dose‐response association between a continuous exposure and an outcome, (ii) to visually and/or statistically check the assumption of linearity of the association, and (iii) to minimize residual confounding when adjusting for a continuous exposure. Because their implementation with SAS® software is limited, we developed and present here an SAS macro that (i) creates an RCS function of continuous exposures, (ii) displays graphs showing the dose‐response association with 95 per cent confidence interval between one main continuous exposure and an outcome when performing linear, logistic, or Cox models, as well as linear and logistic‐generalized estimating equations, and (iii) provides statistical tests for overall and non‐linear associations. We illustrate the SAS macro using the third National Health and Nutrition Examination Survey data to investigate adjusted dose‐response associations (with different models) between calcium intake and bone mineral density (linear regression), folate intake and hyperhomocysteinemia (logistic regression), and serum high‐density lipoprotein cholesterol and cardiovascular mortality (Cox model). Copyright © 2010 John Wiley & Sons, Ltd.
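
To make the idea of an RCS function concrete, here is a minimal Python sketch of one common (unnormalized) restricted cubic spline basis, built from truncated cubic terms so that the curve is linear beyond the boundary knots. It is not the SAS macro described in the abstract; the function name `rcs_basis` and the knot values are assumptions chosen for illustration.

```python
def rcs_basis(x, knots):
    """Return [x, s_1(x), ..., s_{k-2}(x)] for a restricted cubic spline.

    knots: k >= 3 distinct knot locations. The nonlinear terms are zero
    below the first knot and combine to a straight line above the last.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def pos3(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    span = t[-1] - t[-2]  # distance between the last two knots
    terms = [float(x)]
    for j in range(k - 2):
        s = (pos3(x - t[j])
             - pos3(x - t[-2]) * (t[-1] - t[j]) / span
             + pos3(x - t[-1]) * (t[-2] - t[j]) / span)
        terms.append(s)
    return terms


# Hypothetical knots; in practice they are often placed at exposure quantiles.
print(rcs_basis(2.5, [1, 2, 3, 4]))
```

The returned columns would be entered as covariates in a linear, logistic, or Cox model; a joint test of the coefficients on the nonlinear terms is one way to check the linearity assumption mentioned in (ii).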

7.

Covariate adjustment in randomized trials with binary outcomes: targeted maximum likelihood estimation

Moore KL van der Laan MJ 《Statistics in medicine》2009,28(1):39-64

Covariate adjustment using linear models for continuous outcomes in randomized trials has been shown to increase efficiency and power over the unadjusted method in estimating the marginal effect of treatment. However, for binary outcomes, investigators generally rely on the unadjusted estimate as the literature indicates that covariate-adjusted estimates based on the logistic regression models are less efficient. The crucial step that has been missing when adjusting for covariates is that one must integrate/average the adjusted estimate over those covariates in order to obtain the marginal effect. We apply the method of targeted maximum likelihood estimation (tMLE) to obtain estimators for the marginal effect using covariate adjustment for binary outcomes. We show that the covariate adjustment in randomized trials using the logistic regression models can be mapped, by averaging over the covariate(s), to obtain a fully robust and efficient estimator of the marginal effect, which equals a targeted maximum likelihood estimator. This tMLE is obtained by simply adding a clever covariate to a fixed initial regression. We present simulation studies that demonstrate that this tMLE increases efficiency and power over the unadjusted method, particularly for smaller sample sizes, even when the regression model is mis-specified.
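
The averaging step described above can be sketched in a few lines of Python. This shows only the standardization (averaging predicted risks over the observed covariates), not the targeting update with the clever covariate; the function names and coefficient values are hypothetical, and the logistic model is assumed to have already been fitted.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_risks(intercept, beta_treat, beta_cov, covariates):
    """Average predicted risks under treatment and control.

    Assumes a fitted model logit P(Y=1 | A, W) = intercept + beta_treat*A + beta_cov*W
    and averages its predictions over the observed covariate values W.
    """
    n = len(covariates)
    risk1 = sum(expit(intercept + beta_treat + beta_cov * w) for w in covariates) / n
    risk0 = sum(expit(intercept + beta_cov * w) for w in covariates) / n
    return risk1, risk0


# Hypothetical coefficients and covariate values, for illustration only.
risk1, risk0 = marginal_risks(0.0, 1.0, 2.0, [-1.0, 1.0])
marginal_or = (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))
print(f"risk difference = {risk1 - risk0:.3f}, marginal OR = {marginal_or:.3f}")
```

In this toy example the marginal odds ratio is smaller than the conditional odds ratio exp(1.0), which illustrates why the conditional coefficient cannot be read directly as the marginal effect.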

8.

Weighted likelihood, pseudo-likelihood and maximum likelihood methods for logistic regression analysis of two-stage data

Norman E. Breslow Richard Holubkov 《Statistics in medicine》1997,16(1):103-116

General approaches to the fitting of binary response models to data collected in two-stage and other stratified sampling designs include weighted likelihood, pseudo-likelihood and full maximum likelihood. In previous work the authors developed the large sample theory and methodology for fitting of logistic regression models to two-stage case-control data using full maximum likelihood. The present paper describes computational algorithms that permit efficient estimation of regression coefficients using weighted, pseudo- and full maximum likelihood. It also presents results of a simulation study involving continuous covariables where maximum likelihood clearly outperformed the other two methods and discusses the analysis of data from three bona fide case-control studies that illustrate some important relationships among the three methods. A concluding section discusses the application of two-stage methods to case-control studies with validation subsampling for control of measurement error. © 1997 by John Wiley & Sons, Ltd.

9.

Polychotomous logistic regression methods for matched case-control studies with multiple case or control groups

K Y Liang W F Stewart 《American journal of epidemiology》1987,125(4):720-730

Two statistical methods, a polychotomous and a pairwise approach, are presented to derive estimates of the relative odds in a matched case-control design when multiple case or control groups are used. Test statistics are derived to determine if the relative odds between groups are different. The polychotomous method is limited to complete case-control sets, i.e., where data are available on all members of a matched set. In contrast, the pairwise method makes use of data from both complete and incomplete sets. Nonetheless, efficiency calculations show that the polychotomous logistic regression model is more efficient even when 40 per cent of the case-control sets are incomplete. An example using a single dichotomous variable is provided.

10.

Logistic regression analysis of biomarker data subject to pooling and dichotomization

Zhang Z Liu A Lyles RH Mukherjee B 《Statistics in medicine》2012,31(22):2473-2484

There is growing interest in pooling specimens across subjects in epidemiologic studies, especially those involving biomarkers. This paper is concerned with regression analysis of epidemiologic data where a binary exposure is subject to pooling and the pooled measurement is dichotomized to indicate either that no subjects in the pool are exposed or that some are exposed, without revealing further information about the exposed subjects in the latter case. The pooling process may be stratified on the disease status (a binary outcome) and possibly other variables but is otherwise assumed random. We propose methods for estimating parameters in a prospective logistic regression model and illustrate these with data from a population-based case-control study of colorectal cancer. Simulation results show that the proposed methods perform reasonably well in realistic settings and that pooling can lead to sizable gains in cost efficiency. We make recommendations with regard to the choice of design for pooled epidemiologic studies.

11.

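Logistic regression analysis of risk factors for chronic prostatitis among university students in the Guangzhou area

Liu Buping Fang Chunping Huang Sufang 《Modern Preventive Medicine (现代预防医学)》2007,34(18):3482-3483,3493

[Objective] To explore risk factors for chronic prostatitis (CP) among university students in the Guangzhou area. [Methods] A case-control study was conducted with 561 CP cases and 561 non-CP controls randomly sampled from male students at four universities in Guangzhou; conditional logistic regression was used to analyze the association between CP and 48 factors relating to physical condition, lifestyle habits, sexual behavior, and other characteristics. [Results] Univariate analysis showed that drinking hot tea or water and physical exercise were negatively associated with CP, whereas athlete's foot (tinea pedis), a preference for spicy food, delaying defecation, prolonged sitting, cycling, staying up late, alcohol consumption, repeated masturbation/intercourse, withholding ejaculation, and worry about the consequences of masturbation/intercourse were positively associated with CP. The variables finally entered into the multivariable stepwise logistic regression equation were withholding ejaculation, repeated masturbation/intercourse, worry about the consequences of masturbation/intercourse, delaying defecation, and cycling. [Conclusion] Withholding ejaculation, repeated masturbation/intercourse, worry about the consequences of masturbation/intercourse, delaying defecation, and cycling are risk factors for CP among university students in the Guangzhou area.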

12.

Two-stage methods for the analysis of pooled data.

T A Stukel E Demidenko J Dykes M R Karagas 《Statistics in medicine》2001,20(14):2115-2130

Epidemiologic studies of disease often produce inconclusive or contradictory results due to small sample sizes or regional variations in the disease incidence or the exposures. To clarify these issues, researchers occasionally pool and reanalyse original data from several large studies. In this paper we explore the use of a two-stage random-effects model for analysing pooled case-control studies and undertake a thorough examination of bias in the pooled estimator under various conditions. The two-stage model analyses each study using the model appropriate to the design with study-specific confounders, and combines the individual study-specific adjusted log-odds ratios using a linear mixed-effects model; it is computationally simple and can incorporate study-level covariates and random effects. Simulations indicate that when the individual studies are large, two-stage methods produce nearly unbiased exposure estimates and standard errors of the exposure estimates from a generalized linear mixed model. By contrast, joint fixed-effects logistic regression produces attenuated exposure estimates and underestimates the standard error when heterogeneity is present. While bias in the pooled regression coefficient increases with interstudy heterogeneity for both models, it is much smaller using the two-stage model. In pooled analyses, where covariates may not be uniformly defined and coded across studies, and occasionally not measured in all studies, a joint model is often not feasible. The two-stage method is shown to be a simple, valid and practical method for the analysis of pooled binary data. The results are applied to a study of reproductive history and cutaneous melanoma risk in women using data from ten large case-control studies.
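
The second stage of such an analysis can be illustrated with a short Python sketch. Note the substitution: instead of the linear mixed-effects model used in the paper, this sketch uses the simpler DerSimonian-Laird method-of-moments estimator of the between-study variance. The function name `pool_random_effects` and the input values are hypothetical.

```python
import math


def pool_random_effects(log_ors, ses):
    """Combine study-specific log odds ratios with a random-effects model.

    log_ors: adjusted log odds ratios, one per study (stage-one output)
    ses:     their standard errors
    Returns (pooled log OR, its standard error, between-study variance tau2).
    """
    k = len(log_ors)
    w = [1.0 / s ** 2 for s in ses]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical stage-one results from three studies.
pooled, se, tau2 = pool_random_effects([0.2, 0.5, 0.9], [0.15, 0.20, 0.25])
print(f"pooled OR = {math.exp(pooled):.2f}, tau^2 = {tau2:.3f}")
```

Because each study is analysed separately in stage one, covariates can be defined and adjusted for differently in each study, which is the practical advantage the abstract points to.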

13.

Analysis of matched case-control data with incomplete strata: applying longitudinal approaches

Lin IF Lai MY Chuang PH 《Epidemiology (Cambridge, Mass.)》2007,18(4):446-452

BACKGROUND: Matched case-control data have a structure that is similar to longitudinal data with correlated outcomes, except for a retrospective sampling scheme. In conditional logistic regression analysis, sets that are incomplete due to missing covariates and sets with identical values of the covariates do not contribute to the estimation; both situations may cause a loss in efficiency. These problems are more severe when sample sizes are small. We evaluated retrospective models for longitudinal data as alternatives in analyzing matched case-control data. METHODS: We conducted simulations to compare the properties of matched case-control data analyses using conditional likelihood and a commonly used longitudinal approach, generalized estimating equations (GEE). We simulated scenarios for one-to-one and one-to-two matching designs, each with various sizes of matching strata, with complete and incomplete strata, and with dichotomous and normal exposures. RESULTS AND CONCLUSIONS: The simulations show that the estimates by conditional likelihood and GEE methods are consistent, and a proper coverage was reached for both binary and continuous exposures. The estimates produced by conditional likelihood have greater standard errors than those obtained by GEE. These relative efficiency losses are more substantial when data contain incomplete matched sets and when the data have small sizes of matching strata; these can be improved by including more controls in the strata. These losses of efficiency also increase as the magnitude of the association increases.

14.

Exploiting gene-environment independence in family-based case-control studies: increased power for detecting associations, interactions and joint effects

Chatterjee N Kalaylioglu Z Carroll RJ 《Genetic epidemiology》2005,28(2):138-156

Family-based case-control studies are popularly used to study the effect of genes and gene-environment interactions in the etiology of rare complex diseases. We consider methods for the analysis of such studies under the assumption that genetic susceptibility (G) and environmental exposures (E) are independently distributed of each other within families in the source population. Conditional logistic regression, the traditional method of analysis of the data, fails to exploit the independence assumption and hence can be inefficient. Alternatively, one can estimate the multiplicative interaction between G and E more efficiently using cases only, but the required population-based G-E independence assumption is very stringent. In this article, we propose a novel conditional likelihood framework for exploiting the within-family G-E independence assumption. This approach leads to a simple and yet highly efficient method of estimating interaction and various other risk parameters of scientific interest. Moreover, we show that the same paradigm also leads to a number of alternative and even more efficient methods for analysis of family-based case-control studies when parental genotype information is available on the case-control study participants. Based on these methods, we evaluate different family-based study designs by examining their relative efficiencies to each other and their efficiencies compared to a population-based case-control design of unrelated subjects. These comparisons reveal important design implications. Extensions of the methodologies for dealing with complex family studies are also discussed.

15.

Estimation of interaction effects using pooled biospecimens in a case‐control study

Michelle R. Danaher Paul S. Albert Anindya Roy Enrique F. Schisterman 《Statistics in medicine》2016,35(9):1502-1513

16.

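Analysis of interaction between continuous variables in logistic regression models

Qiu Hong Yu Dexin Xie Liya Wang Xiaorong Fu Zhenming 《Chinese Journal of Epidemiology (中华流行病学杂志)》2010,31(1):812-814

Rothman argued that biological interaction should be assessed on the additive scale, that is, by whether additive interaction is present, whereas the product term in a logistic regression model reflects multiplicative interaction. The existing Chinese and international literature on additive interaction between two factors in logistic regression models deals mainly with two dichotomous variables. This paper introduces a bootstrap method for estimating confidence intervals for additive interaction between two continuous variables, or between a continuous variable and a categorical variable. Using data from a case-control study of lung cancer among men in Hong Kong as an example, and supplemented by an implementation in the free software R, it offers a reference for researchers analyzing interaction.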
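
One widely used measure of additive interaction is the relative excess risk due to interaction (RERI). The Python sketch below (not the paper's R program) computes RERI from logistic regression coefficients and forms a percentile interval from bootstrap replicates. It assumes a model with main effects `b1`, `b2` and a product term `b3`, with both variables measured from a reference level of 0 (for example after centering) and increments `d1`, `d2`; all names and numbers are hypothetical, and the model refitting on each bootstrap resample is left out.

```python
import math


def reri(b1, b2, b3, d1=1.0, d2=1.0):
    """RERI for increases of d1 and d2 from reference level 0 in both variables."""
    or10 = math.exp(b1 * d1)
    or01 = math.exp(b2 * d2)
    or11 = math.exp(b1 * d1 + b2 * d2 + b3 * d1 * d2)
    return or11 - or10 - or01 + 1.0


def percentile_ci(replicates, alpha=0.05):
    """Percentile confidence interval from bootstrap replicates of a statistic."""
    s = sorted(replicates)
    n = len(s)
    lo = s[int(math.floor((alpha / 2) * (n - 1)))]
    hi = s[int(math.ceil((1 - alpha / 2) * (n - 1)))]
    return lo, hi


# Hypothetical coefficients: no product term, yet RERI is positive.
print(reri(math.log(2), math.log(3), 0.0))
```

In a full analysis, each bootstrap resample of the data would be refitted, `reri` applied to the refitted coefficients, and the resulting values passed to `percentile_ci`; an interval excluding 0 suggests additive interaction.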

17.

Semiparametric regression models for detecting effect modification in matched case-crossover studies

Kim I Cheong HK Kim H 《Statistics in medicine》2011,30(15):1837-1851

In matched case-crossover studies, it is generally accepted that covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model because any stratum effect is removed by the conditioning on the fixed number of sets of a case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. In addition, the matching covariates may be effect modifiers, and the methods for assessing and characterizing effect modification by matching covariates are quite limited. In this article, we propose a unified approach in its ability to detect both parametric and nonparametric relationships between the predictor and the relative risk of disease or binary outcome, as well as potential effect modifications by matching covariates. Two methods are developed using two semiparametric models: (1) the regression spline varying coefficients model and (2) the regression spline interaction model. Simulation results show that the two approaches are comparable. These methods can be used in any matched case-control study and extend to multilevel effect modification studies. We demonstrate the advantage of our approach using an epidemiological example of a 1-4 bi-directional case-crossover study of childhood aseptic meningitis associated with drinking water turbidity.

18.

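Analysis of interaction between continuous variables in logistic regression models

Qiu Hong Yu Dexin Xie Liya Wang Xiaorong Fu Zhenming 《Chinese Journal of Epidemiology (中华流行病学杂志)》2009,31(11):812-814

Rothman argued that biological interaction should be assessed on the additive scale, that is, by whether additive interaction is present, whereas the product term in a logistic regression model reflects multiplicative interaction. The existing Chinese and international literature on additive interaction between two factors in logistic regression models deals mainly with two dichotomous variables. This paper introduces a bootstrap method for estimating confidence intervals for additive interaction between two continuous variables, or between a continuous variable and a categorical variable. Using data from a case-control study of lung cancer among men in Hong Kong as an example, and supplemented by an implementation in the free software R, it offers a reference for researchers analyzing interaction.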

19.

A weighting approach to causal effects and additive interaction in case-control studies: marginal structural linear odds models

VanderWeele TJ Vansteelandt S 《American journal of epidemiology》2011,174(10):1197-1203

Estimates of additive interaction from case-control data are often obtained by logistic regression; such models can also be used to adjust for covariates. This approach to estimating additive interaction has come under some criticism because of possible misspecification of the logistic model: If the underlying model is linear, the logistic model will be misspecified. The authors propose an inverse probability of treatment weighting approach to causal effects and additive interaction in case-control studies. Under the assumption of no unmeasured confounding, the approach amounts to fitting a marginal structural linear odds model. The approach allows for the estimation of measures of additive interaction between dichotomous exposures, such as the relative excess risk due to interaction, using case-control data without having to rely on modeling assumptions for the outcome conditional on the exposures and covariates. Rather than using conditional models for the outcome, models are instead specified for the exposures conditional on the covariates. The approach is illustrated by assessing additive interaction between genetic and environmental factors using data from a case-control study.

20.

Binary regression with continuous outcomes

Samy Suissa Lucie Blais 《Statistics in medicine》1995,14(3):247-255

Clinical research often involves continuous outcome measures, such as blood cholesterol, that are amenable to statistical techniques of analysis based on the mean, such as the t-test or multiple linear regression. Clinical interest, however, frequently focuses on the proportion of subjects who fall below or above a clinically relevant cut-off value, as a measure of the risk of disease. The customary approach to analyse such data is to dichotomize the continuous outcome measure and use statistical techniques based on binary data and the binomial distribution. In this paper, we use a parametric approach and the framework of generalized linear models to fit various regression models, including the logistic, on the basis of the original continuous outcome. We consider the Gaussian and the three-parameter log-normal distributions for the continuous outcome, assessing both precision and bias under various conditions. In simulation analyses, we find that we are unable to fit some of the samples with the ‘dichotomous’ approach, but we can with the ‘continuous’ approach, and that the latter yields estimates between 25 and 85 per cent more efficient than the former. We illustrate the method, programmed using GLIM macros, with data from clinical studies of the risk of hypoxaemia during open thoracic surgery and the risk of nocturnal hypoglycaemia among diabetic children.
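
The core idea of the 'continuous' approach can be illustrated for the Gaussian case: the risk of falling below a cut-off is computed from the estimated mean and standard deviation of the continuous outcome rather than from a count of dichotomized values. The Python sketch below covers only this step, not the paper's GLIM macros or its regression framework; the function names and the group means are hypothetical.

```python
import math


def prob_below(cutoff, mean, sd):
    """P(Y < cutoff) when Y is Gaussian with the given mean and standard deviation."""
    z = (cutoff - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical example: risk of a value below 4.0 in two groups with a common SD.
risk_a = prob_below(4.0, mean=5.0, sd=1.0)
risk_b = prob_below(4.0, mean=5.5, sd=1.0)
print(f"risks: {risk_a:.3f} vs {risk_b:.3f}, OR = {odds(risk_a) / odds(risk_b):.2f}")
```

Because the risks are functions of the estimated mean and standard deviation, they can be obtained even in samples where no subject falls below the cut-off, a situation in which a model fitted to dichotomized data may fail.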