Similar articles
Found 20 similar articles (search time: 31 ms)
1.
The aim of this study was to use Monte Carlo simulations to compare logistic regression with propensity scores in terms of bias, precision, empirical coverage probability, empirical power, and robustness when the number of events is low relative to the number of confounders. The authors simulated a cohort study and performed 252,480 trials. In the logistic regression, the bias decreased as the number of events per confounder increased. In the propensity score, the bias decreased as the strength of the association of the exposure with the outcome increased. Propensity scores produced estimates that were less biased, more robust, and more precise than the logistic regression estimates when there were seven or fewer events per confounder. The logistic regression empirical coverage probability increased as the number of events per confounder increased. The propensity score empirical coverage probability decreased after eight or more events per confounder. Overall, the propensity score exhibited more empirical power than logistic regression. Propensity scores are a good alternative to control for imbalances when there are seven or fewer events per confounder; however, empirical power could range from 35% to 60%. Logistic regression is the technique of choice when there are at least eight events per confounder.

2.
In investigations of the effect of treatment on outcome, the propensity score is a tool to eliminate imbalance in the distribution of confounding variables between treatment groups. Recent work has suggested that Super Learner, an ensemble method, outperforms logistic regression in nonlinear settings; however, experience with real-data analyses tends to show overfitting of the propensity score model using this approach. We investigated a wide range of simulated settings of varying complexities, including simulations based on real data, to compare the performances of logistic regression, generalized boosted models, and Super Learner in providing balance and in estimating the average treatment effect via propensity score regression, propensity score matching, and inverse probability of treatment weighting. We found that Super Learner and logistic regression are comparable in terms of covariate balance, bias, and mean squared error (MSE); however, Super Learner is computationally very expensive, thus leaving no clear advantage to the more complex approach. Propensity scores estimated by generalized boosted models were inferior to the other two estimation approaches. We also found that propensity score regression adjustment was superior to either matching or inverse weighting when the form of the dependence of the outcome on the treatment is correctly specified.

3.
Nonrandomized studies of treatments from electronic healthcare databases are critical for producing the evidence necessary for making informed treatment decisions, but often rely on comparing rates of events observed in a small number of patients. In addition, studies constructed from electronic healthcare databases, for example, administrative claims data, often adjust for many, possibly hundreds, of potential confounders. Despite the importance of maximizing efficiency when there are many confounders and few observed outcome events, there has been relatively little research on the relative performance of different propensity score methods in this context. In this paper, we compare a wide variety of propensity‐based estimators of the marginal relative risk. In contrast to prior research that has focused on specific statistical methods in isolation from other analytic choices, we instead consider a method to be defined by the complete multistep process from propensity score modeling to final treatment effect estimation. Propensity score model estimation methods considered include ordinary logistic regression, Bayesian logistic regression, lasso, and boosted regression trees. Methods for utilizing the propensity score include pair matching, full matching, decile strata, fine strata, regression adjustment using one or two nonlinear splines, inverse propensity weighting, and matching weights. We evaluate methods via a ‘plasmode’ simulation study, which creates simulated datasets on the basis of a real cohort study of two treatments constructed from administrative claims data. Our results suggest that regression adjustment and matching weights, regardless of the propensity score model estimation method, provide lower bias and mean squared error in the context of rare binary outcomes. Copyright © 2017 John Wiley & Sons, Ltd.

4.
Propensity scores have been used widely as a bias reduction method to estimate the treatment effect in nonrandomized studies. Since many covariates are generally included in the model for estimating the propensity scores, the proportion of subjects with at least one missing covariate could be large. While many methods have been proposed for propensity score‐based estimation in the presence of missing covariates, little has been published comparing the performance of these methods. In this article we propose a novel method called multiple imputation missingness pattern (MIMP) and compare it with the naive estimator (ignoring propensity score) and three commonly used methods of handling missing covariates in propensity score‐based estimation (separate estimation of propensity scores within each pattern of missing data, multiple imputation and discarding missing data) under different mechanisms of missing data and degree of correlation among covariates. Simulation shows that all adjusted estimators are much less biased than the naive estimator. Under certain conditions MIMP provides benefits (smaller bias and mean‐squared error) compared with existing alternatives. Copyright © 2009 John Wiley & Sons, Ltd.

5.
Logistic regression is one of the most widely used regression models in practice, but alternatives to conventional maximum likelihood estimation methods may be more appropriate for small or sparse samples. Modification of the logistic regression score function to remove first-order bias is equivalent to penalizing the likelihood by the Jeffreys prior, and yields penalized maximum likelihood estimates (PLEs) that always exist, even in samples in which maximum likelihood estimates (MLEs) are infinite. PLEs are an attractive alternative in small-to-moderate-sized samples, and are preferred to exact conditional MLEs when there are continuous covariates. We present methods to construct confidence intervals (CI) in the penalized multinomial logistic regression model, and compare CI coverage and length for the PLE-based methods to that of conventional MLE-based methods in trinomial logistic regressions with both binary and continuous covariates. Based on simulation studies in sparse data sets, we recommend profile CIs over asymptotic Wald-type intervals for the PLEs in all cases. Furthermore, when finite sample bias and data separation are likely to occur, we prefer PLE profile CIs over MLE methods.
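
For orientation, the Jeffreys-prior penalty this abstract refers to is commonly written in the form below. This is the standard Firth-type form for canonical-link models; the symbols ($\ell$ for the log-likelihood, $U_r$ for the score component for $\beta_r$, $I$ for the Fisher information) are notation assumed here, not taken from the abstract.

```latex
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\log\left|I(\beta)\right|,
\qquad
U^{*}_{r}(\beta) = U_{r}(\beta)
  + \tfrac{1}{2}\operatorname{tr}\!\left[ I(\beta)^{-1}\,
    \frac{\partial I(\beta)}{\partial \beta_{r}} \right]
```

Setting $U^{*}_{r}(\beta) = 0$ for every $r$ gives the penalized estimates; because the penalty term $\tfrac{1}{2}\log|I(\beta)|$ tends to $-\infty$ as the fitted probabilities approach 0 or 1, the maximizer stays finite even under data separation.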

6.
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case‐control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS), are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non‐linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

7.
Propensity score matching is often used in observational studies to create treatment and control groups with similar distributions of observed covariates. Typically, propensity scores are estimated using logistic regressions that assume linearity between the logistic link and the predictors. We evaluate the use of generalized additive models (GAMs) for estimating propensity scores. We compare logistic regressions and GAMs in terms of balancing covariates using simulation studies with artificial and genuine data. We find that, when the distributions of covariates in the treatment and control groups overlap sufficiently, using GAMs can improve overall covariate balance, especially for higher-order moments of distributions. When the distributions in the two groups overlap insufficiently, GAM more clearly reveals this fact than logistic regression does. We also demonstrate via simulation that matching with GAMs can result in larger reductions in bias when estimating treatment effects than matching with logistic regression.

8.
Propensity and prognostic score methods seek to improve the quality of causal inference in non‐randomized or observational studies by replicating the conditions found in a controlled experiment, at least with respect to observed characteristics. Propensity scores model receipt of the treatment of interest; prognostic scores model the potential outcome under a single treatment condition. While the popularity of propensity score methods continues to grow, prognostic score methods and methods combining propensity and prognostic scores have thus far received little attention. To this end, we performed a simulation study that compared subclassification and full matching on a single estimated propensity or prognostic score with three approaches combining the estimated propensity and prognostic scores: full matching on a Mahalanobis distance combining the estimated propensity and prognostic scores (FULL–MAHAL); full matching on the estimated prognostic propensity score within propensity score calipers (FULL–PGPPTY); and subclassification on an estimated propensity and prognostic score grid with 5 × 5 subclasses (SUBCLASS(5*5)). We considered settings in which one, both, or neither score model was misspecified. The data generating mechanisms varied in the degree of linearity and additivity in the true treatment assignment and outcome models. FULL–MAHAL and FULL–PGPPTY exhibited strong to superior performance in root mean square error terms across all simulation settings and scenarios. Methods combining propensity and prognostic scores were no less robust to model misspecification than single‐score methods even when both score models were incorrectly specified. Our findings support the joint use of propensity and prognostic scores in estimation of the average treatment effect on the treated. Copyright © 2013 John Wiley & Sons, Ltd.

9.
The propensity score is the probability of exposure to a specific treatment conditional on observed variables. Conditioning on the propensity score results in unbiased estimation of the expected difference in observed responses to two treatments. In the medical literature, propensity score methods are frequently used for estimating odds ratios. The performance of propensity score methods for estimating marginal odds ratios has not been studied. We performed a series of Monte Carlo simulations to assess the performance of propensity score matching, stratifying on the propensity score, and covariate adjustment using the propensity score to estimate marginal odds ratios. We assessed bias, precision, and mean-squared error (MSE) of the propensity score estimators, in addition to the proportion of bias eliminated due to conditioning on the propensity score. When the true marginal odds ratio was one, matching on the propensity score and covariate adjustment using the propensity score resulted in unbiased estimation of the true treatment effect, whereas stratification on the propensity score resulted in minor bias in estimating the true marginal odds ratio. When the true marginal odds ratio ranged from 2 to 10, matching on the propensity score resulted in the least bias, with relative biases ranging from 2.3 to 13.3 per cent. Stratifying on the propensity score resulted in moderate bias, with relative biases ranging from 15.8 to 59.2 per cent. For both methods, relative bias was proportional to the true odds ratio. Finally, matching on the propensity score tended to result in estimators with the lowest MSE.

10.
The use of propensity score methods to adjust for selection bias in observational studies has become increasingly popular in public health and medical research. A substantial portion of studies using propensity score adjustment treat the propensity score as a conventional regression predictor. Through a Monte Carlo simulation study, Austin and colleagues investigated the bias associated with treatment effect estimation when the propensity score is used as a covariate in nonlinear regression models, such as logistic regression and Cox proportional hazards models. We show that the bias exists even in a linear regression model when the estimated propensity score is used and derive the explicit form of the bias. We also conduct an extensive simulation study to compare the performance of such covariate adjustment with propensity score stratification, propensity score matching, the inverse probability of treatment weighted method, and nonparametric functional estimation using splines. The simulation scenarios are designed to reflect real data analysis practice. Instead of specifying a known parametric propensity score model, we generate the data by considering various degrees of overlap of the covariate distributions between treated and control groups. Propensity score matching excels when the treated group is contained within a larger control pool, while the model‐based adjustment may have an edge when treated and control groups do not have too much overlap. Overall, adjusting for the propensity score through stratification or matching followed by regression, or using splines, appears to be a good practical strategy. Copyright © 2013 John Wiley & Sons, Ltd.

11.
Confounder-adjusted estimates of the risk difference are often difficult to obtain by direct regression adjustment. Estimates can be obtained from a propensity score-based method using inverse probability-of-exposure weights to balance groups defined by exposure status with respect to confounders. Simulation was used to evaluate the performance of this method. The simulation model incorporated a binary confounder and a normally distributed confounder into logistic models of exposure status, and disease status conditional on exposure status. Data were generated for combinations of values of several design parameters, including the odds ratio relating each of the confounders to exposure status, the odds ratio relating each of the confounders to disease status and the total sample size. For most design parameter combinations (474 of 486), the absolute bias in the estimated risk difference was less than 1 percentage point, and it was never greater than 3 percentage points. The confidence interval generally had close to nominal 95 per cent coverage, but was prone to poor coverage levels (as low as 78.5 per cent) when both the confounder-to-exposure and confounder-to-outcome odds ratios were 5, consistent with strong confounding. The simulation results showed that the conditions that are favourable for good performance of the weighting method are: reasonable overlap in the propensity score distributions of the exposed and non-exposed groups and a large sample size.
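
The weighting idea in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' simulation code: it assumes the propensity scores are already known (in practice they would be estimated, for example by logistic regression), and the function name and toy data are hypothetical.

```python
def iptw_risk_difference(exposed, outcome, propensity):
    """Risk difference from inverse probability-of-exposure weights.

    Exposed subjects get weight 1/e and unexposed subjects 1/(1 - e),
    where e is the propensity score. Each group's risk is the weighted
    mean of its binary outcome (normalized weights).
    """
    w1 = y1 = w0 = y0 = 0.0
    for a, y, e in zip(exposed, outcome, propensity):
        if a == 1:
            w = 1.0 / e
            w1 += w
            y1 += w * y
        else:
            w = 1.0 / (1.0 - e)
            w0 += w
            y0 += w * y
    return y1 / w1 - y0 / w0


# Hypothetical toy data: two confounder strata with propensity 0.8 and 0.2.
# Within each stratum the risk difference is 0.5 by construction.
exposed = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
outcome = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
propensity = [0.8] * 5 + [0.2] * 5

# Crude (unweighted) comparison, distorted by the confounder.
risk_exposed = sum(y for a, y in zip(exposed, outcome) if a == 1) / sum(exposed)
risk_unexposed = sum(y for a, y in zip(exposed, outcome) if a == 0) / (
    len(exposed) - sum(exposed)
)
rd_crude = risk_exposed - risk_unexposed

rd_weighted = iptw_risk_difference(exposed, outcome, propensity)
```

In this toy data the crude risk difference is about 0.2, while the weighted estimate recovers roughly 0.5, the value built into both strata. Propensity scores near 0 or 1 would produce very large weights, which is one way to read the abstract's point about needing reasonable overlap.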

12.
Objective: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results: We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Conclusion: Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice.

13.
The propensity score is a subject's probability of treatment, conditional on observed baseline covariates. Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity‐score matching is a popular method of using the propensity score in the medical literature. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity‐score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates. In this paper we discuss the following methods for assessing whether the propensity score model has been correctly specified: comparing means and prevalences of baseline characteristics using standardized differences; ratios comparing the variance of continuous covariates between treated and untreated subjects; comparison of higher order moments and interactions; five‐number summaries; and graphical methods such as quantile–quantile plots, side‐by‐side boxplots, and non‐parametric density plots for comparing the distribution of baseline covariates between treatment groups. We describe methods to determine the sampling distribution of the standardized difference when the true standardized difference is equal to zero, thereby allowing one to determine the range of standardized differences that are plausible with the propensity score model having been correctly specified. We highlight the limitations of some previously used methods for assessing the adequacy of the specification of the propensity‐score model. In particular, methods based on comparing the distribution of the estimated propensity score between treated and untreated subjects are uninformative. Copyright © 2009 John Wiley & Sons, Ltd.
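
Two of the balance diagnostics listed in this abstract, the standardized difference and the variance ratio, are simple enough to sketch. The version below is one common formulation for a continuous covariate, using the pooled standard deviation of the two groups; the function names and the age values are hypothetical, and the abstract itself does not give formulas.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, control):
    """Difference in means divided by the pooled standard deviation.

    The pooled SD is sqrt((s_t^2 + s_c^2) / 2), where s_t^2 and s_c^2
    are the sample variances in the treated and control groups.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


def variance_ratio(treated, control):
    """Ratio of sample variances; values near 1 suggest similar spread."""
    return variance(treated) / variance(control)


# Hypothetical ages in a matched sample.
treated_age = [60, 62, 64, 66, 68]
control_age = [58, 60, 62, 64, 66]

d_age = standardized_difference(treated_age, control_age)
vr_age = variance_ratio(treated_age, control_age)
```

Unlike a significance test, the standardized difference does not shrink automatically as the sample grows, which is part of why it is favored for comparing balance across samples of different sizes.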

14.
Propensity score methods are increasingly being used to estimate the effects of treatments on health outcomes using observational data. There are four methods for using the propensity score to estimate treatment effects: covariate adjustment using the propensity score, stratification on the propensity score, propensity‐score matching, and inverse probability of treatment weighting (IPTW) using the propensity score. When outcomes are binary, the effect of treatment on the outcome can be described using odds ratios, relative risks, risk differences, or the number needed to treat. Several clinical commentators suggested that risk differences and numbers needed to treat are more meaningful for clinical decision making than are odds ratios or relative risks. However, there is a paucity of information about the relative performance of the different propensity‐score methods for estimating risk differences. We conducted a series of Monte Carlo simulations to examine this issue. We examined bias, variance estimation, coverage of confidence intervals, mean‐squared error (MSE), and type I error rates. A doubly robust version of IPTW had superior performance compared with the other propensity‐score methods. It resulted in unbiased estimation of risk differences, treatment effects with the lowest standard errors, confidence intervals with the correct coverage rates, and correct type I error rates. Stratification, matching on the propensity score, and covariate adjustment using the propensity score resulted in minor to modest bias in estimating risk differences. Estimators based on IPTW had lower MSE compared with other propensity‐score methods. Differences between IPTW and propensity‐score matching may reflect that these two methods estimate the average treatment effect and the average treatment effect for the treated, respectively. Copyright © 2010 John Wiley & Sons, Ltd.
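
The four effect measures named in this abstract are related by simple arithmetic on the two group risks. The sketch below is illustrative only: the function name and the example risks are hypothetical, and it assumes both risks lie strictly between 0 and 1.

```python
def effect_measures(risk_treated, risk_control):
    """Risk difference, relative risk, odds ratio, and number needed to treat.

    Assumes 0 < risk < 1 for both groups. The NNT is the reciprocal of
    the absolute risk difference and is infinite when the risks are equal.
    """
    rd = risk_treated - risk_control
    rr = risk_treated / risk_control
    odds_treated = risk_treated / (1 - risk_treated)
    odds_control = risk_control / (1 - risk_control)
    nnt = float("inf") if rd == 0 else 1 / abs(rd)
    return {
        "risk_difference": rd,
        "relative_risk": rr,
        "odds_ratio": odds_treated / odds_control,
        "nnt": nnt,
    }


# Hypothetical risks: 10% of treated and 15% of untreated subjects have the event.
measures = effect_measures(0.10, 0.15)
```

With these hypothetical risks the risk difference is about -0.05, so roughly 20 subjects would need to be treated to prevent one event, while the odds ratio (about 0.63) sits further from 1 than the relative risk (about 0.67).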

15.
In observational studies, investigators have no control over the treatment assignment. The treated and non-treated (that is, control) groups may have large differences on their observed covariates, and these differences can lead to biased estimates of treatment effects. Even traditional covariance analysis adjustments may be inadequate to eliminate this bias. The propensity score, defined as the conditional probability of being treated given the covariates, can be used to balance the covariates in the two groups, and therefore reduce this bias. In order to estimate the propensity score, one must model the distribution of the treatment indicator variable given the observed covariates. Once estimated, the propensity score can be used to reduce bias through matching, stratification (subclassification), regression adjustment, or some combination of all three. In this tutorial we discuss the uses of propensity score methods for bias reduction, give references to the literature and illustrate the uses through applied examples. © 1998 John Wiley & Sons, Ltd.
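
Of the uses listed in this abstract, stratification (subclassification) is the easiest to sketch. The example below is a simplified illustration under stated assumptions, not the tutorial's procedure: it takes the propensity scores as given, forms equal-sized strata by rank, and skips any stratum missing one of the two groups. The function name and toy data are hypothetical.

```python
from statistics import mean


def stratified_effect(treated, outcome, propensity, n_strata=5):
    """Average within-stratum difference in mean outcome.

    Subjects are ranked by propensity score and split into n_strata
    groups of (nearly) equal size. Each stratum's treated-minus-control
    difference is weighted by the stratum size.
    """
    order = sorted(range(len(propensity)), key=lambda i: propensity[i])
    n = len(order)
    total, used = 0.0, 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [outcome[i] for i in idx if treated[i] == 1]
        y0 = [outcome[i] for i in idx if treated[i] == 0]
        if not y1 or not y0:
            continue  # stratum lacks a comparison group; skipped in this sketch
        total += len(idx) * (mean(y1) - mean(y0))
        used += len(idx)
    if used == 0:
        raise ValueError("no stratum contains both treated and control subjects")
    return total / used


# Hypothetical toy data: two propensity levels, true effect of 2 in each.
propensity = [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
outcome = [3, 1, 1, 1, 5, 5, 5, 3]

crude = mean(y for a, y in zip(treated, outcome) if a == 1) - mean(
    y for a, y in zip(treated, outcome) if a == 0
)
adjusted = stratified_effect(treated, outcome, propensity, n_strata=2)
```

Here the crude difference is 3, while stratifying on the propensity score recovers the within-stratum effect of 2. Silently skipping strata without both groups is a shortcut for brevity; a real analysis would report such strata, since they signal poor overlap.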

16.
Propensity scores are widely used in cohort studies to improve performance of regression models when considering large numbers of covariates. Another type of summary score, the disease risk score (DRS), which estimates disease probability conditional on nonexposure, has also been suggested. However, little is known about how it compares with propensity scores. Monte Carlo simulations were conducted comparing regression models using the DRS and the propensity score with models that directly adjust for all of the individual covariates. The DRS was calculated in 2 ways: from the unexposed population and from the full cohort. Compared with traditional multivariable outcome regression models, all 3 summary scores had comparable performance for moderate correlation between exposure and covariates and, for strong correlation, the full-cohort DRS and propensity score had comparable performance. When traditional methods had model misspecification, propensity scores and the full-cohort DRS had superior performance. All 4 models were affected by the number of events per covariate, with propensity scores and traditional multivariable outcome regression least affected. These data suggest that, for cohort studies for which covariates are not highly correlated with exposure, the DRS, particularly that calculated from the full cohort, is a useful tool.

17.
Despite randomization, selection bias may occur in cluster randomized trials. Classical multivariable regression usually allows for adjusting treatment effect estimates with unbalanced covariates. However, for binary outcomes with low incidence, such a method may fail because of separation problems. This simulation study focused on the performance of propensity score (PS)‐based methods to estimate relative risks from cluster randomized trials with binary outcomes with low incidence. The results suggested that among the different approaches used (multivariable regression, direct adjustment on PS, inverse weighting on PS, and stratification on PS), only direct adjustment on the PS fully corrected the bias and moreover had the best statistical properties. Copyright © 2014 John Wiley & Sons, Ltd.

18.
While propensity score weighting has been shown to reduce bias in treatment effect estimation when selection bias is present, it has also been shown that such weighting can perform poorly if the estimated propensity score weights are highly variable. Various approaches have been proposed which can reduce the variability of the weights and the risk of poor performance, particularly those based on machine learning methods. In this study, we closely examine approaches to fine-tune one machine learning technique [generalized boosted models (GBM)] to select propensity scores that seek to optimize the variance-bias trade-off that is inherent in most propensity score analyses. Specifically, we propose and evaluate three approaches for selecting the optimal number of trees for the GBM in the twang package in R. Normally, the twang package in R iteratively selects the optimal number of trees as that which maximizes balance between the treatment groups being considered. Because the selected number of trees may lead to highly variable propensity score weights, we examine alternative ways to tune the number of trees used in the estimation of propensity score weights such that we sacrifice some balance on the pre-treatment covariates in exchange for less variable weights. We use simulation studies to illustrate these methods and to describe the potential advantages and disadvantages of each method. We apply these methods to two case studies: one examining the effect of dog ownership on the owner’s general health using data from a large, population-based survey in California, and a second investigating the relationship between abstinence and a long-term economic outcome among a sample of high-risk youth.

19.
The propensity adjustment is used to reduce bias in treatment effectiveness estimates from observational data. We show here that a mixed-effects implementation of the propensity adjustment can reduce bias in longitudinal studies of non-equivalent comparison groups. The strategy examined here involves two stages. Initially, a mixed-effects ordinal logistic regression model of propensity for treatment intensity includes variables that differentiate subjects who receive various doses of time-varying treatments. Second, a mixed-effects linear regression model compares the effectiveness of those ordinal doses on a continuous outcome over time. Here, a simulation study compares bias reduction that is achieved by implementing this propensity adjustment through various forms of stratification. The simulations demonstrate that bias decreased monotonically as the number of quantiles used for stratification increased from two to five. This was particularly pronounced with stronger effects of the confounding variables. The quartile and quintile strategies typically removed in excess of 80-90 per cent of the bias detected in unadjusted models, whereas a median-split approach removed from 20 to 45 per cent of bias. The approach is illustrated in an evaluation of the effectiveness of somatic treatments for major depression in a longitudinal, observational study of affective disorders.

20.
Propensity-score methods are increasingly being used to reduce the impact of treatment-selection bias in the estimation of treatment effects using observational data. Commonly used propensity-score methods include covariate adjustment using the propensity score, stratification on the propensity score, and propensity-score matching. Empirical and theoretical research has demonstrated that matching on the propensity score eliminates a greater proportion of baseline differences between treated and untreated subjects than does stratification on the propensity score. However, the analysis of propensity-score-matched samples requires statistical methods appropriate for matched-pairs data. We critically evaluated 47 articles that were published between 1996 and 2003 in the medical literature and that employed propensity-score matching. We found that only two of the articles reported the balance of baseline characteristics between treated and untreated subjects in the matched sample and used correct statistical methods to assess the degree of imbalance. Thirteen (28 per cent) of the articles explicitly used statistical methods appropriate for the analysis of matched data when estimating the treatment effect and its statistical significance. Common errors included using the log-rank test to compare Kaplan-Meier survival curves in the matched sample, using Cox regression, logistic regression, chi-squared tests, t-tests, and Wilcoxon rank sum tests in the matched sample, thereby failing to account for the matched nature of the data. We provide guidelines for the analysis and reporting of studies that employ propensity-score matching.
