Found 20 similar articles (search time: 12 ms)
1.
The performance of different propensity score methods for estimating marginal odds ratios (total citations: 1; self-citations: 0; citations by others: 1)
Austin PC 《Statistics in medicine》2007,26(16):3078-3094
The propensity score is the probability of exposure to a specific treatment conditional on observed variables. Conditioning on the propensity score results in unbiased estimation of the expected difference in observed responses to two treatments. In the medical literature, propensity score methods are frequently used for estimating odds ratios. The performance of propensity score methods for estimating marginal odds ratios has not been studied. We performed a series of Monte Carlo simulations to assess the performance of propensity score matching, stratifying on the propensity score, and covariate adjustment using the propensity score to estimate marginal odds ratios. We assessed bias, precision, and mean-squared error (MSE) of the propensity score estimators, in addition to the proportion of bias eliminated due to conditioning on the propensity score. When the true marginal odds ratio was one, matching on the propensity score and covariate adjustment using the propensity score resulted in unbiased estimation of the true treatment effect, whereas stratification on the propensity score resulted in minor bias in estimating the true marginal odds ratio. When the true marginal odds ratio ranged from 2 to 10, matching on the propensity score resulted in the least bias, with relative biases ranging from 2.3 to 13.3 per cent. Stratifying on the propensity score resulted in moderate bias, with relative biases ranging from 15.8 to 59.2 per cent. For both methods, relative bias was proportional to the true odds ratio. Finally, matching on the propensity score tended to result in estimators with the lowest MSE.
2.
The two-stage process of propensity score analysis (PSA) includes a design stage where propensity scores (PSs) are estimated and implemented to approximate a randomized experiment and an analysis stage where treatment effects are estimated conditional on the design. This article considers how uncertainty associated with the design stage impacts estimation of causal effects in the analysis stage. Such design uncertainty can derive from the fact that the PS itself is an estimated quantity, but also from other features of the design stage tied to choice of PS implementation. This article offers a procedure for obtaining the posterior distribution of causal effects after marginalizing over a distribution of design-stage outputs, lending a degree of formality to Bayesian methods for PSA that have gained attention in recent literature. Formulation of a probability distribution for the design-stage output depends on how the PS is implemented in the design stage, and propagation of uncertainty into causal estimates depends on how the treatment effect is estimated in the analysis stage. We explore these differences within a sample of commonly used PS implementations (quantile stratification, nearest-neighbor matching, caliper matching, inverse probability of treatment weighting, and doubly robust estimation) and investigate in a simulation study the impact of statistician choice in PS model and implementation on the degree of between- and within-design variability in the estimated treatment effect. The methods are then deployed in an investigation of the association between levels of fine particulate air pollution and elevated exposure to emissions from coal-fired power plants.
3.
Peter C. Austin 《Statistics in medicine》2013,32(16):2837-2849
Propensity score methods are increasingly being used to reduce or minimize the effects of confounding when estimating the effects of treatments, exposures, or interventions when using observational or non‐randomized data. Under the assumption of no unmeasured confounders, previous research has shown that propensity score methods allow for unbiased estimation of linear treatment effects (e.g., differences in means or proportions). However, in biomedical research, time‐to‐event outcomes occur frequently. There is a paucity of research into the performance of different propensity score methods for estimating the effect of treatment on time‐to‐event outcomes. Furthermore, propensity score methods allow for the estimation of marginal or population‐average treatment effects. We conducted an extensive series of Monte Carlo simulations to examine the performance of propensity score matching (1:1 greedy nearest‐neighbor matching within propensity score calipers), stratification on the propensity score, inverse probability of treatment weighting (IPTW) using the propensity score, and covariate adjustment using the propensity score to estimate marginal hazard ratios. We found that both propensity score matching and IPTW using the propensity score allow for the estimation of marginal hazard ratios with minimal bias. Of these two approaches, IPTW using the propensity score resulted in estimates with lower mean squared error when estimating the effect of treatment in the treated. Stratification on the propensity score and covariate adjustment using the propensity score result in biased estimation of both marginal and conditional hazard ratios. Applied researchers are encouraged to use propensity score matching and IPTW using the propensity score when estimating the relative effect of treatment on time‐to‐event outcomes. Copyright © 2012 John Wiley & Sons, Ltd.
4.
In causal studies without random assignment of treatment, causal effects can be estimated using matched treated and control samples, where matches are obtained using estimated propensity scores. Propensity score matching can reduce bias in treatment effect estimators in cases where the matched samples have overlapping covariate distributions. Despite its application in many applied problems, there is no universally employed approach to interval estimation when using propensity score matching. In this article, we present and evaluate approaches to interval estimation when using propensity score matching.
5.
Many observational studies estimate causal effects using methods based on matching on the propensity score. Full matching on the propensity score is an effective and flexible method for utilizing all available data and for creating well‐balanced treatment and control groups. An important component of the full matching algorithm is the decision about whether to impose a restriction on the maximum ratio of controls matched to each treated subject. Despite the possible effect of this restriction on subsequent inferences, this issue has not been examined. We used a series of Monte Carlo simulations to evaluate the effect of imposing a restriction on the maximum ratio of controls matched to each treated subject when estimating risk differences. We considered full matching both with and without a caliper restriction. When using full matching with a caliper restriction, the imposition of a subsequent constraint on the maximum ratio of the number of controls matched to each treated subject had no effect on the quality of inferences. However, when using full matching without a caliper restriction, the imposition of a constraint on the maximum ratio of the number of controls matched to each treated subject tended to result in an increase in bias in the estimated risk difference. This increase in bias tended to be accompanied by a corresponding decrease in the sampling variability of the estimated risk difference. We illustrate the consequences of these restrictions using observational data to estimate the effect of medication prescribing on survival following hospitalization for a heart attack.
6.
Peter C. Austin 《Statistics in medicine》2014,33(7):1242-1258
Propensity score methods are increasingly being used to estimate causal treatment effects in observational studies. In medical and epidemiological studies, outcomes are frequently time‐to‐event in nature. Propensity‐score methods are often applied incorrectly when estimating the effect of treatment on time‐to‐event outcomes. This article describes how two different propensity score methods (matching and inverse probability of treatment weighting) can be used to estimate the measures of effect that are frequently reported in randomized controlled trials: (i) marginal survival curves, which describe survival in the population if all subjects were treated or if all subjects were untreated; and (ii) marginal hazard ratios. The use of these propensity score methods allows one to replicate the measures of effect that are commonly reported in randomized controlled trials with time‐to‐event outcomes: both absolute and relative reductions in the probability of an event occurring can be determined. We also provide guidance on variable selection for the propensity score model, highlight methods for assessing the balance of baseline covariates between treated and untreated subjects, and describe the implementation of a sensitivity analysis to assess the effect of unmeasured confounding variables on the estimated treatment effect when outcomes are time‐to‐event in nature. The methods in the paper are illustrated by estimating the effect of discharge statin prescribing on the risk of death in a sample of patients hospitalized with acute myocardial infarction. In this tutorial article, we describe and illustrate all the steps necessary to conduct a comprehensive analysis of the effect of treatment on time‐to‐event outcomes. © 2013 The authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
7.
Propensity score methods are increasingly being used to estimate causal treatment effects in the medical literature. Conditioning on the propensity score results in unbiased estimation of the expected difference in observed responses to two treatments. The degree to which conditioning on the propensity score introduces bias into the estimation of the conditional odds ratio or conditional hazard ratio, which are frequently used as measures of treatment effect in observational studies, has not been extensively studied. We conducted Monte Carlo simulations to determine the degree to which propensity score matching, stratification on the quintiles of the propensity score, and covariate adjustment using the propensity score result in biased estimation of conditional odds ratios, hazard ratios, and rate ratios. We found that conditioning on the propensity score resulted in biased estimation of the true conditional odds ratio and the true conditional hazard ratio. In all scenarios examined, treatment effects were biased towards the null treatment effect. However, conditioning on the propensity score did not result in biased estimation of the true conditional rate ratio. In contrast, conventional regression methods allowed unbiased estimation of the true conditional treatment effect when all variables associated with the outcome were included in the regression model. The observed bias in propensity score methods is due to the fact that regression models allow one to estimate conditional treatment effects, whereas propensity score methods allow one to estimate marginal treatment effects. In several settings with non-linear treatment effects, marginal and conditional treatment effects do not coincide.
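The gap between marginal and conditional odds ratios described in this abstract can be illustrated with a small numeric sketch. This is not the authors' simulation design: the logistic outcome model, the coefficient values, and the single binary covariate `X` (independent of treatment, so there is no confounding) are all assumptions chosen for illustration.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    return p / (1.0 - p)

# Hypothetical outcome model: logit P(Y=1 | T, X) = b0 + b_t*T + b_x*X,
# with a binary covariate X, P(X=1) = 0.5, independent of treatment T.
b0, b_t, b_x = -1.0, math.log(2.0), 2.0
p_x1 = 0.5

def marginal_risk(t):
    """Average the conditional risk over the distribution of X."""
    return (1.0 - p_x1) * expit(b0 + b_t * t) + p_x1 * expit(b0 + b_t * t + b_x)

# Conditional odds ratio: the same within each level of X.
conditional_or = math.exp(b_t)

# Marginal odds ratio: compares the population if all were treated
# versus if all were untreated.
marginal_or = odds(marginal_risk(1)) / odds(marginal_risk(0))

print(f"conditional OR = {conditional_or:.3f}")
print(f"marginal OR    = {marginal_or:.3f}")
```

Under these assumed values the marginal odds ratio is closer to one than the conditional odds ratio, even though nothing confounds the treatment effect, which is consistent with the direction of the discrepancy the abstract reports.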
8.
Wei Yang Marshall M. Joffe Sean Hennessy Harold I. Feldman 《Statistics in medicine》2014,33(26):4577-4589
Propensity scores are widely used to control for confounding when estimating the effect of a binary treatment in observational studies. They have been generalized to ordinal and continuous treatments in the recent literature. Following the definition of the propensity function and its parameterizations (called the propensity parameter in this paper) proposed by Imai and van Dyk, we explore sufficient conditions for selecting propensity parameters to control for confounding for continuous treatments in the context of regression‐based adjustment in linear models. Typically, investigators make parametric assumptions about the form of the dose–response function for a continuous treatment. Such assumptions often allow the analyst to use only a subset of the propensity parameters to control confounding. When the treatment is the only predictor in the structural (that is, causal) model, it is sufficient to adjust only for the propensity parameters that characterize the expectation of the treatment variable or its functional form. When the structural model includes selected baseline covariates other than the treatment variable, those baseline covariates, in addition to the propensity parameters, must also be adjusted for in the model. We demonstrate these points with an example estimating the dose–response relationship for the effect of erythropoietin on hematocrit level in patients with end‐stage renal disease. Copyright © 2014 John Wiley & Sons, Ltd.
9.
Jessica M. Franklin Wesley Eddings Peter C. Austin Elizabeth A. Stuart Sebastian Schneeweiss 《Statistics in medicine》2017,36(12):1946-1963
Nonrandomized studies of treatments from electronic healthcare databases are critical for producing the evidence necessary for making informed treatment decisions, but often rely on comparing rates of events observed in a small number of patients. In addition, studies constructed from electronic healthcare databases, for example, administrative claims data, often adjust for many, possibly hundreds, of potential confounders. Despite the importance of maximizing efficiency when there are many confounders and few observed outcome events, there has been relatively little research on the relative performance of different propensity score methods in this context. In this paper, we compare a wide variety of propensity‐based estimators of the marginal relative risk. In contrast to prior research that has focused on specific statistical methods in isolation from other analytic choices, we instead consider a method to be defined by the complete multistep process from propensity score modeling to final treatment effect estimation. Propensity score model estimation methods considered include ordinary logistic regression, Bayesian logistic regression, lasso, and boosted regression trees. Methods for utilizing the propensity score include pair matching, full matching, decile strata, fine strata, regression adjustment using one or two nonlinear splines, inverse propensity weighting, and matching weights. We evaluate methods via a ‘plasmode’ simulation study, which creates simulated datasets on the basis of a real cohort study of two treatments constructed from administrative claims data. Our results suggest that regression adjustment and matching weights, regardless of the propensity score model estimation method, provide lower bias and mean squared error in the context of rare binary outcomes. Copyright © 2017 John Wiley & Sons, Ltd.
10.
Peter C. Austin 《Statistics in medicine》2010,29(20):2137-2148
Propensity score methods are increasingly being used to estimate the effects of treatments on health outcomes using observational data. There are four methods for using the propensity score to estimate treatment effects: covariate adjustment using the propensity score, stratification on the propensity score, propensity‐score matching, and inverse probability of treatment weighting (IPTW) using the propensity score. When outcomes are binary, the effect of treatment on the outcome can be described using odds ratios, relative risks, risk differences, or the number needed to treat. Several clinical commentators have suggested that risk differences and numbers needed to treat are more meaningful for clinical decision making than are odds ratios or relative risks. However, there is a paucity of information about the relative performance of the different propensity‐score methods for estimating risk differences. We conducted a series of Monte Carlo simulations to examine this issue. We examined bias, variance estimation, coverage of confidence intervals, mean‐squared error (MSE), and type I error rates. A doubly robust version of IPTW had superior performance compared with the other propensity‐score methods. It resulted in unbiased estimation of risk differences, treatment effects with the lowest standard errors, confidence intervals with the correct coverage rates, and correct type I error rates. Stratification, matching on the propensity score, and covariate adjustment using the propensity score resulted in minor to modest bias in estimating risk differences. Estimators based on IPTW had lower MSE compared with other propensity‐score methods. Differences between IPTW and propensity‐score matching may reflect that these two methods estimate the average treatment effect and the average treatment effect for the treated, respectively. Copyright © 2010 John Wiley & Sons, Ltd.
11.
12.
Peter C. Austin 《Statistics in medicine》2014,33(6):1057-1069
Propensity‐score matching is increasingly being used to reduce the confounding that can occur in observational studies examining the effects of treatments or interventions on outcomes. We used Monte Carlo simulations to examine the following algorithms for forming matched pairs of treated and untreated subjects: optimal matching, greedy nearest neighbor matching without replacement, and greedy nearest neighbor matching without replacement within specified caliper widths. For each of the latter two algorithms, we examined four different sub‐algorithms defined by the order in which treated subjects were selected for matching to an untreated subject: lowest to highest propensity score, highest to lowest propensity score, best match first, and random order. We also examined matching with replacement. We found that (i) nearest neighbor matching induced the same balance in baseline covariates as did optimal matching; (ii) when at least some of the covariates were continuous, caliper matching tended to induce balance on baseline covariates that was at least as good as the other algorithms; (iii) caliper matching tended to result in estimates of treatment effect with less bias compared with optimal and nearest neighbor matching; (iv) optimal and nearest neighbor matching resulted in estimates of treatment effect with negligibly less variability than did caliper matching; (v) caliper matching had amongst the best performance when assessed using mean squared error; (vi) the order in which treated subjects were selected for matching had at most a modest effect on estimation; and (vii) matching with replacement did not have superior performance compared with caliper matching without replacement. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
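Greedy nearest neighbor matching without replacement within a caliper, one of the algorithms compared in this abstract, can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: the function name `greedy_caliper_match`, the dictionary inputs, and the toy scores are all hypothetical, and treated subjects are simply processed in the order supplied (the caller can sort them to obtain, for example, the lowest-to-highest ordering).

```python
def greedy_caliper_match(treated, controls, caliper):
    """Greedy 1:1 nearest neighbor matching without replacement.

    treated, controls: dicts mapping subject id -> propensity score
    (or its logit). Treated subjects are processed in dict order; each
    is paired with the closest still-unused control, and the pair is
    kept only if the distance is within the caliper.
    """
    available = dict(controls)  # copy, so the caller's dict is untouched
    pairs = []
    for t_id, t_score in treated.items():
        if not available:
            break
        # Closest remaining control for this treated subject.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs

# Toy scores (hypothetical).
treated = {"t1": 0.30, "t2": 0.32, "t3": 0.90}
controls = {"c1": 0.31, "c2": 0.35, "c3": 0.60}
pairs = greedy_caliper_match(treated, controls, caliper=0.05)
print(pairs)  # t3 has no control within the caliper and stays unmatched
```

In this toy run `t3` is left unmatched because its nearest control lies outside the caliper, which is the mechanism by which caliper matching trades sample size for closer matches.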
13.
David Hajage Yann De Rycke Guillaume Chauvet Florence Tubach 《Statistics in medicine》2017,36(4):687-716
Introduced by Hansen in 2008, the prognostic score (PGS) has been presented as ‘the prognostic analogue of the propensity score’ (PPS). PPS‐based methods are intended to estimate marginal effects. Most previous studies evaluated the performance of existing PGS‐based methods (adjustment, stratification and matching using the PGS) in situations in which the theoretical conditional and marginal effects are equal (i.e., collapsible situations). To support the use of the PGS framework as an alternative to the PPS framework, applied researchers must have reliable information about the type of treatment effect estimated by each method. We propose four new PGS‐based methods, each developed to estimate a specific type of treatment effect. We evaluated the ability of existing and new PGS‐based methods to estimate the conditional treatment effect (CTE), the (marginal) average treatment effect on the whole population (ATE), and the (marginal) average treatment effect on the treated population (ATT), when the odds ratio (a non‐collapsible estimator) is the measure of interest. The performance of PGS‐based methods was assessed by Monte Carlo simulations and compared with PPS‐based methods and multivariate regression analysis. Existing PGS‐based methods did not allow for estimating the ATE and showed unacceptable performance when the proportion of exposed subjects was large. When estimating marginal effects, PPS‐based methods were too conservative, whereas the new PGS‐based methods performed better with low prevalence of exposure, and had coverages closer to the nominal value. When estimating CTE, the new PGS‐based methods performed as well as traditional multivariate regression. Copyright © 2016 John Wiley & Sons, Ltd.
14.
There is an increasing interest in using administrative data to estimate the treatment effects of interventions. While administrative data are relatively inexpensive to obtain and provide population coverage, they are frequently characterized by lack of clinical detail, often leading to problematic confounding when they are used to conduct observational research. Propensity score methods are increasingly being used to address confounding in estimating the effects of interventions in such studies. Using data on patients discharged from hospital for whom both administrative data and detailed clinical data obtained from chart reviews were available, we examined the degree to which stratifying on the quintiles of propensity scores derived from administrative data was able to balance patient characteristics measured in clinical data. We also determined the extent to which measures of treatment effect obtained using propensity score methods were similar to those obtained using traditional regression methods. As a test case, we examined the treatment effects of ASA and beta-blockers following acute myocardial infarction. We demonstrated that propensity scores developed using administrative data do not necessarily balance patient characteristics contained in clinical data. Furthermore, measures of treatment effectiveness were attenuated when obtained using clinical data compared to when administrative data were used.
15.
Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies
The propensity score is defined as a subject's probability of treatment selection, conditional on observed baseline covariates. Weighting subjects by the inverse probability of treatment received creates a synthetic sample in which treatment assignment is independent of measured baseline covariates. Inverse probability of treatment weighting (IPTW) using the propensity score allows one to obtain unbiased estimates of average treatment effects. However, these estimates are only valid if there are no residual systematic differences in observed baseline characteristics between treated and control subjects in the sample weighted by the estimated inverse probability of treatment. We report on a systematic literature review, in which we found that the use of IPTW has increased rapidly in recent years, but that in the most recent year, a majority of studies did not formally examine whether weighting balanced measured covariates between treatment groups. We then proceed to describe a suite of quantitative and qualitative methods that allow one to assess whether measured baseline covariates are balanced between treatment groups in the weighted sample. The quantitative methods use the weighted standardized difference to compare means, prevalences, higher‐order moments, and interactions. The qualitative methods employ graphical methods to compare the distribution of continuous baseline covariates between treated and control subjects in the weighted sample. Finally, we illustrate the application of these methods in an empirical case study. We propose a formal set of balance diagnostics that contribute towards an evolving concept of ‘best practice’ when using IPTW to estimate causal treatment effects using observational data. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
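The weighting and balance check described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: the helper names and toy data are hypothetical, the propensity scores are taken as known rather than estimated, and the weighted variance uses one common form, s² = [Σw / ((Σw)² − Σw²)] · Σw(x − x̄_w)², which is assumed here rather than quoted from the abstract.

```python
import math

def iptw_weights(treatment, ps):
    """Weight 1/ps for treated subjects and 1/(1 - ps) for controls."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for t, p in zip(treatment, ps)]

def weighted_mean(x, w):
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

def weighted_var(x, w):
    """Assumed weighted variance; equals the sample variance for unit weights."""
    m = weighted_mean(x, w)
    sw = sum(w)
    sw2 = sum(wi * wi for wi in w)
    return sw / (sw * sw - sw2) * sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x))

def weighted_std_diff(x, treatment, w):
    """Weighted standardized difference of covariate x between arms."""
    stats = {}
    for arm in (1, 0):
        xs = [xi for xi, t in zip(x, treatment) if t == arm]
        ws = [wi for wi, t in zip(w, treatment) if t == arm]
        stats[arm] = (weighted_mean(xs, ws), weighted_var(xs, ws))
    (m1, v1), (m0, v0) = stats[1], stats[0]
    return (m1 - m0) / math.sqrt((v1 + v0) / 2.0)

# Toy data (hypothetical): binary covariate x with P(T=1 | x=1) = 0.8
# and P(T=1 | x=0) = 0.2; the true propensity score is used directly.
x = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
treatment = [1] * 10 + [0] * 10
ps = [0.8 if xi == 1 else 0.2 for xi in x]

w = iptw_weights(treatment, ps)
d_unweighted = weighted_std_diff(x, treatment, [1.0] * len(x))
d_weighted = weighted_std_diff(x, treatment, w)
print(f"unweighted d = {d_unweighted:.3f}, weighted d = {d_weighted:.3f}")
```

With these toy values the covariate is strongly imbalanced before weighting and essentially balanced afterwards. In practice the propensity score is estimated, so the weighted standardized difference will rarely be exactly zero and is compared against a chosen threshold.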
16.
Propensity score and doubly robust methods for estimating the effect of treatment on censored cost
Jiaqi Li Elizabeth Handorf Justin Bekelman Nandita Mitra 《Statistics in medicine》2016,35(12):1985-1999
The estimation of treatment effects on medical costs is complicated by the need to account for informative censoring, skewness, and the effects of confounders. Because medical costs are often collected from observational claims data, we investigate propensity score (PS) methods such as covariate adjustment, stratification, and inverse probability weighting taking into account informative censoring of the cost outcome. We compare these more commonly used methods with doubly robust (DR) estimation. We then use a machine learning approach called super learner (SL) to choose among conventional cost models to estimate regression parameters in the DR approach and to choose among various model specifications for PS estimation. Our simulation studies show that when the PS model is correctly specified, weighting and DR perform well. When the PS model is misspecified, the combined approach of DR with SL can still provide unbiased estimates. SL is especially useful when the underlying cost distribution comes from a mixture of different distributions or when the true PS model is unknown. We apply these approaches to a cost analysis of two bladder cancer treatments, cystectomy versus bladder preservation therapy, using SEER‐Medicare data. Copyright © 2015 John Wiley & Sons, Ltd.
17.
The propensity score, the probability of exposure to a specific treatment conditional on observed variables, is increasingly being used in observational studies. Creating strata in which subjects are matched on the propensity score allows one to balance measured variables between treated and untreated subjects. There is an ongoing controversy in the literature as to which variables to include in the propensity score model. Some advocate including those variables that predict treatment assignment, while others suggest including all variables potentially related to the outcome, and still others advocate including only variables that are associated with both treatment and outcome. We provide a case study of the association between drug exposure and mortality to show that including a variable that is related to treatment, but not outcome, does not improve balance and reduces the number of matched pairs available for analysis. In order to investigate this issue more comprehensively, we conducted a series of Monte Carlo simulations of the performance of propensity score models that contained variables related to treatment allocation, or variables that were confounders for the treatment-outcome pair, or variables related to outcome, or all variables related to either outcome or treatment or neither. We compared the use of these different propensity score models in matching and stratification in terms of the extent to which they balanced variables. We demonstrated that all propensity score models balanced measured confounders between treated and untreated subjects in a propensity-score matched sample. However, including only the true confounders or the variables predictive of the outcome in the propensity score model resulted in a substantially larger number of matched pairs than did using the treatment-allocation model.
Stratifying on the quintiles of any propensity score model resulted in residual imbalance between treated and untreated subjects in the upper and lower quintiles. Greater balance between treated and untreated subjects was obtained after matching on the propensity score than after stratifying on the quintiles of the propensity score. When a confounding variable was omitted from any of the propensity score models, matching or stratifying on the propensity score resulted in residual imbalance in prognostically important variables between treated and untreated subjects. We considered four propensity score models for estimating treatment effects: the model that included only true confounders; the model that included all variables associated with the outcome; the model that included all measured variables; and the model that included all variables associated with treatment selection. Reduction in bias when estimating a null treatment effect was equivalent for all four propensity score models when propensity score matching was used. Reduction in bias was marginally greater for the first two propensity score models than for the last two propensity score models when stratification on the quintiles of the propensity score model was employed. Furthermore, omitting a confounding variable from the propensity score model resulted in biased estimation of the treatment effect. Finally, the mean squared error for estimating a null treatment effect was lower when either of the first two propensity score models was used compared to when either of the last two propensity score models was used.
18.
Propensity-score matching is a popular analytic method to estimate the effects of treatments when using observational data. Matching on the propensity score typically requires a pool of potential controls that is larger than the number of treated or exposed subjects. The most common approach to matching on the propensity score is matching without replacement, in which each control subject is matched to at most one treated subject. Failure to find a matched control for each treated subject can lead to “bias due to incomplete matching.” To avoid this bias, it is important to identify a matched control subject for each treated subject. An alternative to matching without replacement is matching with replacement, in which control subjects are allowed to be matched to multiple treated subjects. A limitation to the use of matching with replacement is that variance estimation must account for both the matched nature of the sample and for some control subjects being included in multiple matched sets. While a variance estimator has been proposed for when outcomes are continuous, no such estimator has been proposed for use with time-to-event outcomes, which are common in medical and epidemiological research. We propose a variance estimator for the hazard ratio when matching with replacement. We conducted a series of Monte Carlo simulations to examine the performance of this estimator. We illustrate the utility of matching with replacement to estimate the effect of smoking cessation counseling on survival in smokers discharged from hospital with a heart attack.
19.
There is an increasing interest in the use of propensity score methods to estimate causal effects in observational studies. However, recent systematic reviews have demonstrated that propensity score methods are inconsistently used and frequently poorly applied in the medical literature. In this study, we compared the following propensity score methods for estimating the reduction in all-cause mortality due to statin therapy for patients hospitalized with acute myocardial infarction: propensity-score matching, stratification using the propensity score, covariate adjustment using the propensity score, and weighting using the propensity score. We used propensity score methods to estimate both adjusted treatment effects and the absolute and relative risk reduction in all-cause mortality. We also examined the use of statistical hypothesis testing, standardized differences, box plots, non-parametric density estimates, and quantile-quantile plots to assess residual confounding that remained after stratification or matching on the propensity score. Estimates of the absolute reduction in 3-year mortality ranged from 2.1 to 4.5 per cent, while estimates of the relative risk reduction ranged from 13.3 to 17.0 per cent. Adjusted estimates of the reduction in the odds of 3-year death varied from 15 to 24 per cent across the different propensity score methods.
20.
Clémence Leyrat Agnès Caille Allan Donner Bruno Giraudeau 《Statistics in medicine》2014,33(20):3556-3575
Despite randomization, selection bias may occur in cluster randomized trials. Classical multivariable regression usually allows for adjusting treatment effect estimates with unbalanced covariates. However, for binary outcomes with low incidence, such a method may fail because of separation problems. This simulation study focused on the performance of propensity score (PS)‐based methods to estimate relative risks from cluster randomized trials with binary outcomes with low incidence. The results suggested that among the different approaches used (multivariable regression, direct adjustment on PS, inverse weighting on PS, and stratification on PS), only direct adjustment on the PS fully corrected the bias and moreover had the best statistical properties. Copyright © 2014 John Wiley & Sons, Ltd.