Similar literature
 20 similar documents found
1.
Two‐stage instrumental variable methods are commonly used to estimate the causal effects of treatments on survival in the presence of measured and unmeasured confounding. Two‐stage residual inclusion (2SRI) has been the method of choice over two‐stage predictor substitution (2SPS) in clinical studies. We directly compare the bias in the causal hazard ratio estimated by these two methods. Under a principal stratification framework, we derive a closed‐form solution for asymptotic bias of the causal hazard ratio among compliers for both the 2SPS and 2SRI methods when survival time follows the Weibull distribution with random censoring. When there is no unmeasured confounding and no always takers, our analytic results show that 2SRI is generally asymptotically unbiased, but 2SPS is not. However, when there is substantial unmeasured confounding, 2SPS performs better than 2SRI with respect to bias under certain scenarios. We use extensive simulation studies to confirm the analytic results from our closed‐form solutions. We apply these two methods to prostate cancer treatment data from Surveillance, Epidemiology and End Results‐Medicare and compare these 2SRI and 2SPS estimates with results from two published randomized trials. Copyright © 2015 John Wiley & Sons, Ltd.

2.
Unmeasured confounding is a common concern when researchers attempt to estimate a treatment effect using observational data or randomized studies with nonperfect compliance. To address this concern, instrumental variable methods, such as 2‐stage predictor substitution (2SPS) and 2‐stage residual inclusion (2SRI), have been widely adopted. In many clinical studies of binary and survival outcomes, 2SRI has been accepted as the method of choice over 2SPS, but a compelling theoretical rationale has not been postulated. We evaluate the bias and consistency in estimating the conditional treatment effect for both 2SPS and 2SRI when the outcome is binary, count, or time to event. We demonstrate analytically that the bias in 2SPS and 2SRI estimators can be reframed to mirror the problem of omitted variables in nonlinear models and that there is a direct relationship with the collapsibility of effect measures. In contrast to conclusions made by previous studies (Terza et al, 2008), we demonstrate that the consistency of 2SRI estimators only holds under the following conditions: (1) when the null hypothesis is true; (2) when the outcome model is collapsible; or (3) when estimating the nonnull causal effect from Cox or logistic regression models, the strong and unrealistic assumption that the effect of the unmeasured covariates on the treatment is proportional to their effect on the outcome needs to hold. We propose a novel dissimilarity metric to provide an intuitive explanation of the bias of 2SRI estimators in noncollapsible models and demonstrate that with increasing dissimilarity between the effects of the unmeasured covariates on the treatment versus outcome, the bias of 2SRI increases in magnitude.
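
The two procedures named in this abstract can be sketched for the simplest collapsible case, a linear outcome model, where 2SPS and 2SRI return the same treatment coefficient. This is only an illustrative sketch, not the authors' analysis: the simulated data, the effect size of 0.5, and the helper names `ols` and `two_stage_estimates` are all assumptions made for the example. The binary, count, and time-to-event models studied in the abstract are not covered here.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_estimates(z, x, y):
    """Return (2SPS, 2SRI) treatment-effect estimates for a linear outcome model."""
    ones = np.ones_like(z)
    # Stage 1: regress the treatment x on the instrument z.
    Z = np.column_stack([ones, z])
    x_hat = Z @ ols(Z, x)
    resid = x - x_hat
    # 2SPS: substitute the stage-1 prediction for the observed treatment.
    beta_2sps = ols(np.column_stack([ones, x_hat]), y)[1]
    # 2SRI: keep the observed treatment and add the stage-1 residual as a covariate.
    beta_2sri = ols(np.column_stack([ones, x, resid]), y)[1]
    return beta_2sps, beta_2sri

# Hypothetical data: u is an unmeasured confounder of x and y.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                  # instrument
u = rng.normal(size=n)                  # unmeasured confounder
x = 0.8 * z + u + rng.normal(size=n)    # treatment
y = 0.5 * x + u + rng.normal(size=n)    # outcome, assumed true effect 0.5

naive = ols(np.column_stack([np.ones(n), x]), y)[1]   # confounded estimate
b_2sps, b_2sri = two_stage_estimates(z, x, y)
print(f"naive={naive:.3f}  2SPS={b_2sps:.3f}  2SRI={b_2sri:.3f}")
```

In this linear setting the two estimates agree up to floating-point error, because the stage-1 residual is orthogonal to the stage-1 prediction. The divergence the abstract describes arises only once the second stage is nonlinear.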

3.
An adjustment for an uncorrelated covariate in a logistic regression changes the true value of an odds ratio for a unit increase in a risk factor. Even when there is no variation due to covariates, the odds ratio for a unit increase in a risk factor also depends on the distribution of the risk factor. We can use an instrumental variable to consistently estimate a causal effect in the presence of arbitrary confounding. With a logistic outcome model, we show that the simple ratio or two‐stage instrumental variable estimate is consistent for the odds ratio of an increase in the population distribution of the risk factor equal to the change due to a unit increase in the instrument divided by the average change in the risk factor due to the increase in the instrument. This odds ratio is conditional within the strata of the instrumental variable, but marginal across all other covariates, and is averaged across the population distribution of the risk factor. Where the proportion of variance in the risk factor explained by the instrument is small, this is similar to the odds ratio from a RCT without adjustment for any covariates, where the intervention corresponds to the effect of a change in the population distribution of the risk factor. This implies that the ratio or two‐stage instrumental variable method is not biased, as has been suggested, but estimates a different quantity to the conditional odds ratio from an adjusted multiple regression, a quantity that has arguably more relevance to an epidemiologist or a policy maker, especially in the context of Mendelian randomization. Copyright © 2013 John Wiley & Sons, Ltd.
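
The opening claim of this abstract, that adjusting for a covariate uncorrelated with the risk factor changes the odds ratio, can be checked with a small exact calculation. The sketch below assumes a binary exposure, a binary covariate independent of the exposure, and illustrative coefficients (1.0 and 2.0) that do not come from the paper. The function names are hypothetical.

```python
import math

def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))

def odds(p):
    return p / (1.0 - p)

def marginal_odds_ratio(beta_x, beta_c, intercept=0.0, p_c=0.5):
    """Odds ratio for exposure x after averaging risk over an independent binary covariate c."""
    def risk(x):
        # Average the conditional risk over the distribution of c.
        return sum(w * expit(intercept + beta_x * x + beta_c * c)
                   for c, w in ((0, 1.0 - p_c), (1, p_c)))
    return odds(risk(1)) / odds(risk(0))

conditional_or = math.exp(1.0)                           # OR within each stratum of c
marginal_or = marginal_odds_ratio(beta_x=1.0, beta_c=2.0)
no_covariate_or = marginal_odds_ratio(beta_x=1.0, beta_c=0.0)
print(conditional_or, marginal_or, no_covariate_or)
```

The marginal odds ratio is closer to 1 than the conditional one even though the covariate is not a confounder. When the covariate has no effect on the outcome, the two coincide.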

4.
Causal estimates can be obtained by instrumental variable analysis using a two-stage method. However, these can be biased when the instruments are weak. We introduce a Bayesian method, which adjusts for the first-stage residuals in the second-stage regression and has much improved bias and coverage properties. In the continuous outcome case, this adjustment reduces median bias from weak instruments to close to zero. In the binary outcome case, bias from weak instruments is reduced and the estimand is changed from a marginal population-based effect to a conditional effect. The lack of distributional assumptions on the posterior distribution of the causal effect gives a better summary of uncertainty and more accurate coverage levels than methods that rely on the asymptotic distribution of the causal estimate. We discuss these properties in the context of Mendelian randomization.

5.
Mendelian randomization studies estimate causal effects using genetic variants as instruments. Instrumental variable methods are straightforward for linear models, but epidemiologists often use odds ratios to quantify effects. Also, odds ratios are often the quantities reported in meta‐analyses. Many applications of Mendelian randomization dichotomize genotype and estimate the population causal log odds ratio for unit increase in exposure by dividing the genotype‐disease log odds ratio by the difference in mean exposure between genotypes. This ‘Wald‐type’ estimator is biased even in large samples, but whether the magnitude of bias is of practical importance is unclear. We study the large‐sample bias of this estimator in a simple model with a continuous normally distributed exposure, a single unobserved confounder that is not an effect modifier, and interpretable parameters. We focus on parameter values that reflect scenarios in which we apply Mendelian randomization, including realistic values for the degree of confounding and strength of the causal effect. We evaluate this estimator and the causal odds ratio using numerical integration and obtain approximate analytic expressions to check results and gain insight. A small simulation study examines finite sample bias and mild violations of the normality assumption. For our simple data‐generating model, we find that the Wald estimator is asymptotically biased with a bias of around 10% in fairly typical Mendelian randomization scenarios but which can be larger in more extreme situations. Recently developed methods such as structural mean models require fewer untestable assumptions and we recommend their use when the individual‐level data they require are available. The Wald‐type estimator may retain a role as an approximate method for meta‐analysis based on summary data. Copyright © 2012 John Wiley & Sons, Ltd.
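
The Wald-type estimator described in this abstract divides the genotype-disease log odds ratio by the difference in mean exposure between genotypes. A minimal sketch of that arithmetic follows. The function name `wald_log_odds_ratio` and the nine-person toy data set are invented for illustration and have no connection to the paper's scenarios.

```python
import numpy as np

def log_odds(p):
    return np.log(p / (1.0 - p))

def wald_log_odds_ratio(g, x, d):
    """Wald-type estimate of the causal log odds ratio per unit increase in exposure.

    g: dichotomized genotype (0/1), x: continuous exposure, d: disease status (0/1).
    """
    g = np.asarray(g, dtype=bool)
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    # Numerator: genotype-disease log odds ratio.
    log_or_gd = log_odds(d[g].mean()) - log_odds(d[~g].mean())
    # Denominator: difference in mean exposure between genotypes.
    delta_x = x[g].mean() - x[~g].mean()
    return log_or_gd / delta_x

# Hypothetical toy data: disease risk 0.5 vs 0.2, mean exposure 3 vs 1.
g = [1, 1, 1, 1, 0, 0, 0, 0, 0]
d = [1, 1, 0, 0, 1, 0, 0, 0, 0]
x = [3, 3, 3, 3, 1, 1, 1, 1, 1]
estimate = wald_log_odds_ratio(g, x, d)
print(estimate)
```

Here the genotype-disease log odds ratio is log 4 and the exposure difference is 2, so the estimate is log(4)/2 = log 2. The sketch only reproduces the ratio itself and says nothing about the large-sample bias the abstract quantifies.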

6.
It is well established that odds ratios estimated by logistic regression are subject to bias if exposure is measured with error. The dependence of this bias on exposure parameter values, particularly for multiplicative measurement error, and its implications in epidemiology are not, however, as fully acknowledged. We have been motivated by a German West case-control study on lung cancer and residential radon, where restriction to a subgroup exhibiting larger mean and variance of exposure than the entire group has shown higher odds ratio estimates as compared to the full analysis. By means of correction formulae and simulations, we show that bias from additive classical type error depends on the exposure variance, not on the exposure mean, and that bias from multiplicative classical type error depends on the geometric standard deviation (in other words on the coefficient of variation of exposure), but not on the geometric mean of exposure. Bias from additive or multiplicative Berkson type error is independent of exposure distribution parameters. This indicates that there is a potential of differential bias between groups where these parameters vary. Such groups are commonly compared in epidemiology: for example when the results of subgroup analyses are contrasted or meta-analyses are performed. For the German West radon study, we show that the difference of measurement error bias between the subgroup and the entire group exhibits the same direction but not the same dimension as the observed results. Regarding meta-analysis of five European radon studies, we find that a study such as this German study will necessarily result in smaller odds ratio estimates than other studies due to the smaller exposure variance and coefficient of variation of exposure. Therefore, disregard of measurement error can not only lead to biased estimates, but also to inconsistent results and wrongly concluded effect differences between groups.
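
The statement that bias from additive classical error depends on the exposure variance but not on the exposure mean can be illustrated with the textbook attenuation factor var(X) / (var(X) + var(U)). The sketch below uses a linear outcome model as a simpler stand-in for the logistic regression studied in the abstract, which is a deliberate simplification. The sample size, variances, and function names are assumptions made for the example.

```python
import numpy as np

def attenuation_factor(var_exposure, var_error):
    """Expected ratio of observed to true slope under additive classical error."""
    return var_exposure / (var_exposure + var_error)

def observed_slope(mean_x, sd_x, sd_error, beta=1.0, n=100_000, seed=0):
    """Slope of y on the error-prone exposure w = x + u in simulated data."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mean_x, sd_x, size=n)          # true exposure
    w = x + rng.normal(0.0, sd_error, size=n)     # measured exposure
    y = beta * x + rng.normal(size=n)             # outcome depends on true exposure
    return np.polyfit(w, y, 1)[0]                 # polyfit returns [slope, intercept]

slope_wide = observed_slope(mean_x=5.0, sd_x=2.0, sd_error=1.0)      # larger exposure variance
slope_shifted = observed_slope(mean_x=50.0, sd_x=2.0, sd_error=1.0)  # same variance, other mean
slope_narrow = observed_slope(mean_x=5.0, sd_x=1.0, sd_error=1.0)    # smaller exposure variance
print(slope_wide, attenuation_factor(4.0, 1.0))
print(slope_shifted, attenuation_factor(4.0, 1.0))
print(slope_narrow, attenuation_factor(1.0, 1.0))
```

Shifting the mean leaves the attenuation unchanged, while narrowing the exposure variance pulls the slope further toward zero. This matches the direction of the subgroup pattern described for the radon study.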

7.
Genetic markers can be used as instrumental variables, in an analogous way to randomization in a clinical trial, to estimate the causal relationship between a phenotype and an outcome variable. Our purpose is to extend the existing methods for such Mendelian randomization studies to the context of multiple genetic markers measured in multiple studies, based on the analysis of individual participant data. First, for a single genetic marker in one study, we show that the usual ratio of coefficients approach can be reformulated as a regression with heterogeneous error in the explanatory variable. This can be implemented using a Bayesian approach, which is next extended to include multiple genetic markers. We then propose a hierarchical model for undertaking a meta‐analysis of multiple studies, in which it is not necessary that the same genetic markers are measured in each study. This provides an overall estimate of the causal relationship between the phenotype and the outcome, and an assessment of its heterogeneity across studies. As an example, we estimate the causal relationship of blood concentrations of C‐reactive protein on fibrinogen levels using data from 11 studies. These methods provide a flexible framework for efficient estimation of causal relationships derived from multiple studies. Issues discussed include weak instrument bias, analysis of binary outcome data such as disease risk, missing genetic data, and the use of haplotypes. Copyright © 2010 John Wiley & Sons, Ltd.

8.
This study is motivated by the potential problem of using observational data to draw inferences about treatment outcomes when experimental data are not available. We compare two statistical approaches, ordinary least-squares (OLS) and instrumental variables (IV) regression analysis, to estimate the outcomes (three-year post-treatment survival) of three treatments for early stage breast cancer in elderly women: mastectomy (MST), breast conserving surgery with radiation therapy (BCSRT), and breast conserving surgery only (BCSO). The primary data source was Medicare claims for a national random sample of 2907 women (age 67 or older) with localized breast cancer who were treated between 1992 and 1994. Contrary to randomized clinical trial (RCT) results, analysis with the observational data found highly significant differences in survival among the three treatment alternatives: 79.2% survival for BCSO, 85.3% for MST, and 93.0% for BCSRT. Using OLS to control for the effects of observable characteristics narrowed the estimated survival rate differences, which remained statistically significant. In contrast, the IV analysis estimated survival rate differences that were not significantly different from 0. However, the IV-point estimates of the treatment effects were quantitatively larger than the OLS estimates, unstable, and not significantly different from the OLS results. In addition, both sets of estimates were in the same quantitative range as the RCT results. We conclude that unadjusted observational data on health outcomes of alternative treatments for localized breast cancer should not be used for cost-effectiveness studies.
Our comparisons suggest that whether one places greater confidence in the OLS or the IV results depends on at least three factors: (1) the extent of observable health information that can be used as controls in OLS estimation, (2) the outcomes of statistical tests of the validity of the instrumental variable method, and (3) the similarity of the OLS and IV estimates. In this particular analysis, the OLS estimates appear to be preferable because of the instability of the IV estimates.

9.
Modern epidemiologic studies often aim to evaluate the causal effect of a point exposure on the risk of a disease from cohort or case-control observational data. Because confounding bias is of serious concern in such non-experimental studies, investigators routinely adjust for a large number of potential confounders in a logistic regression analysis of the effect of exposure on disease outcome. Unfortunately, when confounders are not correctly modeled, standard logistic regression is likely biased in its estimate of the effect of exposure, potentially leading to erroneous conclusions. We partially resolve this serious limitation of standard logistic regression analysis with a new iterative approach that we call ProRetroSpective estimation, which carefully combines standard logistic regression with a logistic regression analysis in which exposure is the dependent variable and the outcome and confounders are the independent variables. As a result, we obtain a correct estimate of the exposure-outcome odds ratio, if either the standard logistic regression of the outcome given exposure and confounding factors is correct, or the regression model of exposure given the outcome and confounding factors is correct but not necessarily both, that is, it is double-robust. In fact, it also has certain advantageous efficiency properties. The approach is general in that it applies to both cohort and case-control studies whether the design of the study is matched or unmatched on a subset of covariates. Finally, an application illustrates the methods using data from the National Cancer Institute's Black/White Cancer Survival Study.

10.
Instrumental variable (IV) analysis has been widely used in economics, epidemiology, and other fields to estimate the causal effects of covariates on outcomes, in the presence of unobserved confounders and/or measurement errors in covariates. However, IV methods for time‐to‐event outcome with censored data remain underdeveloped. This paper proposes a Bayesian approach for IV analysis with censored time‐to‐event outcome by using a two‐stage linear model. A Markov chain Monte Carlo sampling method is developed for parameter estimation for both normal and non‐normal linear models with elliptically contoured error distributions. The performance of our method is examined by simulation studies. Our method largely reduces bias and greatly improves coverage probability of the estimated causal effect, compared with the method that ignores the unobserved confounders and measurement errors. We illustrate our method on the Women's Health Initiative Observational Study and the Atherosclerosis Risk in Communities Study. Copyright © 2014 John Wiley & Sons, Ltd.

11.
When modeling the risk of a disease, the very act of selecting the factors to be included can heavily impact the results. This study compares the performance of several variable selection techniques applied to logistic regression. We performed realistic simulation studies to compare five methods of variable selection: (1) a confidence interval (CI) approach for significant coefficients, (2) backward selection, (3) forward selection, (4) stepwise selection, and (5) Bayesian stochastic search variable selection (SSVS) using both informed and uninformed priors. We defined our simulated diseases mimicking odds ratios for cancer risk found in the literature for environmental factors, such as smoking; dietary risk factors, such as fiber; genetic risk factors, such as XPD; and interactions. We modeled the distribution of our covariates, including correlation, after the reported empirical distributions of these risk factors. We also used a null data set to calibrate the priors of the Bayesian method and evaluate its sensitivity. Of the standard methods (95 per cent CI, backward, forward, and stepwise selection) the CI approach resulted in the highest average per cent of correct associations and the lowest average per cent of incorrect associations. SSVS with an informed prior had a higher average per cent of correct associations and a lower average per cent of incorrect associations than the CI approach. This study shows that the Bayesian methods offer a way to use prior information to both increase power and decrease false-positive results when selecting factors to model complex disease risk.

12.
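Objective: To explore, through statistical simulation and analysis of real data, how adjusting for an instrumental variable (IV) in a logistic regression model affects causal effect estimation when unobserved confounders are present. Methods: All variables were assumed to follow binomial distributions, and statistical simulations were run under a series of parameter settings in the logistic regression model, with the bias and standard error of the causal effect estimate as evaluation criteria. The real-data analysis used health check-up follow-up data from the health examination centers of several hospitals in Shandong Province. With hypertension as the target outcome, a longitudinal observational cohort was constructed, the single nucleotide polymorphism (SNP) rs12149832 was selected as the IV, and the relationship between BMI and the risk of hypertension was analyzed in logistic regression models under different strategies (including or excluding rs12149832 as a covariate). Results: The simulations showed that, when a logistic regression model is used to estimate the effect of exposure on outcome, including the IV in the covariate set increases both the bias and the standard error of the effect estimate, although the increase is small. In the real-data analysis, the hypertension cohort included 1,240 women with a baseline age of (37.7±10.5) years and a BMI of (22.1±3.1) kg/m². The effect estimate from the model that included the IV was 0.225 (P<0.001), slightly smaller than that from the regression model without the IV (0.228, P<0.001), which broadly confirmed the simulation results on adjusting for the IV. Conclusion: In observational epidemiological studies, mistakenly including an IV in a logistic regression model affects both the bias and the standard error of the effect estimate.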

13.
Instrumental variable (IV) methods are widely used in the health economics literature to adjust for hidden selection biases in observational studies when estimating treatment effects. Less attention has been paid in the applied literature to the proper use of IVs if treatment effects are heterogeneous across subjects and individuals select treatments based on expected idiosyncratic gains or losses from treatments. In this paper we compare conventional IV analysis with alternative approaches that use IVs to estimate treatment effects in models with response heterogeneity and self-selection. Instead of interpreting IV estimates as the effect of treatment at an unknown margin of patients, we identify the marginal patients and we apply the method of local IVs to estimate the average treatment effect and the effect on the treated on 5-year direct costs of breast-conserving surgery and radiation therapy compared with mastectomy in breast cancer patients. We use a sample from the Outcomes and Preferences in Older Women, Nationwide Survey which is designed to be representative of all female Medicare beneficiaries (aged 67 or older) with newly diagnosed breast cancer between 1992 and 1994. Our results reveal some of the advantages and limitations of conventional and alternative IV methods in estimating mean treatment effect parameters.

14.
Estimation of the effect of one treatment compared to another in the absence of randomization is a common problem in biostatistics. An increasingly popular approach involves instrumental variables, that is, variables that are predictive of who received a treatment yet not directly predictive of the outcome. When treatment is binary, many estimators have been proposed: method-of-moments estimators using a two-stage least-squares procedure, generalized-method-of-moments estimators using two-stage predictor substitution or two-stage residual inclusion procedures, and likelihood-based latent variable approaches. The critical assumptions to the consistency of two-stage procedures and of the likelihood-based procedures differ. Because neither set of assumptions can be completely tested from the observed data alone, comparing the results from the different approaches is an important sensitivity analysis. We provide a general statistical framework for estimation of the causal effect of a binary treatment on a continuous outcome using simultaneous equations to specify models. A comparison of health care costs for adults with schizophrenia treated with newer atypical antipsychotics and those treated with conventional antipsychotic medications illustrates our methods. Surprisingly large differences in the results among the methods are investigated using a simulation study. Several new findings concerning the performance in terms of precision and robustness of each approach in different situations are obtained. We illustrate that in general supplemental information is needed to determine which analysis, if any, is trustworthy and reaffirm that comparing results from different approaches is a valuable sensitivity analysis.

15.
This paper addresses the modelling of missing covariate data with the logistic regression model. The aim of this paper is to evaluate the properties of an efficient score for logistic regression in a two-phase design. Simulation studies show that the efficient score is more efficient than two other pseudo-likelihood methods when the correlation between the missing covariate and its surrogate is high or the sampling proportion is small. These methods are illustrated with data from the National Wilms Tumor Study Group. Results from the example confirm the simulation study findings with the exception that the pseudo-likelihood approach produces more reliable estimates than the weighted pseudo-likelihood approach.

16.
17.
Liu A, Wu C, Yu KF, Gehan E. Statistics in Medicine 2005;24(7):1009-1027
We consider estimation of various probabilities after termination of a group sequential phase II trial. A motivating example is that the stopping rule of a phase II oncologic trial is determined solely based on response to a drug treatment, and at the end of the trial estimating the rate of toxicity and response is desirable. The conventional maximum likelihood estimator (sample proportion) of a probability is shown to be biased, and two alternative estimators are proposed to correct for bias: a bias-reduced estimator obtained by using Whitehead's bias-adjusted approach, and an unbiased estimator from the Rao-Blackwell method of conditioning. All three estimation procedures are shown to have a certain invariance property in bias. Moreover, estimators of a probability and their bias and precision can be evaluated through the observed response rate and the stage at which the trial stops, thus avoiding extensive computation.
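
The bias of the sample proportion after a sequential stopping rule, and its removal by Rao-Blackwell conditioning, can be verified exactly by enumeration for the response rate in a two-stage design. The sketch below is an assumption-laden illustration rather than the paper's procedure. The design (10 patients, stop for futility if at most 1 response, otherwise 19 more), the true rate of 0.2, and all function names are hypothetical. The unbiased estimator is built as the conditional expectation of the first patient's response given the stopping stage and total responses. Whitehead's bias-adjusted estimator and the toxicity endpoint are not implemented.

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def continuing_x1(s, n1, r1, n2):
    """Stage-1 response counts that continue to stage 2 and are compatible with total s."""
    return [x1 for x1 in range(r1 + 1, min(n1, s) + 1) if s - x1 <= n2]

def outcomes(n1, r1, n2, p):
    """Yield (stage, total_responses, probability); the trial stops at stage 1 if x1 <= r1."""
    for x1 in range(r1 + 1):
        yield 1, x1, binom_pmf(x1, n1, p)
    for s in range(r1 + 1, n1 + n2 + 1):
        prob = sum(binom_pmf(x1, n1, p) * binom_pmf(s - x1, n2, p)
                   for x1 in continuing_x1(s, n1, r1, n2))
        yield 2, s, prob

def mle(stage, s, n1, r1, n2):
    """Sample proportion at the point where the trial stopped."""
    return s / n1 if stage == 1 else s / (n1 + n2)

def rao_blackwell(stage, s, n1, r1, n2):
    """E[first patient's response | stage, s]: fraction of sample paths starting with a response."""
    if stage == 1:
        return s / n1
    xs = continuing_x1(s, n1, r1, n2)
    num = sum(comb(n1 - 1, x1 - 1) * comb(n2, s - x1) for x1 in xs)
    den = sum(comb(n1, x1) * comb(n2, s - x1) for x1 in xs)
    return num / den

def bias(estimator, n1, r1, n2, p):
    """Exact bias: expectation of the estimator over all trial outcomes, minus p."""
    return sum(pr * estimator(stage, s, n1, r1, n2)
               for stage, s, pr in outcomes(n1, r1, n2, p)) - p

n1, r1, n2, p = 10, 1, 19, 0.2   # hypothetical design and true response rate
total_prob = sum(pr for _, _, pr in outcomes(n1, r1, n2, p))
mle_bias = bias(mle, n1, r1, n2, p)
rb_bias = bias(rao_blackwell, n1, r1, n2, p)
print(total_prob, mle_bias, rb_bias)
```

Because both estimators depend only on the stopping stage and the observed number of responses, their bias can be computed by this short enumeration, which is in the spirit of the abstract's remark about avoiding extensive computation.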

18.
19.
Motivated by a matched case-control study to investigate potential risk factors for meningococcal disease amongst adolescents, we consider the analysis of matched case-control studies where disease incidence, and possibly other risk factors, vary with time of year. For the cases, the time of infection may be recorded. For controls, however, the recorded time is simply the time of data collection, which is shortly after the time of infection for the matched case, and so depends on the latter. We show that the effect of risk factors and interactions may be adjusted for the time of year effect in a standard conditional logistic regression analysis without introducing any bias. We also show that, if the time delay between data collection for cases and controls is constant, provided this delay is not very short, estimates of the time of year effect are approximately unbiased. In the case that the length of the delay varies over time, the estimate of the time of year effect is biased. We obtain an approximate expression for the degree of bias in this case.

20.
Statistical methods for identifying harmful chemicals in a correlated mixture often assume linearity in exposure-response relationships. Nonmonotonic relationships are increasingly recognized (eg, for endocrine-disrupting chemicals); however, the impact of nonmonotonicity on exposure selection has not been evaluated. In a simulation study, we assessed the performance of Bayesian kernel machine regression (BKMR), Bayesian additive regression trees (BART), Bayesian structured additive regression with spike-slab priors (BSTARSS), generalized additive models with double penalty (GAMDP) and thin plate shrinkage smoothers (GAMTS), multivariate adaptive regression splines (MARS), and lasso penalized regression. We simulated realistic exposure data based on pregnancy exposure to 17 phthalates and phenols in the US National Health and Nutrition Examination Survey using a multivariate copula. We simulated data sets of size N = 250 and compared methods across 32 scenarios, varying by model size and sparsity, signal-to-noise ratio, correlation structure, and exposure-response relationship shapes. We compared methods in terms of their sensitivity, specificity, and estimation accuracy. In most scenarios, BKMR, BSTARSS, GAMDP, and GAMTS achieved moderate to high sensitivity (0.52-0.98) and specificity (0.21-0.99). BART and MARS achieved high specificity (≥0.90), but low sensitivity in low signal-to-noise ratio scenarios (0.20-0.51). Lasso was highly sensitive (0.71-0.99), except for quadratic relationships (≤0.27). Penalized regression methods that assume linearity, such as lasso, may not be suitable for studies of environmental chemicals hypothesized to have nonmonotonic relationships with outcomes. Instead, BKMR, BSTARSS, GAMDP, and GAMTS are attractive methods for flexibly estimating the shapes of exposure-response relationships and selecting among correlated exposures.
