Similar articles
1.
The statistical practice of modeling interaction with two linear main effects and a product term is ubiquitous in the statistical and epidemiological literature. Most data modelers are aware that the misspecification of main effects can potentially cause severe type I error inflation in tests for interactions, leading to spurious detection of interactions. However, modeling practice has not changed. In this article, we focus on the specific situation where the main effects in the model are misspecified as linear terms and characterize its impact on common tests for statistical interaction. We then propose some simple alternatives that fix the issue of potential type I error inflation in testing interaction due to main effect misspecification. We show that when using the sandwich variance estimator for a linear regression model with a quantitative outcome and two independent factors, both the Wald and score tests asymptotically maintain the correct type I error rate. However, if the independence assumption does not hold or the outcome is binary, using the sandwich estimator does not fix the problem. We further demonstrate that flexibly modeling the main effect under a generalized additive model can largely reduce or often remove bias in the estimates and maintain the correct type I error rate for both quantitative and binary outcomes regardless of the independence assumption. We show, under the independence assumption and for a continuous outcome, overfitting and flexibly modeling the main effects does not lead to power loss asymptotically relative to a correctly specified main effect model. Our simulation study further demonstrates the empirical fact that using flexible models for the main effects does not result in a significant loss of power for testing interaction in general. Our results provide an improved understanding of the strengths and limitations for tests of interaction in the presence of main effect misspecification. 
Using data from a large biobank study, the Michigan Genomics Initiative, we present two examples of interaction analysis in support of our results.

2.
In randomized clinical trials with a survival outcome, there has been increasing interest in subgroup identification based on baseline genomic or proteomic markers or clinical characteristics. Some of the existing methods identify subgroups that benefit substantially from the experimental treatment by directly modeling outcomes or treatment effect. When the goal is to find an optimal treatment for a given patient rather than finding the right patient for a given treatment, methods under the individualized treatment regime framework estimate an individualized treatment rule that would lead to the best expected clinical outcome as measured by a value function. Connecting the concept of value function to subgroup identification, we propose a nonparametric method that searches for subgroup membership scores by maximizing a value function that directly reflects the subgroup-treatment interaction effect based on restricted mean survival time. A gradient tree boosting algorithm is proposed to search for the individual subgroup membership scores. We conduct simulation studies to evaluate the performance of the proposed method, and an application to an AIDS clinical trial is performed for illustration.

3.
There is growing interest and investment in precision medicine as a means to provide the best possible health care. A treatment regime formalizes precision medicine as a sequence of decision rules, one per clinical intervention period, that specify if, when and how current treatment should be adjusted in response to a patient's evolving health status. It is standard to define a regime as optimal if, when applied to a population of interest, it maximizes the mean of some desirable clinical outcome, such as efficacy. However, in many clinical settings, a high‐quality treatment regime must balance multiple competing outcomes, e.g., when a high dose is associated with substantial symptom reduction but a greater risk of an adverse event. We consider the problem of estimating the most efficacious treatment regime subject to constraints on the risk of adverse events. We combine nonparametric Q‐learning with policy‐search to estimate a high‐quality yet parsimonious treatment regime. This estimator applies to both observational and randomized data, as well as settings with variable, outcome‐dependent follow‐up, mixed treatment types, and multiple time points. This work is motivated by and framed in the context of dosing for chronic pain; however, the proposed framework can be applied generally to estimate a treatment regime which maximizes the mean of one primary outcome subject to constraints on one or more secondary outcomes. We illustrate the proposed method using data pooled from 5 open‐label flexible dosing clinical trials for chronic pain.

4.
In causal inference, often the interest lies in the estimation of the average causal effect. Other quantities such as the quantile treatment effect may be of interest as well. In this article, we propose a multiply robust method for estimating the marginal quantiles of potential outcomes by achieving mean balance in (a) the propensity score, and (b) the conditional distributions of potential outcomes. An empirical likelihood or entropy measure approach can be utilized for estimation instead of inverse probability weighting, which is known to be sensitive to the misspecification of the propensity score model. Simulation studies are conducted across different scenarios of correctness in both the propensity score models and the outcome models. Both simulation results and theoretical development indicate that our proposed estimator is consistent if any of the models are correctly specified. In the data analysis, we investigate the quantile treatment effect of mothers' smoking status on infants' birthweight.

5.
Because different patients may respond quite differently to the same drug or treatment, there is an increasing interest in discovering individualized treatment rules. In particular, there is an emerging need to find optimal individualized treatment rules, which would lead to the “best” clinical outcome. In this paper, we propose a new class of loss functions and estimators based on robust regression to estimate the optimal individualized treatment rules. Compared to existing estimation methods in the literature, the new estimators are novel and advantageous in the following aspects. First, they are robust against skewed, heterogeneous, heavy-tailed errors or outliers in data. Second, they are robust against a misspecification of the baseline function. Third, under some general situations, the new estimator coupled with the pinball loss approximately maximizes the outcome's conditional quantile instead of the conditional mean, which leads to a more robust optimal individualized treatment rule than the traditional mean-based estimators. Consistency and asymptotic normality of the proposed estimators are established. Their empirical performance is demonstrated via extensive simulation studies and an analysis of an AIDS data set.
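
As a rough illustration of why the pinball loss mentioned in this abstract targets a quantile rather than the mean, the sketch below minimizes the average pinball (check) loss over a constant. It is not the estimator proposed in the paper; the function names and the toy outcomes are hypothetical and chosen only for illustration.

```python
def pinball_loss(residual, tau):
    """Pinball (check) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(values, tau):
    """Return the observed value that minimizes the mean pinball loss.

    Searching over the observed values is enough for this toy case,
    because a minimizer of the piecewise-linear objective lies at a data point.
    """
    def mean_loss(c):
        return sum(pinball_loss(v - c, tau) for v in values) / len(values)
    return min(values, key=mean_loss)


# Hypothetical outcomes with one extreme value.
outcomes = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(outcomes, 0.5)    # tau = 0.5 targets the median
mean_fit = sum(outcomes) / len(outcomes)     # the mean is pulled by the outlier
```

With `tau = 0.5` the minimizer is the sample median, which is unaffected by the extreme value, whereas the mean is not. This is the sense in which quantile-based criteria can be more robust to heavy tails.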

6.
Covariates associated with treatment-effect heterogeneity can potentially be used to make personalized treatment recommendations towards best clinical outcomes. Methods for treatment-selection rule development that directly maximize treatment-selection benefits have attracted much interest in recent years, due to the robustness of these methods to outcome modeling. In practice, the task of treatment-selection rule development can be further complicated by missingness in data. Here, we consider the identification of optimal treatment-selection rules for a binary disease outcome when measurements of an important covariate from study participants are partly missing. Under the missing at random assumption, we develop a robust estimator of treatment-selection rules under the direct-optimization paradigm. This estimator targets the maximum selection benefits to the population under correct specification of at least one mechanism from each of the two sets: missing data or conditional covariate distribution, and treatment assignment or disease outcome model. We evaluate and compare performance of the proposed estimator with alternative direct-optimization estimators through extensive simulation studies. We demonstrate the application of the proposed method through a real data example from an Alzheimer's disease study for developing covariate combinations to guide the treatment of Alzheimer's disease.

7.
Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence; that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate that the model is misspecified. Body mass index data from a national longitudinal study are used to demonstrate the effects of misspecification on potential inferences made in practice. Copyright © 2013 John Wiley & Sons, Ltd.

8.
An extension to the version of the regression calibration estimator proposed by Rosner et al. for logistic and other generalized linear regression models is given for main study/internal validation study designs. This estimator combines the information about the parameter of interest contained in the internal validation study with Rosner et al.'s regression calibration estimate, using a generalized inverse-variance weighted average. It is shown that the validation study selection model can be ignored as long as this model is jointly independent of the outcome and the incompletely observed covariates, conditional, at most, upon the surrogates and other completely observed covariates. In an extensive simulation study designed to follow a complex, multivariate setting in nutritional epidemiology, it is shown that with validation study sizes of 340 or more, this estimator appears to be asymptotically optimal in the sense that it is nearly unbiased and nearly as efficient as a properly specified maximum likelihood estimator. A modification to the regression calibration variance estimator which replaces the standard uncorrected logistic regression coefficient variance with the sandwich estimator to account for the possible misspecification of the logistic regression fit to the surrogate covariates in the main study, was also studied in this same simulation experiment. In this study, the alternative variance formula yielded results virtually identical to the original formula. A version of the proposed estimator is also derived for the case where the reference instrument, available only in the validation study, is imperfect but unbiased at the individual level and contains error that is uncorrelated with other covariates and with error in the surrogate instrument. Replicate measures are obtained in a subset of study participants. 
In this case it is shown that the validation study selection model can be ignored when sampling into the validation study depends, at most, only upon perfectly measured covariates. Two data sets, a study of fever in relation to occupational exposure to antineoplastics among hospital pharmacists and a study of breast cancer incidence in relation to dietary intakes of alcohol and vitamin A, adjusted for total energy intake, from the Nurses' Health Study, were analysed using these new methods. In these data, because the validation studies contained fewer than 200 observations and the events of interest were relatively rare, as is typical, the potential improvements offered by this new estimator were not apparent.

9.
Even in the absence of unmeasured confounding factors or model misspecification, standard methods for estimating the causal effect of a time-varying treatment on the mean of a repeated measures outcome (for example, GEE regression) may be biased when there are time-dependent variables that are simultaneously confounders of the effect of interest and are predicted by previous treatment. In contrast, the recently developed marginal structural models (MSMs) can provide consistent estimates of causal effects when unmeasured confounding and model misspecification are absent. We describe an MSM for repeated measures that parameterizes the marginal means of counterfactual outcomes corresponding to prespecified treatment regimes. The parameters of MSMs are estimated using a new class of estimators: inverse-probability of treatment weighted estimators. We used an MSM to estimate the effect of zidovudine therapy on mean CD4 count among HIV-infected men in the Multicenter AIDS Cohort Study. We estimated a potential expected increase of 5.4 (95 per cent confidence interval -1.8, 12.7) CD4 lymphocytes/l per additional study visit while on zidovudine therapy. We also explain the theory and implementation of MSMs for repeated measures data and draw upon a simple example to illustrate the basic ideas.
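
The weighting idea behind inverse-probability-of-treatment estimators can be sketched in a much simpler setting than the one in this abstract: a single treatment decision and one binary confounder, rather than time-varying treatment with repeated measures. The toy data, variable names, and helper functions below are hypothetical; the outcome is constructed as Y = 2*A + 3*L so that the true treatment effect is 2.

```python
from collections import Counter

# Hypothetical toy data: (L, A, Y) = (confounder, treatment, outcome),
# built so that Y = 2*A + 3*L and treatment is more likely when L = 1.
data = ([(0, 1, 2.0)] + [(0, 0, 0.0)] * 3
        + [(1, 1, 5.0)] * 3 + [(1, 0, 3.0)])


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def iptw_difference(rows):
    """Weighted difference using weights 1 / P(A = a | L = l).

    The treatment probabilities are estimated by cell frequencies,
    and each arm's mean is normalized by the sum of its weights.
    """
    n_l = Counter(l for l, _, _ in rows)
    n_la = Counter((l, a) for l, a, _ in rows)

    def weighted_mean(arm):
        num = den = 0.0
        for l, a, y in rows:
            if a == arm:
                w = n_l[l] / n_la[(l, a)]  # inverse of estimated P(A=a | L=l)
                num += w * y
                den += w
        return num / den

    return weighted_mean(1) - weighted_mean(0)


naive = naive_difference(data)
weighted = iptw_difference(data)
```

In this constructed example the unadjusted contrast mixes the treatment effect with the effect of the confounder, while the weighted contrast recovers the value built into the data. A marginal structural model for repeated measures would use products of such weights over time, which this sketch does not attempt.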

10.
Dynamic treatment regimes operationalize precision medicine as a sequence of decision rules, one per stage of clinical intervention, that map up-to-date patient information to a recommended intervention. An optimal treatment regime maximizes the mean utility when applied to the population of interest. Methods for estimating an optimal treatment regime assume the data to be fully observed, which rarely occurs in practice. A common approach is to first use multiple imputation and then pool the estimators across imputed datasets. However, this approach requires estimating the joint distribution of patient trajectories, which can be high-dimensional, especially when there are multiple stages of intervention. We examine the application of inverse probability weighted estimating equations as an alternative to multiple imputation in the context of monotonic missingness. This approach applies to a broad class of estimators of an optimal treatment regime including both Q-learning and a generalization of outcome weighted learning. We establish consistency under mild regularity conditions and demonstrate its advantages in finite samples using a series of simulation experiments and an application to a schizophrenia study.

11.
12.
Burden analysis in public health often involves the estimation of exposure‐attributable fractions from observed time series. When the entire population is exposed, the association between the exposure and outcome must be carefully modelled before the attributable fractions can be estimated. This article derives asymptotic convergences for the estimation of attributable fractions for commonly used time series models (ARMAX, Poisson, negative binomial, and Serfling), using for the most part the delta method. For the Poisson regression, the estimation of the attributable fraction is achieved by a Monte Carlo algorithm, taking into account both an estimation and a prediction error. A simulation study compares these estimations in the case of an epidemic exposure and highlights the importance of thorough analysis of the data: When the outcome is generated under an additive model, the additive models are satisfactory, and the multiplicative models are poor, and vice versa. However, the Serfling model performs poorly in all cases. Of note, a misspecification in the form or delay of the association between the exposure and the outcome leads to mediocre estimation of the attributable fraction. An application to the fraction of French outpatient antibiotic use attributable to influenza between 2003 and 2010 illustrates the asymptotic convergences. This study suggests that the Serfling model should be avoided when estimating attributable fractions while the model of choice should be selected after careful investigation of the association between the exposure and outcome.

13.
Equal randomization has been a popular choice in clinical trial practice. However, in trials with heterogeneous variances and/or variable treatment costs, as well as in settings where maximization of every trial participant's benefit is an important design consideration, optimal allocation proportions may be unequal across study treatment arms. In this paper, we investigate optimal allocation designs minimizing study cost under statistical efficiency constraints for parallel group clinical trials comparing several investigational treatments against the control. We show theoretically that equal allocation designs may be suboptimal, and unequal allocation designs can provide higher statistical power for the same budget or result in a smaller cost for the same level of power. We also show how optimal allocation can be implemented in practice by means of restricted randomization procedures and how to perform statistical inference following these procedures, using invoked population-based or randomization-based approaches. Our results provide further support to some previous findings in the literature that unequal randomization designs can be cost efficient and can be successfully implemented in practice. We conclude that the choice of the target allocation, the randomization procedure, and the statistical methodology for data analysis is an essential component in ensuring valid, powerful, and robust clinical trial results.
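
To make the claim about unequal allocation concrete, the sketch below uses the textbook two-arm case, not the multi-arm designs studied in the paper: minimizing the variance of a difference in means under a fixed budget gives sample sizes proportional to the standard deviation divided by the square root of the per-subject cost. The function names, standard deviations, costs, and budget are all assumptions made for illustration.

```python
import math


def allocation_proportions(sigmas, costs):
    """Two-arm allocation proportions with n_i proportional to sigma_i / sqrt(c_i)."""
    weights = [s / math.sqrt(c) for s, c in zip(sigmas, costs)]
    total = sum(weights)
    return [w / total for w in weights]


def variance_of_difference(sigmas, costs, proportions, budget):
    """Variance of the difference in arm means when the whole budget is spent."""
    cost_per_subject = sum(c * p for c, p in zip(costs, proportions))
    n_total = budget / cost_per_subject
    return sum(s ** 2 / (p * n_total) for s, p in zip(sigmas, proportions))


# Hypothetical design inputs: arm 2 is twice as variable, costs are equal.
sigmas, costs, budget = [1.0, 2.0], [1.0, 1.0], 300.0

optimal = allocation_proportions(sigmas, costs)
var_optimal = variance_of_difference(sigmas, costs, optimal, budget)
var_equal = variance_of_difference(sigmas, costs, [0.5, 0.5], budget)
```

Under these assumed inputs, about one third of subjects go to the less variable arm, and the resulting variance is smaller than under 1:1 allocation with the same budget. Sample sizes are treated as continuous here, so a real design would still need rounding and a randomization procedure that targets the unequal ratio.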

14.
In clinical studies with time‐to‐event as a primary endpoint, one main interest is to find the best treatment strategy to maximize patients' mean survival time. Due to patients' heterogeneity in response to treatments, great efforts have been devoted to developing optimal treatment regimes by integrating individuals' clinical and genetic information. A main challenge arises in the selection of important variables that can help to build reliable and interpretable optimal treatment regimes as the dimension of predictors may be high. In this paper, we propose a robust loss‐based estimation framework that can be easily coupled with shrinkage penalties for both estimation of optimal treatment regimes and variable selection. The asymptotic properties of the proposed estimators are studied. Moreover, a model‐free estimator of restricted mean survival time under the derived optimal treatment regime is developed, and its asymptotic property is studied. Simulations are conducted to assess the empirical performance of the proposed method for parameter estimation, variable selection, and optimal treatment decision. An application to an AIDS clinical trial data set is given to illustrate the method. Copyright © 2014 John Wiley & Sons, Ltd.

15.
In medical therapies involving multiple stages, a physician's choice of a subject's treatment at each stage depends on the subject's history of previous treatments and outcomes. The sequence of decisions is known as a dynamic treatment regime or treatment policy. We consider dynamic treatment regimes in settings where each subject's final outcome can be defined as the sum of longitudinally observed values, each corresponding to a stage of the regime. Q‐learning, which is a backward induction method, is used to first optimize the last stage treatment, then sequentially optimize each previous stage treatment until the first stage treatment is optimized. During this process, model‐based expectations of outcomes of late stages are used in the optimization of earlier stages. When the outcome models are misspecified, bias can accumulate from stage to stage and become severe, especially when the number of treatment stages is large. We demonstrate that a modification of standard Q‐learning can help reduce the accumulated bias. We provide a computational algorithm, estimators, and closed‐form variance formulas. Simulation studies show that the modified Q‐learning method has a higher probability of identifying the optimal treatment regime even in settings with misspecified models for outcomes. It is applied to identify optimal treatment regimes in a study for advanced prostate cancer and to estimate and compare the final mean rewards of all the possible discrete two‐stage treatment sequences. Copyright © 2015 John Wiley & Sons, Ltd.
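
The backward induction described in this abstract can be sketched for two stages with standard Q‐learning; this is not the modified method the paper proposes. The sketch assumes binary treatments, a final outcome equal to the sum of the two stage outcomes, and a saturated "model" (cell means) in place of regression. The toy data and all names are hypothetical.

```python
from collections import defaultdict


def cell_means(rows, key, value):
    """Average value(row) within each cell key(row); a saturated stand-in for a Q-model."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[key(r)] += value(r)
        counts[key(r)] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Hypothetical toy data: (a1, y1, a2, y2), treatments coded 0/1.
data = [
    (0, 1.0, 0, 1.0), (0, 1.0, 1, 3.0),
    (1, 2.0, 0, 4.0), (1, 2.0, 1, 1.0),
]

# Stage 2: Q2(a1, a2) = E[y2 | a1, a2]; pick the best a2 for each history a1.
q2 = cell_means(data, key=lambda r: (r[0], r[2]), value=lambda r: r[3])
best_a2 = {a1: max((0, 1), key=lambda a2: q2[(a1, a2)]) for a1 in (0, 1)}

# Stage 1: the pseudo-outcome adds the stage-1 value to the best achievable
# stage-2 value, so the stage-1 choice accounts for optimal later treatment.
q1 = cell_means(
    data,
    key=lambda r: r[0],
    value=lambda r: r[1] + q2[(r[0], best_a2[r[0]])],
)
best_a1 = max((0, 1), key=lambda a1: q1[a1])
```

Because the stage-1 pseudo-outcome is built from the fitted stage-2 values, any error in the stage-2 model is carried backward. That is the mechanism by which bias can accumulate across stages when the outcome models are misspecified.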

16.
Individualized treatment rules, or rules for altering treatments over time in response to changes in individual covariates, are of primary importance in the practice of clinical medicine. Several statistical methods aim to estimate the rule, termed an optimal dynamic treatment regime, which will result in the best expected outcome in a population. In this article, we discuss estimation of an alternative type of dynamic regime: the statically optimal treatment rule. History-adjusted marginal structural models (HA-MSM) estimate individualized treatment rules that assign, at each time point, the first action of the future static treatment plan that optimizes expected outcome given a patient's covariates. However, as we discuss here, HA-MSM-derived rules can depend on the way in which treatment was assigned in the data from which the rules were derived. We discuss the conditions sufficient for treatment rules identified by HA-MSM to be statically optimal, or in other words, to select the optimal future static treatment plan at each time point, regardless of the way in which past treatment was assigned. The resulting treatment rules form appropriate candidates for evaluation using randomized controlled trials. We demonstrate that a history-adjusted individualized treatment rule is statically optimal if it depends on a set of covariates that are sufficient to control for confounding of the effect of past treatment history on outcome. Methods and results are illustrated using an example drawn from the antiretroviral treatment of patients infected with HIV. Specifically, we focus on rules for deciding when to modify the treatment of patients infected with resistant virus.

17.
Epidemiologic research often aims to estimate the association between a binary exposure and a binary outcome, while adjusting for a set of covariates (e.g., confounders). When data are clustered, as in, for instance, matched case-control studies and co-twin-control studies, it is common to use conditional logistic regression. In this model, all cluster-constant covariates are absorbed into a cluster-specific intercept, whereas cluster-varying covariates are adjusted for by explicitly adding these as explanatory variables to the model. In this paper, we propose a doubly robust estimator of the exposure-outcome odds ratio in conditional logistic regression models. This estimator protects against bias in the odds ratio estimator due to misspecification of the part of the model that contains the cluster-varying covariates. The doubly robust estimator uses two conditional logistic regression models for the odds ratio, one prospective and one retrospective, and is consistent for the exposure-outcome odds ratio if at least one of these models is correctly specified, not necessarily both. We demonstrate the properties of the proposed method by simulations and by re-analyzing a publicly available dataset from a matched case-control study on induced abortion and infertility.

18.
The inverse probability weighted estimator is often applied to two-phase designs and regression with missing covariates. Inverse probability weighted estimators typically are less efficient than likelihood-based estimators but, in general, are more robust against model misspecification. In this paper, we propose a best linear inverse probability weighted estimator for two-phase designs and missing covariate regression. Our proposed estimator is the projection of the SIPW onto the orthogonal complement of the score space based on a working regression model of the observed covariate data. The efficiency gain is from the use of the association between the outcome variable and the available covariates, which is the working regression model. One advantage of the proposed estimator is that there is no need to calculate the augmented term of the augmented weighted estimator. The estimator can be applied to general missing data problems or two-phase design studies in which the second phase data are obtained in a subcohort. The method can also be applied to secondary trait case-control genetic association studies. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via extensive simulation studies. The methods are applied to a bladder cancer case-control study.

19.
Establishing and characterizing exposure-biomarker relationships is an important problem in molecular epidemiology. The problem is difficult due to several complicating features, namely, the biomarker response is a nonlinear function of exposure and unknown parameters; variation in exposure and biomarker levels occurs both within and between subjects; and errors tend to be heteroscedastic. To overcome some of the statistical challenges in analysing such data, it is common for the investigator to make several assumptions about the data structure. For example, it is common to assume that taking the natural logarithm of right-skewed biomarker measurements leads to homoscedasticity and normality, so that the effect of outliers is minimized and Gauss-Markov theory is applicable. In this paper, we compare a lognormal maximum likelihood estimator (MLE) to generalized estimating equations (GEE) for drawing statistical inference in a nonlinear model of a benzene biomarker (benzene oxide-albumin adducts) as a function of benzene exposure. We explore the characteristic properties of the lognormal MLE under a certain type of model misspecification and compare its small sample performance to the estimating equation approach in simulation studies. We show that the multiplicative lognormal model can lead to severe biases for modest deviations from the true outcome (biomarker) distribution. Furthermore, the lognormal MLE can exhibit very poor small sample properties even under the true model. All methods are applied in a novel data analysis from a study of benzene-exposed workers in China.

20.
When multiple treatment alternatives are available for a disease, an obvious question is which alternative is most effective for which patient. One may address this question by searching for optimal treatment regimes that specify for each individual the preferable treatment alternative based on that individual's baseline characteristics. When such a regime has been estimated, its quality (in terms of the expected outcome if it were used for treatment assignment of all patients in the population under study) is of obvious interest. Obtaining a good and reliable estimate of this quantity is a key challenge for which so far no satisfactory solution is available. In this paper, we consider for this purpose several estimators of the expected outcome in conjunction with several resampling methods. The latter have been evaluated before within the context of statistical learning to estimate the prediction error of estimated prediction rules. Yet, the results of these evaluations were equivocal, with different best performing methods in different studies, and with near-zero and even negative correlations between true and estimated prediction errors. Moreover, for different reasons, it is not straightforward to extrapolate the findings of these studies to the context of optimal treatment regimes. To address these issues, we set up a new and comprehensive simulation study. In this study, combinations of different estimators with .632+ and out-of-bag bootstrap resampling methods performed best. In addition, the study shed a surprising new light on the previously reported problematic correlations between true and estimated prediction errors in the area of statistical learning.
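
For readers unfamiliar with the .632+ rule named in this abstract, the sketch below shows how it is usually defined for prediction error: a weighted combination of the apparent (training) error and the out-of-bag bootstrap error, with the weight increased as the relative overfitting rate grows. It takes the three error rates as given rather than computing them from data, it is stated for prediction error rather than for the expected outcome of a treatment regime, and the function and argument names are hypothetical.

```python
def err_632_plus(err_train, err_oob, no_info_rate):
    """Combine apparent and out-of-bag error using the .632+ weighting.

    err_train    : apparent error on the training data
    err_oob      : average error on out-of-bag observations
    no_info_rate : error expected if predictors and outcome were unrelated
    """
    # The out-of-bag error is capped at the no-information rate.
    err_oob = min(err_oob, no_info_rate)

    # Relative overfitting rate, between 0 (none) and 1 (maximal).
    if err_oob > err_train and no_info_rate > err_train:
        overfit = (err_oob - err_train) / (no_info_rate - err_train)
    else:
        overfit = 0.0

    # The weight runs from 0.632 (no overfitting) up to 1 (maximal overfitting).
    weight = 0.632 / (1.0 - 0.368 * overfit)
    return (1.0 - weight) * err_train + weight * err_oob


no_overfit = err_632_plus(0.1, 0.1, 0.5)    # reduces to the common error rate
full_overfit = err_632_plus(0.0, 0.5, 0.5)  # falls back to the out-of-bag error
moderate = err_632_plus(0.1, 0.3, 0.5)
```

With no overfitting the rule coincides with the plain .632 estimator, and with maximal overfitting it relies entirely on the out-of-bag error. Intermediate cases fall between the plain .632 value and the out-of-bag error.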
