Similar literature (20 results)
1.
The estimation of causal effects has been the subject of extensive research. In unconfounded studies with a dichotomous outcome, Y, Cangul, Chretien, Gutman and Rubin (2009) demonstrated that logistic regression for a scalar continuous covariate X is generally statistically invalid for testing null treatment effects when the distributions of X in the treated and control populations differ and the logistic model for Y given X is misspecified. In addition, they showed that an approximately valid statistical test can be generally obtained by discretizing X followed by regression adjustment within each interval defined by the discretized X. This paper extends the work of Cangul et al. 2009 in three major directions. First, we consider additional estimation procedures, including a new one that is based on two independent splines and multiple imputation; second, we consider additional distributional factors; and third, we examine the performance of the procedures when the treatment effect is non‐null. Of all the methods considered and in most of the experimental conditions that were examined, our proposed new methodology appears to work best in terms of point and interval estimation. Copyright © 2012 John Wiley & Sons, Ltd.  相似文献   
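A minimal sketch of interval-based covariate adjustment of the kind studied above: discretize X into quantile strata, regression-adjust within each stratum, and pool across strata. The stratum count, variable names, and simulated data are assumptions for illustration only; this is not the authors' spline/multiple-imputation estimator.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def stratified_adjustment(y, z, x, n_strata=5):
    """Estimate the treatment effect on a binary outcome y by discretizing the
    continuous covariate x into quantile strata and regression-adjusting within
    each stratum; z is the binary treatment indicator.  Returns a stratum-size
    weighted average of the stratum-specific log odds ratios."""
    strata = pd.qcut(x, q=n_strata, labels=False, duplicates="drop")
    effects, weights = [], []
    for s in np.unique(strata):
        idx = strata == s
        fit = sm.Logit(y[idx], sm.add_constant(z[idx])).fit(disp=False)
        effects.append(fit.params[1])   # log odds ratio for treatment in stratum s
        weights.append(idx.sum())
    return np.average(effects, weights=weights)

# Simulated example: the distribution of x differs between arms, the logistic
# model in x is misspecified, and the true treatment effect is null.
rng = np.random.default_rng(0)
n = 2000
z = rng.integers(0, 2, n)
x = rng.normal(0.5 * z, 1.0, n)
y = rng.binomial(1, 1 / (1 + np.exp(-np.sin(2 * x))))
print(stratified_adjustment(y, z, x))
```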

2.
A biomarker (S) measured after randomization in a clinical trial can often provide information about the true endpoint (T) and hence the effect of treatment (Z). It can usually be measured earlier and more easily than T and as such may be useful to shorten the trial length. A potential use of S is to completely replace T as a surrogate endpoint to evaluate whether the treatment is effective. Another potential use of S is to serve as an auxiliary variable to help provide information and improve the inference on the treatment effect prediction when T is not completely observed. The objective of this report is to focus on its role as an auxiliary variable and to identify situations when S can be useful to increase efficiency in predicting the treatment effect in a new trial in a multiple-trial setting. Both S and T are continuous. We find that higher efficiency gain is associated with higher trial-level correlation, but not individual-level correlation, when only S and not T is measured in a new trial; even so, the amount of information recovered from S is usually negligible. However, when T is partially observed in the new trial and the individual-level correlation is relatively high, there is substantial efficiency gain by using S. For design purposes, our results suggest that it is often important to collect markers that have high adjusted individual-level correlation with T and at least a small amount of data on T. The results are illustrated using simulations and an example from a glaucoma clinical trial. Copyright © 2010 John Wiley & Sons, Ltd.

3.
In many clinical settings, improving patient survival is of interest but a practical surrogate, such as time to disease progression, is instead used as a clinical trial's primary endpoint. A time‐to‐first endpoint (e.g., death or disease progression) is commonly analyzed but may not be adequate to summarize patient outcomes if a subsequent event contains important additional information. We consider a surrogate outcome very generally as one correlated with the true endpoint of interest. Settings of interest include those where the surrogate indicates a beneficial outcome so that the usual time‐to‐first endpoint of death or surrogate event is nonsensical. We present a new two‐sample test for bivariate, interval‐censored time‐to‐event data, where one endpoint is a surrogate for the second, less frequently observed endpoint of true interest. This test examines whether patient groups have equal clinical severity. If the true endpoint rarely occurs, the proposed test acts like a weighted logrank test on the surrogate; if it occurs for most individuals, then our test acts like a weighted logrank test on the true endpoint. If the surrogate is a useful statistical surrogate, our test can have better power than tests based on the surrogate that naively handles the true endpoint. In settings where the surrogate is not valid (treatment affects the surrogate but not the true endpoint), our test incorporates the information regarding the lack of treatment effect from the observed true endpoints and hence is expected to have a dampened treatment effect compared with tests based on the surrogate alone. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.  相似文献   

4.
Wu Z, He P, Geng Z. Statistics in Medicine 2011, 30(19): 2422-2434
In this paper, we show that Prentice's criteria for surrogates can be strengthened to remedy some weaknesses of the criteria, and we also propose some sufficient conditions under which the treatment effects on a surrogate and on the true endpoint have qualitative implication and equivalence relations. With or without requiring Prentice's criteria, we discuss what conditions are required to qualitatively assess the causal effect of treatment on an unobserved endpoint in terms of an observed surrogate. Rather than a correlation between a surrogate and an endpoint, we require stricter measurements of association for the qualitative assessment. Further we show that these conditions can be satisfied by commonly used models, such as generalized linear models and Cox's proportional hazard models.  相似文献   

5.
Two paradigms for the evaluation of surrogate markers in randomized clinical trials have been proposed: the causal effects paradigm and the causal association paradigm. Each of these paradigms relies on assumptions that must be made to proceed with estimation and to validate a candidate surrogate marker (S) for the true outcome of interest (T). We consider the setting in which S and T are Gaussian and are generated from structural models that include an unobserved confounder. Under the assumed structural models, we relate the quantities used to evaluate surrogacy within both the causal effects and causal association frameworks. We review some of the common assumptions made to aid in estimating these quantities and show that assumptions made within one framework can imply strong assumptions within the alternative framework. We demonstrate that there is a similarity, but not an exact correspondence, between the quantities used to evaluate surrogacy within each framework, and show that the conditions for identifiability of the surrogacy parameters are different from the conditions that lead to a correspondence of these quantities.

6.
Although the P value from a Wilcoxon-Mann-Whitney test is often used with randomized experiments, it is rarely accompanied by a causal effect estimate and its confidence interval. The natural parameter for the Wilcoxon-Mann-Whitney test is the Mann-Whitney parameter, φ, which measures the probability that a randomly selected individual in the treatment arm will have a larger response than a randomly selected individual in the control arm (plus an adjustment for ties). We show that the Mann-Whitney parameter may be framed as a causal parameter and show that it is not equal to a closely related and nonidentifiable causal effect, ψ, the probability that a randomly selected individual will have a larger response under treatment than under control (plus an adjustment for ties). We review the paradox, first expressed by Hand, that the ψ parameter may imply that the treatment is worse (or better) than control, while the Mann-Whitney parameter shows the opposite. Unlike the Mann-Whitney parameter, ψ is nonidentifiable from a randomized experiment. We review some nonparametric assumptions that rule out Hand's paradox through bounds on ψ and use bootstrap methods to make inferences on those bounds. We explore the relationship of the proportional odds parameter to Hand's paradox, showing that the paradox may occur for proportional odds parameters between 1/9 and 9. Thus, large effects are needed to ensure that if treatment appears better by the Mann-Whitney parameter, then treatment improves responses in most individuals. We demonstrate these issues using a vaccine trial.
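Under the usual definition, the Mann-Whitney parameter is φ = P(X_T > X_C) + ½P(X_T = X_C), which can be estimated by comparing all treatment-control pairs. A minimal sketch with toy data (the sample values and function name are assumptions; the paper's causal framing and interval methods are not reproduced):

```python
import numpy as np

def mann_whitney_parameter(treated, control):
    """Estimate phi = P(T > C) + 0.5 * P(T = C): the probability that a randomly
    chosen treated response exceeds a randomly chosen control response, with an
    adjustment for ties."""
    t = np.asarray(treated, dtype=float)[:, None]
    c = np.asarray(control, dtype=float)[None, :]
    wins = (t > c).mean()
    ties = (t == c).mean()
    return wins + 0.5 * ties

# Toy ordinal responses from a vaccine-style trial (made-up numbers)
treated = [3, 4, 4, 5, 2, 4, 5, 3]
control = [2, 3, 1, 4, 2, 3, 2, 1]
print(mann_whitney_parameter(treated, control))   # values > 0.5 favour treatment
```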

7.
Several methods have been developed for the evaluation of surrogate endpoints within the causal‐inference and meta‐analytic paradigms. In both paradigms, much effort has been made to assess the capacity of the surrogate to predict the causal treatment effect on the true endpoint. In the present work, the so‐called surrogate predictive function (SPF) is introduced for that purpose, using potential outcomes. The relationship between the SPF and the individual causal association, a new metric of surrogacy recently proposed in the literature, is studied in detail. It is shown that the SPF, in conjunction with the individual causal association, can offer an appealing quantification of the surrogate predictive value. However, neither the distribution of the potential outcomes nor the SPF are identifiable from the data. These identifiability issues are tackled using a two‐step procedure. In the first step, the region of the parametric space of the distribution of the potential outcomes, compatible with the data at hand, is geometrically characterized. Further, in a second step, a Monte Carlo approach is used to study the behavior of the SPF on the previous region. The method is illustrated using data from a clinical trial involving schizophrenic patients and a newly developed and user friendly R package Surrogate is provided to carry out the validation exercise. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   

8.
In a binary model relating a response variable Y to a risk factor X, account may need to be taken of an extraneous effect Z that is related to X, but not Y. This is known as the association pattern Y–X–Z. The extraneous variable Z is commonly included in models as a covariate. This paper concerns binary models, and investigates the use of deviation from the group mean (D-GM) and deviation from the fitted fractional polynomial value (D-FP) for removing the extraneous effect of Z. In a simulation study, D-FP performed excellently, while the performance of D-GM was slightly worse than the traditional method of treating Z as a covariate. In addition, estimators with excessive mean square errors or standard errors cannot occur when D-GM or D-FP is employed, even in small or sparse data sets. The Y–X–Z association pattern studied here often occurs in fetal studies, where the fetal measurement (X) varies with the gestational age (Z), but gestational age does not relate to the outcome variable (Y; e.g. Down's syndrome). D-GM and D-FP perform well with illustrative data from fetal studies, although there is a weak association between X and Z with a lower proportion of case subjects (e.g. 11:1, control to case). It is not necessary to add a new covariate when a model deals with the extraneous effect in this way. The D-FP and D-GM methods perform well with the real data studied here, and moreover, D-FP demonstrated excellent performance in simulations. Copyright © 2009 John Wiley & Sons, Ltd.
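A minimal sketch of the D-GM adjustment described above, assuming Z defines (or can be grouped into) discrete categories; the simulated fetal-study-style data and variable names are illustrative, and the fractional-polynomial variant D-FP is not shown:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fit_dgm(y, x, z_group):
    """Logistic model of y on the deviation of x from its within-Z-group mean
    (D-GM), removing the extraneous Z-X association without adding Z as a
    covariate."""
    df = pd.DataFrame({"y": y, "x": x, "g": z_group})
    df["dgm"] = df["x"] - df.groupby("g")["x"].transform("mean")
    return sm.Logit(df["y"], sm.add_constant(df["dgm"])).fit(disp=False)

# Fetal-study style example: x varies with gestational age z, but z itself is
# unrelated to the binary outcome y.
rng = np.random.default_rng(1)
n = 1500
z_week = rng.integers(12, 20, n)                  # gestational age groups
x = 2.0 * z_week + rng.normal(0.0, 1.0, n)        # measurement grows with z
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.5 + 0.8 * (x - 2.0 * z_week)))))
print(fit_dgm(y, x, z_week).params)
```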

9.
In a NASA ground study, two forms of cognitive tests were evaluated in terms of their sensitivity to sleepiness induced by the drug promethazine (PMZ). Performance for the two test modes (Y1 and Y2), PMZ concentration, and self-reported sleepiness using the Karolinska Sleepiness Scale (KSS) were monitored for 12 h post dose. A problem arises when using KSS to establish an association between true sleepiness and performance because KSS scores are discrete and also because they tend to concentrate on certain values. Therefore, we define a latent sleepiness measure X* as an unobserved continuous random variable describing a subject's actual state of sleepiness. Under the assumption that drug concentration affects X*, which then affects Y1, Y2, and KSS, we use Bayesian methods to estimate joint equations that permit unbiased comparison of the performance measures' sensitivity to X*. The equations incorporate subject random effects and include a negativity constraint on subject-specific slopes of performance with respect to sleepiness. Published in 2010 by John Wiley & Sons, Ltd.

10.
Randomized Phase II or Phase III clinical trials that are powered based on clinical endpoints, such as survival time, may be prohibitively expensive, in terms of both the time required for their completion and the number of patients required. As such, surrogate endpoints, such as objective tumour response or markers including prostate specific antigen or CA-125, have gained widespread popularity in clinical trials. If an improvement in a surrogate endpoint does not itself confer patient benefit, then consideration must be given to the extent to which improvement in a surrogate endpoint implies improvement in the true clinical endpoint of interest. That this is not a trivial issue is demonstrated by the results of an NIH-sponsored trial of anti-arrhythmic drugs, in which the ability to correct an irregular heart beat not only did not correspond to a survival benefit but in fact led to excess mortality. One approach to the validation of surrogate endpoints involves ensuring that a valid between-group analysis of the surrogate endpoint constitutes also a valid analysis of the true clinical endpoint. The Prentice criterion is a set of conditions that essentially specify the conditional independence of the impact of treatment on the true endpoint, given the surrogate endpoint. It is shown that this criterion alone ensures that an observed effect of the treatment on the true endpoint implies a treatment effect also on the surrogate endpoint, but contrary to popular belief, it does not ensure the converse, specifically that the observation of a significant treatment effect on the surrogate endpoint can be used to infer a treatment effect on the true endpoint.  相似文献   
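The conditional-independence condition underlying the Prentice criterion can be written compactly as below, with Z denoting treatment, S the surrogate, and T the true endpoint (this notation is assumed for illustration and follows the standard statement of the criterion rather than any particular formulation in the paper):

```latex
% Given the surrogate S, treatment Z carries no further information about T:
\[
  f(T \mid S, Z) = f(T \mid S),
\]
% together with the requirements that treatment affects both endpoints and
% that the surrogate is prognostic for the true endpoint:
\[
  f(S \mid Z) \neq f(S), \qquad
  f(T \mid Z) \neq f(T), \qquad
  f(T \mid S) \neq f(T).
\]
```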

11.
We address the problem of testing for independence between X and Y in two situations. In the first we assume that the joint distribution of X and Y is unknown but the observations on X and Y are identifiable. In the second case we assume that the distribution of (X, Y) is exchangeable. Here we consider both when (X, Y) are identifiable and when they are not. We illustrate applications to the testing of independence in DNA databases and in twin studies.  相似文献   

12.
In the past decade, many genome‐wide association studies (GWASs) have been conducted to explore association of single nucleotide polymorphisms (SNPs) with complex diseases using a case‐control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but also collect extensive data on several risk factors and traits. Recent literature and grant proposals point toward a trend in reusing existing large case‐control data for exploring genetic associations of some additional traits (secondary phenotypes, Y ) collected during the study. These secondary phenotypes may be correlated, and a proper analysis warrants a multivariate approach. Commonly used multivariate methods are not equipped to properly account for the non‐random sampling scheme. Current ad hoc practices include analyses without any adjustment, and analyses with D adjusted as a covariate. Our theoretical and empirical studies suggest that the type I error for testing genetic association of secondary traits can be substantial when X as well as Y are associated with D, even when there is no association between X and Y in the underlying (target) population. Whether using D as a covariate helps maintain type I error depends heavily on the disease mechanism and the underlying causal structure (which is often unknown). To avoid grossly incorrect inference, we have proposed proportional odds model adjusted for propensity score (POM‐PS). It uses a proportional odds logistic regression of X on Y and adjusts estimated conditional probability of being diseased as a covariate. We demonstrate the validity and advantage of POM‐PS, and compare to some existing methods in extensive simulation experiments mimicking plausible scenarios of dependency among Y , X, and D. Finally, we use POM‐PS to jointly analyze four adiposity traits using a type 2 diabetes (T2D) case‐control sample from the population‐based Metabolic Syndrome in Men (METSIM) study. Only POM‐PS analysis of the T2D case‐control sample seems to provide valid association signals.  相似文献   

13.
A surrogate endpoint in a randomized clinical trial is an endpoint that occurs after randomization and before the true, clinically meaningful endpoint, and that yields conclusions about the effect of treatment on the true endpoint. A surrogate endpoint can accelerate the evaluation of new treatments but at the risk of misleading conclusions. Therefore, criteria are needed for deciding whether to use a surrogate endpoint in a new trial. For the meta-analytic setting of multiple previous trials, each with the same pair of surrogate and true endpoints, this article formulates 5 criteria for using a surrogate endpoint in a new trial to predict the effect of treatment on the true endpoint in the new trial. The first 2 criteria, which are easily computed from a zero-intercept linear random effects model, involve statistical considerations: an acceptable sample size multiplier and an acceptable prediction separation score. The remaining 3 criteria involve clinical and biological considerations: similarity of biological mechanisms of treatments between the new trial and previous trials, similarity of secondary treatments following the surrogate endpoint between the new trial and previous trials, and a negligible risk of harmful side effects arising after the observation of the surrogate endpoint in the new trial. These 5 criteria constitute an appropriately high bar for using a surrogate endpoint to make a definitive treatment recommendation.
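In the meta-analytic setting sketched above, the basic computation behind the statistical criteria is a regression through the origin of trial-level treatment effects on the true endpoint against the corresponding effects on the surrogate, followed by prediction for the new trial. The snippet below is a simplified, fixed-slope illustration with made-up trial-level estimates; the paper's random-effects model, sample size multiplier, and prediction separation score are not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

# Made-up trial-level estimates from previous trials: treatment effect on the
# surrogate, on the true endpoint, and the variance of the true-endpoint estimate.
surr_eff = np.array([0.21, 0.35, 0.10, 0.44, 0.28, 0.15])
true_eff = np.array([0.30, 0.52, 0.11, 0.60, 0.41, 0.25])
true_var = np.array([0.020, 0.015, 0.030, 0.010, 0.025, 0.030])

# Zero-intercept (through-the-origin) weighted fit; the paper's model also
# includes trial-level random effects, omitted here for brevity.
fit = sm.WLS(true_eff, surr_eff, weights=1.0 / true_var).fit()
slope = fit.params[0]

# Predict the true-endpoint effect in a new trial from its surrogate effect.
new_surr_eff = 0.25
print("predicted true-endpoint effect:", slope * new_surr_eff)
```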

14.
Effect sizes (ES) tell the magnitude of the difference between treatments and, ideally, should tell clinicians how likely their patients will benefit from the treatment. Currently used ES are expressed in statistical rather than in clinically useful terms and may not give clinicians the appropriate information. We restrict our discussion to studies with two groups: one with n patients receiving a new treatment and the other with m patients receiving the usual or no treatment. The standardized mean difference (e.g. Cohen's d) is a well‐known index for continuous outcomes. There is some intuitive value to d, but measuring improvement in standard deviations (SD) is a statistical concept that may not help a clinician. How much improvement is a half SD? A more intuitive and simple‐to‐calculate ES is the probability that the response of a patient given the new treatment (X) is better than the one for a randomly chosen patient given the old or no treatment (Y) (i.e. P(X > Y), larger values meaning better outcomes). This probability has an immediate identity with the area under the curve (AUC) measure in procedures for receiver operator characteristic (ROC) curve comparing responses to two treatments. It also can be easily calculated from the Mann–Whitney U, Wilcoxon, or Kendall τ statistics. We describe the characteristics of an ideal ES. We propose P(X > Y) as an alternative index, summarize its correspondence with well‐known non‐parametric statistics, compare it to the standardized mean difference index, and illustrate with clinical data. Copyright © 2005 John Wiley & Sons, Ltd.  相似文献   
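As noted above, P(X > Y) can be read directly off the Mann-Whitney U statistic (the estimate is U/(nm)) and, under normality, is related to Cohen's d through P(X > Y) = Φ(d/√2). A small sketch with simulated data follows; the effect size d = 0.5 and sample sizes are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, m, d = 60, 60, 0.5                        # two arms, Cohen's d = 0.5
new = rng.normal(d, 1.0, n)                  # responses under the new treatment (X)
old = rng.normal(0.0, 1.0, m)                # responses under the old/no treatment (Y)

u, _ = stats.mannwhitneyu(new, old, alternative="two-sided")
p_hat = u / (n * m)                          # empirical estimate of P(X > Y)
p_normal = stats.norm.cdf(d / np.sqrt(2))    # value implied by d under normality

print(f"P(X > Y) from U: {p_hat:.3f}; implied by d = {d}: {p_normal:.3f}")
```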

15.
A statistical definition of surrogate endpoints as well as validation criteria was first presented by Prentice. Freedman et al. supplemented these criteria with the so-called proportion explained. Buyse and Molenberghs pointed to inadequacies of these criteria and suggested a new definition of surrogacy based on (i) the relative effect linking the overall effect of treatment on both endpoints and (ii) an individual-level measure of agreement between both endpoints. Using data from a randomized trial, they showed how a potential surrogate endpoint can be studied using a joint model for the surrogate and the true endpoint. Whereas Buyse and Molenberghs restricted themselves to the fairly simple cases of jointly normal and jointly binary outcomes, we treat the situation where the surrogate is binary and the true endpoint is continuous, or vice versa. In addition, we consider the case of ordinal endpoints. Further, Buyse et al. extended the approach of Buyse and Molenberghs to a meta-analytic context. We will adopt a similar approach for responses of a mixed data type.  相似文献   

16.
The cornstarch/sorghum flour ratio (X1), the amount of water added (X2) and the amount of hydroxypropyl methylcellulose (HPMC) used (X3) were varied in making gluten-free bread so as to optimize batter softness (Y1), specific volume (Y2) and crumb grain (Y3). A second-order model was employed to generate a response surface. Batter softness was found to depend significantly and linearly on all three factors. The specific volume (Y2), in particular, increased significantly with increasing X1 and X3. The crumb grain (Y3) also depended significantly on all three factors; its scores increased with X1 and decreased with the amount of water added (X2). Finally, a cornstarch/sorghum flour ratio of 0.55, 90% added water and 3% HPMC were chosen as the best conditions, considering acceptable levels of specific volume and crumb grain, and also taking into account the possibility of using the highest proportion of sorghum flour.
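The second-order model mentioned above is the standard quadratic response surface in the three factors, fitted separately to each response; the generic form (coefficient symbols are illustrative) is:

```latex
\[
  Y_k = \beta_0
      + \sum_{i=1}^{3} \beta_i X_i
      + \sum_{i=1}^{3} \beta_{ii} X_i^{2}
      + \sum_{i<j} \beta_{ij} X_i X_j
      + \varepsilon,
  \qquad k = 1, 2, 3.
\]
```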

17.
Authors have proposed new methodology in recent years for evaluating the improvement in prediction performance gained by adding a new predictor, Y, to a risk model containing a set of baseline predictors, X, for a binary outcome D. We prove theoretically that null hypotheses concerning no improvement in performance are equivalent to the simple null hypothesis that Y is not a risk factor when controlling for X, H0 : P(D = 1 | X,Y ) = P(D = 1 | X). Therefore, testing for improvement in prediction performance is redundant if Y has already been shown to be a risk factor. We also investigate properties of tests through simulation studies, focusing on the change in the area under the ROC curve (AUC). An unexpected finding is that standard testing procedures that do not adjust for variability in estimated regression coefficients are extremely conservative. This may explain why the AUC is widely considered insensitive to improvements in prediction performance and suggests that the problem of insensitivity has to do with use of invalid procedures for inference rather than with the measure itself. To avoid redundant testing and use of potentially problematic methods for inference, we recommend that hypothesis testing for no improvement be limited to evaluation of Y as a risk factor, for which methods are well developed and widely available. Analyses of measures of prediction performance should focus on estimation rather than on testing for no improvement in performance. Copyright © 2013 John Wiley & Sons, Ltd.  相似文献   
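The practical upshot above is that testing for improvement in prediction performance reduces to the usual risk-factor test of H0: P(D = 1 | X, Y) = P(D = 1 | X), for example a likelihood-ratio test between nested logistic regressions. A minimal sketch with simulated data (variable names and coefficients are assumptions):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(3)
n = 1000
x = rng.normal(size=n)                         # baseline predictor X
y_new = rng.normal(size=n)                     # candidate new predictor Y
d = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * x + 0.3 * y_new))))

base = sm.Logit(d, sm.add_constant(x)).fit(disp=False)
full = sm.Logit(d, sm.add_constant(np.column_stack([x, y_new]))).fit(disp=False)

# Likelihood-ratio test of H0: P(D=1|X,Y) = P(D=1|X), i.e. Y is not a risk factor
lr = 2 * (full.llf - base.llf)
p_value = stats.chi2.sf(lr, df=1)
print(f"LR statistic = {lr:.2f}, p = {p_value:.4f}")
```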

18.
Common sensitivity analysis methods for unmeasured confounders provide a corrected point estimate of causal effect for each specified set of unknown parameter values. This article reviews alternative methods for generating deterministic nonparametric bounds on the magnitude of the causal effect using linear programming methods and potential outcomes models. The bounds are generated using only the observed table. We then demonstrate how these bound widths may be reduced through assumptions regarding the potential outcomes under various exposure regimens. We illustrate this linear programming approach using data from the Cooperative Cardiovascular Project. These bounds on causal effect under uncontrolled confounding complement standard sensitivity analyses by providing a range within which the causal effect must lie given the validity of the assumptions.  相似文献   
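The linear-programming construction can be made concrete for a binary exposure X and binary outcome Y: parameterize the joint distribution of the potential outcomes and exposure, constrain it to reproduce the observed 2x2 table, and optimize the causal risk difference in both directions. The sketch below uses a made-up table, and the monotonicity constraint is just one example of an assumption that narrows the bounds; it is not the Cooperative Cardiovascular Project analysis.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Hypothetical observed 2x2 table: P(X=x, Y=y) for exposure X and outcome Y
p_obs = {(1, 1): 0.30, (1, 0): 0.20, (0, 1): 0.15, (0, 0): 0.35}

# Decision variables: q[(y0, y1, x)] = P(Y(0)=y0, Y(1)=y1, X=x), 8 cells
cells = list(itertools.product([0, 1], [0, 1], [0, 1]))   # (y0, y1, x)

def bounds_on_risk_difference(monotone=False):
    # Objective: causal risk difference E[Y(1)] - E[Y(0)] = sum q * (y1 - y0)
    c = np.array([y1 - y0 for (y0, y1, x) in cells], dtype=float)
    # Consistency: the observed outcome equals the potential outcome under X
    A_eq, b_eq = [], []
    for (x, y), p in p_obs.items():
        row = [1.0 if (cx == x and (y1 if x == 1 else y0) == y) else 0.0
               for (y0, y1, cx) in cells]
        A_eq.append(row)
        b_eq.append(p)
    # Optional monotonicity assumption: no stratum with Y(0)=1 and Y(1)=0
    ub = [0.0 if (monotone and y0 == 1 and y1 == 0) else 1.0
          for (y0, y1, x) in cells]
    var_bounds = list(zip([0.0] * len(cells), ub))
    lo = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=var_bounds).fun
    hi = -linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=var_bounds).fun
    return lo, hi

print("no assumptions:", bounds_on_risk_difference(False))
print("with monotonicity:", bounds_on_risk_difference(True))
```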

19.
For a continuous outcome in a two-arm trial that satisfies normal distribution assumptions, we can transform the standardized mean difference with the use of the cumulative distribution function to be the effect size measure P(X < Y). This measure is already established within engineering as the reliability parameter in stress–strength models, where Y represents the strength of a component and X represents the stress the component undergoes. If X is greater than Y, then the component will fail. In this paper, we consider the closely related effect size measure λ. This measure is also known as Somers' d, which was introduced by Somers in 1962 as an ordinal measure of association. We explore this measure as a treatment effect size for a continuous outcome. Although the point estimates for λ are easily calculated, the interval is not so readily obtained. We compare kernel density estimation and use of bootstrap and jackknife methods to estimate confidence intervals against two further methods for estimating P(X < Y) and their respective intervals, one of which makes no assumption about the underlying distribution and the other assumes a normal distribution. Simulations show that the choice of the best estimator depends on the value of λ, the variability within the data, and the underlying distribution of the data. Copyright © 2012 John Wiley & Sons, Ltd.
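Assuming λ denotes the usual two-sample Somers' d, λ = P(X < Y) − P(X > Y), the point estimate and a percentile bootstrap interval are straightforward to compute; the sketch below is illustrative only and does not implement the kernel-density or normal-theory estimators compared in the paper.

```python
import numpy as np

def somers_d(x, y):
    """Two-sample Somers' d: P(X < Y) - P(X > Y) over all (x, y) pairs."""
    diff = np.subtract.outer(np.asarray(y, float), np.asarray(x, float))
    return (diff > 0).mean() - (diff < 0).mean()

def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for Somers' d."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    boot = [somers_d(rng.choice(x, x.size, replace=True),
                     rng.choice(y, y.size, replace=True))
            for _ in range(n_boot)]
    return np.quantile(boot, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, 80)    # "stress" / control arm
y = rng.normal(0.6, 1.0, 80)    # "strength" / treatment arm
print(somers_d(x, y), bootstrap_ci(x, y))
```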

20.
We examine the properties of principal scores methods to estimate the causal marginal odds ratio of an intervention for compliers in the context of a randomized controlled trial with non‐compliers. The two‐stage estimation approach has been proposed for a linear model by Jo and Stuart (Statistics in Medicine 2009; 28 :2857–2875) under a principal ignorability (PI) assumption. Using a Monte Carlo simulation study, we compared the performance of several strategies to build and use principal score models and the robustness of the method to violations of underlying assumptions, in particular PI. Results showed that the principal score approach yielded unbiased estimates of the causal marginal log odds ratio under PI but that the method was sensitive to violations of PI, which occurs in particular when confounders are omitted from the analysis. For principal score analysis, probability weighting performed slightly better than full matching or 1:1 matching. Concerning the variables to be included in principal score models, the lowest mean squared error was generally obtained when using the true confounders. Using variables associated with the outcome only but not compliance however yielded very similar performance. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   
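A minimal sketch of principal-score weighting under one-sided noncompliance and principal ignorability: fit a compliance model in the intervention arm, use the fitted principal scores to weight control-arm subjects, and estimate the complier (log) odds ratio by weighted logistic regression. The simulated data, covariates, and weighting details are assumptions and do not reproduce the authors' simulation design.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 4000
x1, x2 = rng.normal(size=n), rng.normal(size=n)            # baseline covariates
z = rng.integers(0, 2, n)                                   # randomized arm
complier = rng.binomial(1, 1 / (1 + np.exp(-(0.2 + 0.9 * x1 - 0.7 * x2))))
took = z * complier                                          # one-sided noncompliance
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.8 * took + 0.4 * x1))))

covs = sm.add_constant(np.column_stack([x1, x2]))

# Step 1: principal-score model, fitted where compliance behaviour is observed
# (the intervention arm), then used to predict P(complier | covariates) for all.
score = sm.Logit(complier[z == 1], covs[z == 1]).fit(disp=False).predict(covs)

# Step 2: compare treated compliers with control subjects weighted by the score.
use = ((z == 1) & (complier == 1)) | (z == 0)
w = np.where(z == 1, 1.0, score)
fit = sm.GLM(y[use], sm.add_constant(z[use]),
             family=sm.families.Binomial(), freq_weights=w[use]).fit()
print("complier log odds ratio:", fit.params[1])
```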
