Similar Articles
20 similar articles found, search time 234 ms
1.
In randomized clinical trials where time-to-event is the primary outcome, the logrank test is almost routinely prespecified as the primary test and the hazard ratio is used to quantify the treatment effect. If the ratio of the 2 hazard functions is not constant, the logrank test is not optimal and the interpretation of the hazard ratio is not obvious. When such a nonproportional hazards case is expected at the design stage, the conventional practice is to prespecify another member of the weighted logrank tests, e.g., the Peto-Prentice-Wilcoxon test. Alternatively, one may specify a robust test as the primary test, which can capture various patterns of difference between 2 event time distributions. However, most of those tests do not have companion procedures to quantify the treatment difference, and investigators have fallen back on reporting treatment effect estimates not associated with the primary test. Such incoherence in the "test/estimation" procedure may potentially mislead clinicians/patients who have to balance risk and benefit for treatment decisions. To address this, we propose a flexible and coherent test/estimation procedure based on the restricted mean survival time, where the truncation time τ is selected in a data-dependent manner. The proposed procedure is composed of a prespecified test and an estimation of the corresponding robust and interpretable quantitative treatment effect. The utility of the new procedure is demonstrated by numerical studies based on 2 randomized cancer clinical trials; the test is dramatically more powerful than the logrank test, the Wilcoxon test, and the restricted mean survival time-based test with a fixed τ for the patterns of difference seen in these cancer clinical trials.
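As a concrete illustration of the restricted mean survival time (RMST) contrast underlying such a procedure, the sketch below estimates the area under each arm's Kaplan-Meier curve up to a truncation time τ chosen from the data as the smaller of the two arms' largest follow-up times. This is a minimal numpy-only sketch on simulated data; the paper's actual τ-selection rule, test statistic, and inference procedure are not reproduced, and the simulation setup and variable names are illustrative assumptions.

```python
import numpy as np

def km_curve(time, event):
    """Kaplan-Meier estimate: distinct event times and the survival probability just after each."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    uniq = np.unique(time[event == 1])
    n_at_risk = np.array([(time >= t).sum() for t in uniq])
    n_events = np.array([((time == t) & (event == 1)).sum() for t in uniq])
    surv = np.cumprod(1.0 - n_events / n_at_risk)
    return uniq, surv

def rmst(time, event, tau):
    """Restricted mean survival time: area under the Kaplan-Meier step function on [0, tau]."""
    t, s = km_curve(time, event)
    grid = np.concatenate(([0.0], t[t < tau], [tau]))   # interval endpoints
    vals = np.concatenate(([1.0], s[t < tau]))          # survival level on each interval
    return float(np.sum(vals * np.diff(grid)))

# Illustrative two-arm data: exponential event times with administrative censoring at 24 months
rng = np.random.default_rng(1)
def simulate(n, scale, cens=24.0):
    latent = rng.exponential(scale, n)
    return np.minimum(latent, cens), (latent <= cens).astype(int)

t1, e1 = simulate(150, 14.0)   # hypothetical treatment arm
t0, e0 = simulate(150, 10.0)   # hypothetical control arm

# One simple data-dependent truncation point (an assumption, not the paper's rule)
tau = min(t1.max(), t0.max())
print(f"tau = {tau:.1f}, RMST difference = {rmst(t1, e1, tau) - rmst(t0, e0, tau):.2f} months")
```

The RMST difference reads directly as the gain in mean event-free time over the first τ time units, which is the kind of interpretable quantitative treatment effect the abstract argues should accompany the prespecified test.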

2.
The logrank test is optimal for testing the equality of survival distributions against a proportional hazards alternative. Under a late-effects alternative, it is no longer appropriate, and one may turn to Fleming–Harrington's class of weighted logrank tests instead. In some settings, such as in preventive clinical trials where the statistical analysis has to be designed before the trial begins, it can be difficult to choose a priori between the logrank and Fleming–Harrington tests. A solution to this issue is provided. A decision rule is constructed for the problem of testing the equality of two survival distributions when the expected alternative may be either proportional hazards or late effects. A formula for computing the necessary sample size is obtained for this decision rule. A comprehensive simulation study is conducted to assess the finite sample properties of the proposed test statistic. The proposed test improves on both the logrank test and Fleming–Harrington's test for late effects. Finally, the methodology is illustrated on a data set in the field of prevention of Alzheimer's disease. Copyright © 2014 John Wiley & Sons, Ltd.
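For reference, the Fleming–Harrington class referred to here weights the logrank increments by powers of the pooled Kaplan-Meier estimate. The display below gives only the standard G(ρ, γ) form; the paper's decision rule between the logrank and Fleming–Harrington tests and its sample size formula are not reproduced.

```latex
U_{\rho,\gamma} \;=\; \sum_{i} \hat S(t_i^-)^{\rho}\,\bigl\{1-\hat S(t_i^-)\bigr\}^{\gamma}
\left( d_{1i} - \frac{n_{1i}\, d_i}{n_i} \right)
```

Here Ŝ is the pooled Kaplan-Meier estimator, d_{1i} and d_i are the group-1 and total events at the ordered event time t_i, and n_{1i}, n_i are the corresponding numbers at risk. Setting ρ = γ = 0 recovers the logrank test, while ρ = 0 with γ > 0 down-weights early differences and is the usual choice for late-effects alternatives.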

3.
In studying adaptive treatment strategies, a natural question of paramount interest is whether there is any significant difference among all possible treatment strategies. When the outcome variable of interest is time-to-event, we propose an inverse probability weighted logrank test for testing the equivalence of a fixed set of pre-specified adaptive treatment strategies based on data from an observational study. The weights take into account both the possible selection bias in an observational study and the fact that the same subject may be consistent with more than one treatment strategy. The asymptotic distribution of the weighted logrank statistic under the null hypothesis is obtained. We show that, in an observational study where the treatment selection probabilities need to be estimated, the estimation of these probabilities does not affect the asymptotic distribution of the weighted logrank statistic, as long as the estimation of the parameters in the models for these probabilities is √n-consistent. Finite sample performance of the test is assessed via a simulation study. We also show in the simulation that the test can be quite robust to misspecification of the models for the probabilities of treatment selection. The method is applied to analyze data on antidepressant adherence time from an observational database maintained at the Department of Veterans Affairs' Serious Mental Illness Treatment Research and Evaluation Center. Copyright © 2013 John Wiley & Sons, Ltd.
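As a hedged sketch of the generic inverse-probability-weighting construction behind such a test (the display is illustrative only; the paper's weights additionally account for subjects consistent with several strategies at once, and its variance estimation is not shown):

```latex
\hat w_i(s) \;=\; \frac{C_i(s)}{\hat\pi_i(s)}, \qquad
\bar N_s(t) \;=\; \sum_i \hat w_i(s)\, N_i(t), \qquad
\bar Y_s(t) \;=\; \sum_i \hat w_i(s)\, Y_i(t)
```

Here C_i(s) indicates that subject i's observed treatment history is consistent with strategy s, π̂_i(s) is the estimated probability of that history given covariates (e.g., from fitted logistic models), N_i(t) counts subject i's observed events, and Y_i(t) is the at-risk indicator; a logrank-type statistic then compares the weighted event and at-risk totals across strategies.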

4.
Time-to-event analysis is frequently used in medical research to investigate potential disease-modifying treatments in neurodegenerative diseases. Potential treatment effects are generally evaluated using the logrank test, which has optimal power and sensitivity when the treatment effect (hazard ratio) is constant over time. However, there is generally no prior information as to how the hazard ratio for the event of interest actually evolves. In these cases, the logrank test is not necessarily the most appropriate to use. When the hazard ratio is expected to decrease or increase over time, alternative statistical tests, such as the Fleming-Harrington test, provide better sensitivity. An example of this comes from a large, five-year randomised, placebo-controlled prevention trial (GuidAge) in 2854 community-based subjects making spontaneous memory complaints to their family physicians, which evaluated whether treatment with EGb761® can modify the risk of developing Alzheimer's disease (AD). The primary outcome measure was the time to conversion from memory complaint to Alzheimer's type dementia. Although there was no significant difference in the hazard function of conversion between the two treatment groups according to the preplanned logrank test, a significant treatment-by-time interaction for the incidence of AD was observed in a protocol-specified subgroup analysis, suggesting that the hazard ratio is not constant over time. For this reason, additional post hoc analyses were performed using the Fleming-Harrington test to evaluate whether there was a signal of a late effect of EGb761®. Applying the Fleming-Harrington test, the hazard function for conversion to dementia in the placebo group was significantly different from that in the EGb761® treatment group (p = 0.0054), suggesting a late effect of EGb761®. Since this was a post hoc analysis, no definitive conclusions can be drawn as to the effectiveness of the treatment. This post hoc analysis illustrates the value of performing another randomised clinical trial of EGb761® explicitly testing the hypothesis of a late treatment effect, as well as of using better-adapted statistical approaches for long-term preventive trials when it is expected that prevention cannot have an immediate effect but rather a delayed effect that increases over time.

5.
Often the effect of at least one of the prognostic factors in a Cox regression model changes over time, which violates the proportional hazards assumption of this model. As a consequence, the average hazard ratio for such a prognostic factor is under- or overestimated. While there are several methods to cope appropriately with non-proportional hazards, in particular by including parameters for time-dependent effects, weighted estimation in Cox regression is a parsimonious alternative without additional parameters. The methodology, which extends the weighted k-sample logrank tests of the Tarone-Ware scheme to models with multiple, binary and continuous covariates, was introduced in the 1990s and is further developed and re-evaluated in this contribution. The notion of an average hazard ratio is defined and its connection to the effect size measure P(X<Y) is emphasized. The suggested approach accomplishes estimation of intuitively interpretable average hazard ratios and provides tools for inference. A Monte Carlo study confirms its satisfactory performance. Advantages of the approach are exemplified by comparing standard and weighted analyses of an international lung cancer study. SAS and R programs facilitate application. Copyright © 2009 John Wiley & Sons, Ltd.
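To make the effect-size connection mentioned here concrete, the following is the textbook relation for independent continuous event times X and Y under proportional hazards; it motivates reading a hazard ratio as an odds, but it is not the paper's weighted estimator itself.

```latex
\lambda_X(t) = \theta\,\lambda_Y(t) \ \text{for all } t
\quad\Longrightarrow\quad
P(X < Y) = \frac{\theta}{1+\theta},
\qquad\text{equivalently}\qquad
\theta = \frac{P(X < Y)}{P(Y < X)}
```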

6.
Power for time-to-event analyses is usually assessed under continuous-time models. Often, however, times are discrete or grouped, as when the event is only observed when a procedure is performed. Wallenstein and Wittes (Biometrics, 1993) describe the power of the Mantel–Haenszel test for discrete lifetables under their chained binomial model for specified vectors of event probabilities over intervals of time. Herein, the expressions for these probabilities are derived under a piecewise exponential model allowing for staggered entry and losses to follow-up. Radhakrishna (Biometrics, 1965) showed that the Mantel–Haenszel test is maximally efficient under the alternative of a constant odds ratio and derived the optimal weighted test under other alternatives. Lachin (Biostatistical Methods: The Assessment of Relative Risks, 2011) described the power function of this family of weighted Mantel–Haenszel tests. Prentice and Gloeckler (Biometrics, 1978) described a generalization of the proportional hazards model for grouped time data and the corresponding maximally efficient score test. Their test is also shown to be a weighted Mantel–Haenszel test, and its power function is likewise obtained. There is only trivial loss in power under the discrete chained binomial model relative to the continuous-time case, provided that there is a modest number of periodic evaluations. Relative to the case of homogeneous odds ratios, there can be substantial loss in power when there is substantial heterogeneity of odds ratios, especially when the heterogeneity occurs early in a study when most subjects are at risk, but little loss in power when the heterogeneity occurs late in a study. Copyright © 2012 John Wiley & Sons, Ltd.
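As a minimal sketch of the grouped-time event probabilities under a piecewise exponential model (the generic form only; the paper's expressions additionally accommodate staggered entry and losses to follow-up):

```latex
p_j = 1 - \exp\bigl\{-\lambda_j (t_j - t_{j-1})\bigr\},
\qquad
\pi_j = p_j \prod_{k=1}^{j-1} (1 - p_k)
```

Here λ_j is the constant hazard on the interval (t_{j-1}, t_j], p_j is the conditional probability of an event in interval j for a subject still at risk at t_{j-1}, and π_j is the corresponding unconditional probability.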

7.
Clustered right-censored data often arise from tumorigenicity experiments and clinical trials. For testing the equality of two survival functions, Jung and Jeong extended weighted logrank (WLR) tests to two independent samples of clustered right-censored data, while the weighted Kaplan–Meier (WKM) test can be derived from the work of O'Gorman and Akritas. The weight functions in both classes of tests (WLR and WKM) can be selected to be more sensitive to detect a certain alternative; however, since the exact alternative is unknown, it is difficult to specify the selected weights in advance. Since WLR is rank-based, it is not sensitive to the magnitude of the difference in survival times. Although WKM is constructed to be more sensitive to the magnitude of the difference in survival times, it is not sensitive to late hazard differences. Therefore, in order to combine the advantages of these two classes of tests, this paper develops a class of versatile tests based on simultaneously using WLR and WKM for two independent samples of clustered right-censored data. The comparative results from a simulation study are presented and the implementation of the versatile tests on two real data sets is illustrated. Copyright © 2009 John Wiley & Sons, Ltd.

8.
We propose a modified version of the standard logrank test for survival data in which the first contribution to the statistic is based on a grouping of the data before a pre-selected grouping time. The grouping is accomplished by artificially constructing the first table from the product-limit estimates of the proportions surviving at the grouping time. The remaining contributions to the statistic are identical to those of the standard logrank statistic. The approach has the advantage of being uninfluenced by non-proportional hazards differences prior to the grouping time while being almost as efficient as the usual logrank test for proportional-hazards alternatives. The statistic is particularly useful for interim monitoring in situations where early differences between treatments are unimportant and/or lagged treatment effects are anticipated. We report simulation studies to verify and investigate the test's properties. © 1997 by John Wiley & Sons, Ltd.

9.
In many clinical settings, improving patient survival is of interest, but a practical surrogate, such as time to disease progression, is instead used as a clinical trial's primary endpoint. A time-to-first-event endpoint (e.g., death or disease progression) is commonly analyzed but may not be adequate to summarize patient outcomes if a subsequent event contains important additional information. We consider a surrogate outcome very generally as one correlated with the true endpoint of interest. Settings of interest include those where the surrogate indicates a beneficial outcome, so that the usual time-to-first endpoint of death or surrogate event is nonsensical. We present a new two-sample test for bivariate, interval-censored time-to-event data, where one endpoint is a surrogate for the second, less frequently observed endpoint of true interest. This test examines whether patient groups have equal clinical severity. If the true endpoint rarely occurs, the proposed test acts like a weighted logrank test on the surrogate; if it occurs for most individuals, then our test acts like a weighted logrank test on the true endpoint. If the surrogate is a useful statistical surrogate, our test can have better power than tests based on the surrogate that naively handle the true endpoint. In settings where the surrogate is not valid (treatment affects the surrogate but not the true endpoint), our test incorporates the information regarding the lack of treatment effect from the observed true endpoints and hence is expected to show a dampened treatment effect compared with tests based on the surrogate alone. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.

10.
Time-to-event data are very common in observational studies. Unlike randomized experiments, observational studies suffer from both observed and unobserved confounding biases. To adjust for observed confounding in survival analysis, the commonly used methods are the Cox proportional hazards (PH) model, the weighted logrank test, and the inverse probability of treatment weighted Cox PH model. These methods do not rely on fully parametric models, but their practical performance is highly influenced by the validity of the PH assumption. Also, there are few methods addressing hidden bias in causal survival analysis. We propose a strategy to test for survival function differences based on the matching design and explore the sensitivity of the P-values to assumptions about unmeasured confounding. Specifically, we apply the paired Prentice-Wilcoxon (PPW) test or the modified PPW test to the propensity score matched data. Simulation studies show that the PPW-type test has higher power in situations where the PH assumption fails. For potential hidden bias, we develop a sensitivity analysis based on the matched pairs to assess the robustness of our finding, following Rosenbaum's idea for nonsurvival data. For a real data illustration, we apply our method to an observational cohort of chronic liver disease patients from a Mayo Clinic study. The PPW test based on the observed data initially shows evidence of a significant treatment effect, but this finding is not robust, as the sensitivity analysis reveals that the P-value becomes nonsignificant if there exists an unmeasured confounder with even a small impact.

11.
This paper evaluates the loss of power of the simple and stratified logrank tests due to heterogeneity of patients in clinical trials and proposes a flexible and efficient method of estimating treatment effects adjusting for prognostic factors. The results of the paper are based on analyses of survival data from a large clinical trial which includes more than 6000 cancer patients. The major findings from the simulation study on power are: (i) for a heterogeneous sample, such as advanced cancer patients, the simple logrank test can yield misleading results and should not be used; (ii) the stratified logrank test may suffer some power loss when many prognostic factors need to be considered and the number of patients within each stratum is small. To address the problems due to heterogeneity, the Cox regression method with a special hazard model is recommended. We illustrate the method using data from a gastric cancer clinical trial. © 1997 by John Wiley & Sons, Ltd.
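For reference, the stratified logrank statistic discussed here pools within-stratum observed-minus-expected event counts in the standard way (the strata being defined by the prognostic factors); the Cox model with the special hazard structure recommended in the paper is not reproduced.

```latex
Z \;=\; \frac{\sum_{s} (O_s - E_s)}{\sqrt{\sum_{s} V_s}},
\qquad
E_s = \sum_{i \in s} \frac{n_{1i}\, d_i}{n_i},
\qquad
V_s = \sum_{i \in s} \frac{n_{1i}\, n_{0i}\, d_i\,(n_i - d_i)}{n_i^{2}\,(n_i - 1)}
```

Here O_s is the number of observed treated-arm events in stratum s and the inner sums run over the event times within that stratum; strata containing very few patients contribute little to these sums, which is consistent with the power loss reported when many prognostic factors define small strata.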

12.
In the course of designing a clinical trial, investigators are often faced with the possibility that only a fraction of the patients will benefit from the experimental treatment. A proper clinical trial design requires prospective specification of the testing procedures to be used in the analysis. In the absence of reliable prognostic factors capable of identifying the appropriate subset of patients, there is a need for a test procedure that is sensitive to a range of possible fractions of responders. Focusing on survival data, we propose guidelines for selecting a proper test procedure based on the anticipated proportion of responding patients. These guidelines suggest that the logrank test should be used when the fraction of responders is expected to be greater than 0.5; otherwise, procedures based on weighted linear rank tests are preferable. Overall, this approach provides good power when the treatment affects only a small proportion of patients while protecting against a substantial loss of power when all patients are affected. Use of the procedure is illustrated with data from two published randomized studies.

13.
Current approaches for the analysis of longitudinal genetic epidemiological data on quantitative traits are typically restricted to normality assumptions for the trait. We introduce the longitudinal nonparametric test (LNPT) for cohorts with quantitative follow-up data to test for overall main effects of genes and for gene-gene and gene-time interactions. The LNPT is a rank procedure and does not depend on normality assumptions for the trait. We demonstrate by simulations that the LNPT is powerful, keeps the type-1 error level, and has very good small-sample behavior. For phenotypes with normal residuals, the loss of power compared to parametric approaches (linear mixed models) was small for the quite general scenarios that we simulated. For phenotypes with non-normal residuals, the gain in power from the LNPT can be substantial. In contrast to parametric approaches, the LNPT is invariant with respect to monotone transformations of the trait. It is mathematically valid for arbitrary trait distributions. Genet. Epidemiol. 34: 469–478, 2010. © 2010 Wiley-Liss, Inc.

14.
In the analysis of composite endpoints in a clinical trial, time-to-first-event analysis techniques such as the logrank test and the Cox proportional hazards test do not take into account the multiplicity, importance, and severity of the events in the composite endpoint. Several generalized pairwise comparison analysis methods have been described recently that do allow these aspects to be taken into account. These methods have the additional benefit that all types of outcomes can be included, such as longitudinal quantitative outcomes, to evaluate the full treatment effect. Four of the generalized pairwise comparison methods, i.e., the Finkelstein-Schoenfeld, Buyse, unmatched Pocock, and adapted O'Brien tests, are summarized. They are compared to each other and to the logrank test by means of simulations that specifically evaluate the effect of correlation between components of the composite endpoint on the power to detect a treatment difference. These simulations show that prioritized generalized pairwise comparison methods perform very similarly, are sensitive to the priority rank of the components in the composite endpoint, and do not measure the true treatment effect from the second priority-ranked component onward. The nonprioritized pairwise comparison test does not suffer from these limitations, and correlation affects only its variance.
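The prioritized pairwise idea can be sketched generically as follows: every treated patient is compared with every control patient, first on survival (decidable only when the shorter follow-up ends in an observed event) and then, for undecided pairs, on a quantitative outcome. This is a simplified, numpy-only illustration under assumed conventions (larger quantitative values taken as better, an optional margin, no variance or test computation); it is not the exact Finkelstein-Schoenfeld, Buyse, Pocock, or O'Brien procedure.

```python
import numpy as np

def compare_pair(ta, ea, qa, tb, eb, qb, margin=0.0):
    """Prioritized comparison of one treated/control pair.
    Priority 1: survival time (ea, eb = 1 if the event was observed, 0 if censored).
    Priority 2: a quantitative outcome, larger assumed better (illustrative convention).
    Returns +1 (first patient wins), -1 (first patient loses), or 0 (tie / indeterminate)."""
    # Priority 1: decisive only if the earlier follow-up time ends in an observed event
    if eb and ta > tb:
        return 1          # control patient's event occurred while treated patient was still event-free
    if ea and tb > ta:
        return -1         # treated patient's event occurred while control patient was still event-free
    # Priority 2: quantitative outcome, with an optional clinically relevant margin
    if qa > qb + margin:
        return 1
    if qb > qa + margin:
        return -1
    return 0

def net_benefit(treated, control, margin=0.0):
    """Unmatched all-pairs comparison: (wins - losses) / number of pairs."""
    scores = [compare_pair(*a, *b, margin=margin) for a in treated for b in control]
    return float(np.mean(scores))

# Hypothetical data: each patient is (follow-up time, event indicator, quantitative outcome)
treated = [(24.0, 0, 5.1), (10.2, 1, 3.0), (18.5, 0, 4.4)]
control = [(12.0, 1, 4.0), (22.0, 0, 2.5), (8.0, 1, 1.9)]
print("Net treatment benefit:", net_benefit(treated, control))
```

The resulting net benefit (proportion of winning pairs minus proportion of losing pairs) is the kind of summary the prioritized methods test; as the abstract notes, its value depends on the priority ordering of the components.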

15.
To compare survival functions based on right-truncated data, Lagakos et al. proposed a weighted logrank test based on a reverse time scale. This is in contrast to Bilker and Wang, who suggested a semi-parametric version of the Mann-Whitney test by assuming that the distribution of truncation times is known or can be estimated parametrically. The approach of Lagakos et al. is simple and elegant, but the weight function in their method depends on the underlying cumulative hazard functions even under proportional hazards models. On the other hand, a semi-parametric test may have better efficiency, but it may be sensitive to misspecification of the distribution of truncation times. Therefore, this paper proposes a non-parametric test statistic based on the integrated weighted difference between two estimated survival functions in forward time. The comparative results from a simulation study are presented and the implementation of these methods on a real data set is demonstrated.

16.
When comparing two survival distributions with proportional hazard functions, the logrank test is optimal for testing the null hypothesis that the constant hazard ratio (relative risk) is one. In this paper, we focus on (i) testing for departures from a relative risk other than one, and (ii) estimation of the relative risk. The standard tool to address both (i) and (ii) is the Cox proportional hazards model. However, the performance of the Cox model can be less than optimal with small samples. We show why this is the case, and propose a simple alternative method of estimation and inference based on a generalized logrank (GLR) statistic. While the GLR and Cox model approaches are asymptotically similar, empirical results reveal that the GLR approach is notably more efficient than the Cox model when the number of subjects is small (<100 subjects per treatment group). An example based on survival times of cervical cancer patients is used to illustrate the proposed methodology.
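One generic way to test a null relative risk θ₀ other than one, shown here purely for orientation, is to evaluate the Cox partial-likelihood score for a binary treatment indicator at log θ₀; whether this coincides with the paper's GLR statistic or shares its small-sample behaviour is not asserted here.

```latex
U(\theta_0) \;=\; \sum_{i} \left( d_{1i} \;-\; d_i\,\frac{\theta_0\, n_{1i}}{\theta_0\, n_{1i} + n_{0i}} \right)
```

Here d_{1i} and d_i are the treated-arm and total events at event time t_i, and n_{1i}, n_{0i} are the numbers at risk in each arm; setting θ₀ = 1 recovers the ordinary logrank numerator.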

17.
Time-to-event outcomes are common in oncology clinical trials. Conventional methods of analysis for these endpoints include logrank or Wilcoxon tests for treatment group comparisons, Kaplan-Meier survival estimates, and Cox proportional hazards models to estimate the treatment group hazard ratio (both unadjusted and adjusted for relevant covariates). Adjusting for covariates reduces bias and may increase precision and power (Statist. Med. 2002; 21:2899-2908). However, the appropriateness of the Cox proportional hazards model depends on parametric assumptions. One way to address these issues is to use non-parametric analysis of covariance (J. Biopharm. Statist. 1999; 9:307-338). Here, we carry out simulations to investigate the type I error and power of the unadjusted and covariate-adjusted non-parametric logrank test and Wilcoxon test, and of the Cox proportional hazards model. A comparison between the covariate-adjusted and unadjusted methods is also illustrated with an oncology clinical trial example.

18.
We consider weighted logrank tests for interval-censored data when assessment times may depend on treatment, and for each individual, we use only the two assessment times that bracket the event of interest. It is known that treating finite right endpoints as observed events can substantially inflate the type I error rate under assessment-treatment dependence (ATD), but the validity of several other implementations of weighted logrank tests (score tests, permutation tests, multiple imputation tests) has not been studied in this situation. With a bounded number of unique assessment times, the score test under the grouped continuous model retains the type I error rate asymptotically under ATD; however, although the approximate permutation test based on the permutation central limit theorem is not asymptotically valid under every ATD scenario, we show through simulation that in many ATD scenarios it retains the type I error rate better than the score test. We show a case where the approximate permutation test retains the type I error rate when the exact permutation test does not. We study and modify the multiple imputation logrank tests of Huang, Lee, and Yu (2008, Statistics in Medicine, 27: 3217-3226), showing that the distribution of the rank-like scores asymptotically does not depend on the assessment times. We show through simulations that our modifications of the multiple imputation logrank tests retain the type I error rate in all cases studied, even with ATD and a small number of individuals in each treatment group. Simulations were performed using the interval R package. Published 2012. This article is a US Government work and is in the public domain in the USA.

19.
With censored event time observations, the logrank test is the most popular tool for testing the equality of two underlying survival distributions. Although this test is asymptotically distribution free, it may not be powerful when the proportional hazards assumption is violated. Various other novel testing procedures have been proposed, which generally are derived by assuming a class of specific alternative hypotheses with respect to the hazard functions. The test considered by Pepe and Fleming (1989) is based on a linear combination of weighted differences of the two Kaplan–Meier curves over time and is a natural tool for assessing the difference of two survival functions directly. In this article, we take a similar approach but choose weights that are proportional to the observed standardized difference of the estimated survival curves at each time point. The new proposal automatically makes the weighting adjustments empirically. The new test statistic is aimed at a one-sided general alternative hypothesis and is distributed with a short right tail under the null hypothesis but with a heavy tail under the alternative. The results from extensive numerical studies demonstrate that the new procedure performs well under various general alternatives, with the caveat of a minor inflation of the type I error rate when the sample size or the number of observed events is small. The survival data from a recent cancer comparative study are utilized to illustrate the implementation of the procedure. Copyright © 2015 John Wiley & Sons, Ltd.
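The construction described here can be written schematically as an integrated weighted difference of the two Kaplan–Meier curves. The display below is only the generic Pepe–Fleming-type form with the data-driven weight indicated informally; the paper's exact standardization, truncation, and inference procedure are not reproduced.

```latex
W \;=\; \int_{0}^{\tau} \hat w(t)\,\bigl\{\hat S_1(t) - \hat S_2(t)\bigr\}\,\mathrm{d}t,
\qquad
\hat w(t) \;\propto\; \frac{\hat S_1(t) - \hat S_2(t)}{\widehat{\mathrm{se}}\bigl\{\hat S_1(t) - \hat S_2(t)\bigr\}}
```

Time points at which the observed standardized separation of the curves is large thus receive proportionally more weight, which is consistent with the heavy right tail under the alternative described in the abstract.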

20.
Cluster randomized trials (CRTs) are increasingly used to evaluate the effectiveness of health-care interventions. A key feature of CRTs is that the observations on individuals within clusters are correlated as a result of between-cluster variability. Sample size formulae exist which account for such correlations, but they make different assumptions regarding the between-cluster variability in the intervention arm of a trial, resulting in different sample size estimates. We explore the relationship, for binary outcome data, between two common measures of between-cluster variability: k, the coefficient of variation, and ρ, the intracluster correlation coefficient. We then assess how the assumptions of constant k or ρ across treatment arms correspond to different assumptions about intervention effects. We assess the implications for sample size estimation and present a simple solution to the problems outlined. Copyright © 2009 John Wiley & Sons, Ltd.
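For a binary outcome with overall proportion π and between-cluster variance σ_b² of the true cluster-level proportions, the two variability measures compared here are linked by a commonly quoted identity (stated for a single arm; how these quantities behave across treatment arms is the subject of the paper):

```latex
k = \frac{\sigma_b}{\pi},
\qquad
\rho = \frac{\sigma_b^{2}}{\pi(1-\pi)},
\qquad\Longrightarrow\qquad
\rho = \frac{k^{2}\,\pi}{1-\pi}
```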
