Similar Articles
20 similar articles found.
1.
This tutorial discusses important statistical problems arising in clinical trials with multiple clinical objectives based on different clinical variables, evaluation of several doses or regimens of a new treatment, analysis of multiple patient subgroups, etc. Simultaneous assessment of several objectives in a single trial gives rise to multiplicity. If unaddressed, problems of multiplicity can undermine the integrity of statistical inferences. The tutorial reviews key concepts in multiple hypothesis testing and introduces the main classes of methods for addressing multiplicity in a clinical trial setting. General guidelines for the development of relevant and efficient multiple testing procedures are presented on the basis of application-specific clinical and statistical information. Case studies with common multiplicity problems are used to motivate and illustrate the statistical methods presented in the tutorial, and software implementation of the multiplicity adjustment methods is discussed. Copyright © 2013 John Wiley & Sons, Ltd.
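As a concrete illustration of the familywise-error-rate adjustments such a tutorial covers, the sketch below applies the classical Bonferroni and Holm corrections to a small family of hypothetical endpoint p-values; it is a generic illustration, not code from the paper.

```python
# Minimal sketch of two classical familywise-error-rate adjustments
# (Bonferroni and Holm) applied to hypothetical p-values.

def bonferroni(pvalues, alpha=0.05):
    """Reject H_i if p_i <= alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def holm(pvalues, alpha=0.05):
    """Holm step-down: compare ordered p-values to alpha / (m - k)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for k, i in enumerate(order):
        if pvalues[i] <= alpha / (m - k):
            reject[i] = True
        else:
            break          # once one ordered hypothesis is retained, stop
    return reject

p = [0.012, 0.030, 0.041]  # hypothetical endpoint p-values
print(bonferroni(p), holm(p))
```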

2.
During the last decade, many novel approaches for addressing multiplicity problems arising in clinical trials have been introduced in the literature. These approaches provide great flexibility in addressing given clinical trial objectives and yet maintain strong control of the familywise error rate. In this tutorial article, we review multiple testing strategies that are related to the following: (a) recycling local significance levels to test hierarchically ordered hypotheses; (b) adapting the significance level for testing a hypothesis to the findings of testing previous hypotheses within a given test sequence, also in view of certain consistency requirements; (c) grouping hypotheses into hierarchical families of hypotheses along with recycling the significance level between those families; and (d) graphical methods that permit repeated recycling of the significance level. These four different methodologies are related to each other, and we point out some connections as we describe and illustrate them. By contrasting the main features of these approaches, our objective is to help practicing statisticians to select an appropriate method for their applications. In this regard, we discuss how to apply some of these strategies to clinical trial settings and provide algorithms to calculate critical values and adjusted p-values for their use in practice. The methods are illustrated with several numerical examples. Copyright © 2013 John Wiley & Sons, Ltd.
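The graphical recycling idea in (d) can be sketched as follows, in the spirit of the Bretz et al. (2009) algorithm: each hypothesis carries a local weight, and when a hypothesis is rejected its weight is propagated to the remaining hypotheses via a transition matrix. The weights, transition matrix, and p-values below are hypothetical, and this is a generic sketch rather than the authors' own implementation.

```python
# Sketch of a graphical multiple testing procedure: local weights w, a
# transition matrix G, and repeated recycling of the significance level
# of each rejected hypothesis to the remaining ones.

import numpy as np

def graphical_procedure(p, w, G, alpha=0.05):
    p, w, G = np.asarray(p, float), np.asarray(w, float), np.asarray(G, float)
    active = list(range(len(p)))
    rejected = []
    while True:
        cand = [j for j in active if p[j] <= w[j] * alpha]
        if not cand:
            return rejected
        j = cand[0]                      # reject an eligible hypothesis
        rejected.append(j)
        active.remove(j)
        for l in active:                 # recycle the weight of H_j
            w[l] = w[l] + w[j] * G[j, l]
        newG = G.copy()                  # update transitions among remaining hypotheses
        for l in active:
            for k in active:
                if l == k:
                    continue
                denom = 1.0 - G[l, j] * G[j, l]
                newG[l, k] = (G[l, k] + G[l, j] * G[j, k]) / denom if denom > 0 else 0.0
        G = newG

# hypothetical example: two primary and two secondary hypotheses
p = [0.01, 0.04, 0.03, 0.20]
w = [0.5, 0.5, 0.0, 0.0]
G = [[0.0, 0.5, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5],
     [0.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0]]
print(graphical_procedure(p, w, G))
```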

3.
Seamless phase II/III clinical trials in which an experimental treatment is selected at an interim analysis have been the focus of much recent research interest. Many of the methods proposed are based on the group sequential approach. This paper considers designs of this type in which the treatment selection can be based on short-term endpoint information for more patients than have primary endpoint data available. We show that in such a case, the familywise type I error rate may be inflated if previously proposed group sequential methods are used and the treatment selection rule is not specified in advance. A method is proposed to avoid this inflation by considering the treatment selection that maximises the conditional error given the data available at the interim analysis. A simulation study is reported that illustrates the type I error rate inflation and compares the power of the new approach with two other methods: a combination testing approach and a group sequential method that does not use the short-term endpoint data, both of which also strongly control the type I error rate. The new method is also illustrated through application to a study in Alzheimer's disease. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

4.
Subgroup analyses are an essential part of fully understanding the complete results from confirmatory clinical trials. However, they come with substantial methodological challenges. In case no statistically significant overall treatment effect is found in a clinical trial, this does not necessarily indicate that no patients will benefit from treatment. Subgroup analyses could be conducted to investigate whether a treatment might still be beneficial for particular subgroups of patients. Assessment of the level of evidence associated with such subgroup findings is essential, as it may form the basis for performing a new clinical trial or even drawing the conclusion that a specific patient group could benefit from a new therapy. Previous research addressed the overall type I error and the power associated with a single subgroup finding for continuous outcomes and suitable replication strategies. The current study aims at investigating two scenarios as part of a nonconfirmatory strategy in a trial with dichotomous outcomes: (a) when a covariate of interest is represented by ordered subgroups, e.g., in the case of biomarkers, so that a trend can be studied that may reflect an underlying mechanism, and (b) when multiple covariates, and thus multiple subgroups, are investigated at the same time. Based on simulation studies, this paper assesses the credibility of subgroup findings in overall nonsignificant trials and provides practical recommendations for evaluating the strength of evidence of subgroup findings in these settings.

5.
In a previous paper we studied a two-stage group sequential procedure (GSP) for testing primary and secondary endpoints where the primary endpoint serves as a gatekeeper for the secondary endpoint. We assumed a simple setup of a bivariate normal distribution for the two endpoints with the correlation coefficient ρ between them being either an unknown nuisance parameter or a known constant. Under the former assumption, we used the least favorable value of ρ = 1 to compute the critical boundaries of a conservative GSP. Under the latter assumption, we computed the critical boundaries of an exact GSP. However, neither assumption is very practical. The ρ = 1 assumption is too conservative, resulting in loss of power, whereas the known-ρ assumption is never true in practice. In this part I of a two-part paper on adaptive extensions of this two-stage procedure (part II deals with sample size re-estimation), we propose an intermediate approach that uses the sample correlation coefficient r from the first-stage data to adaptively adjust the secondary boundary after accounting for the sampling error in r via an upper confidence limit on ρ, using a method due to Berger and Boos. We show via simulation that this approach achieves a 5–11% absolute secondary power gain for ρ ≤ 0.5. The preferred boundary combination in terms of high primary as well as secondary power is that of O'Brien and Fleming for the primary and of Pocock for the secondary. The proposed approach using this boundary combination achieves a 72–84% relative secondary power gain (with respect to the exact GSP that assumes known ρ). We give a clinical trial example to illustrate the proposed procedure. Copyright © 2012 John Wiley & Sons, Ltd.
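One ingredient of such an adaptive adjustment can be illustrated with a simple sketch: an upper confidence limit on ρ computed from the first-stage sample correlation r via the Fisher z-transformation. This is only illustrative; the paper combines such a limit with the Berger–Boos device, and its exact construction may differ.

```python
# Illustrative sketch (not the paper's exact method): a one-sided upper
# confidence limit on the correlation rho from a first-stage sample
# correlation r, using the Fisher z-transformation.

import math
from scipy.stats import norm

def upper_confidence_limit_rho(r, n, gamma=0.001):
    """One-sided (1 - gamma) upper confidence limit for rho."""
    z = math.atanh(r)                       # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)             # approximate standard error
    z_upper = z + norm.ppf(1.0 - gamma) * se
    return math.tanh(z_upper)

# e.g. a hypothetical first-stage sample correlation of 0.3 from 100 patients
print(upper_confidence_limit_rho(0.3, 100))
```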

6.
We address the design of two-stage clinical trials comparing experimental and control patients. Our end point is success or failure, however measured, with the null hypothesis that the chance of success in both arms is p0 and the alternative that it is p0 among controls and p1 > p0 among experimental patients. Standard rules will have the null hypothesis rejected when the number of successes in the (E)xperimental arm, E, sufficiently exceeds C, that among (C)ontrols. Here, we combine one-sample rejection decision rules with two-sample rules of the form E − C > r to achieve two-sample tests with low sample number and low type I error. We find designs with sample numbers not far from the minimum possible using standard two-sample rules, but with type I error of 5% rather than the 15% or 20% associated with them, and of equal power. This level of type I error is achieved locally, near the stated null, and increases to 15% or 20% when the null is significantly higher than specified. We increase the attractiveness of these designs to patients by using 2:1 randomization. Examples of the application of this new design covering both high and low success rates under the null hypothesis are provided. Copyright © 2017 John Wiley & Sons, Ltd.
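The exact type I error calculation behind a two-sample rule of the form E − C > r can be sketched by enumerating the joint binomial distribution under the null. The sample sizes, p0, and threshold r below are hypothetical, and the sketch ignores the two-stage and one-sample components of the actual designs.

```python
# Sketch: exact null rejection probability of the two-sample rule E - C > r
# under 2:1 randomization, with E ~ Bin(nE, p0) and C ~ Bin(nC, p0).

from scipy.stats import binom

def type_one_error(nE, nC, p0, r):
    prob = 0.0
    for e in range(nE + 1):
        for c in range(nC + 1):
            if e - c > r:
                prob += binom.pmf(e, nE, p0) * binom.pmf(c, nC, p0)
    return prob

# hypothetical design: 40 experimental vs 20 control patients, p0 = 0.2
print(type_one_error(nE=40, nC=20, p0=0.2, r=6))
```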

7.
Comparing several treatments with a control is a common objective of clinical studies. However, existing procedures mainly deal with particular families of inferences in which all hypotheses are either one- or two-sided. In this article, we seek to develop a procedure which copes with a more general testing environment in which the family of inferences is composed of a mixture of one- and two-sided hypotheses. The proposed procedure provides a more flexible and powerful tool than the existing method. The superiority of this method is also substantiated by a simulation study of average power. Selected critical values are tabulated for the implementation of the proposed procedure. Finally, we provide an illustrative example with sample data extracted from a medical experiment.

8.
The increase in incidence of obesity and chronic diseases and their health care costs has raised the importance of quality diet on health policy agendas. The Healthy Eating Index is an important measure of diet quality; it consists of 12 components derived from ratios of dependent variables whose distributions are hard to specify, with measurement errors and excessive zero observations that are difficult to model parametrically. Hypothesis testing involving data of such a nature poses challenges because widely used multiple comparison procedures such as Hotelling's T² test and the Bonferroni correction may suffer from substantial loss of efficiency. We propose a marginal rank-based inverse normal transformation approach to normalizing the marginal distribution of the data before employing a multivariate test procedure. Extensive simulation was conducted to demonstrate the ability of the proposed approach to adequately control the type I error rate as well as increase the power of the test, particularly with data from non-symmetric or heavy-tailed distributions. The methods are exemplified with data from a dietary intervention study for type I diabetic children. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.
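A minimal sketch of the marginal rank-based inverse normal transformation is given below, using the common Blom offset c = 3/8; the paper's exact variant and downstream multivariate test may differ, and the diet-score matrix is simulated purely for illustration.

```python
# Sketch of a marginal rank-based inverse normal transformation (INT):
# each variable is replaced by normal quantiles of its (shifted) ranks
# before a multivariate test is applied.

import numpy as np
from scipy.stats import norm, rankdata

def inverse_normal_transform(x, c=3.0 / 8.0):
    x = np.asarray(x, float)
    ranks = rankdata(x)                       # average ranks for ties
    return norm.ppf((ranks - c) / (len(x) - 2.0 * c + 1.0))

# apply column-wise to a hypothetical (subjects x components) diet-quality matrix
scores = np.random.default_rng(0).gamma(2.0, size=(50, 12))
transformed = np.apply_along_axis(inverse_normal_transform, 0, scores)
print(transformed.shape)
```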

9.
Shun Z, Chi E, Durrleman S, Fisher L. Statistics in Medicine 2005; 24(11):1619-37; discussion 1639-56.
As a regulatory strategy, it is nowadays not uncommon to conduct one confirmatory pivotal clinical trial, instead of two, to demonstrate efficacy and safety in drug development. This paper is intended to investigate the statistical foundation of such an approach. The one-study approach is compared with the conventional two-study approach in terms of power, type-I error, and fundamental statistical assumptions. Necessary requirements for a single-study model are provided in order to maintain evidence equivalent to that from a two-study model. In general, a one-study model is valid only under a 'one-population' assumption. In addition, higher data quality and more convincing and robust results need to be demonstrated in such cases. However, when the 'one-population' assumption is valid and appropriate methods are selected, a one-study model can have better power using the same sample size. The paper also investigates statistical assumptions and methods for making an overall inference when a two-study model has been used. The methods for integrated analysis are evaluated. It is important for statisticians to select the correct pooling strategy based on the project objective and statistical hypothesis.
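A back-of-envelope power comparison (not from the paper) illustrates why a one-study model can have better power at the same total sample size: two independent trials, each tested at one-sided α = 0.025, are compared with a single trial twice as large tested at α = 0.025², so that the overall type I error is matched. The effect size and sample size below are hypothetical.

```python
# Illustrative comparison of "two trials, each significant" vs "one larger
# trial at the matched overall alpha" for a one-sided z-test with Normal data.

import math
from scipy.stats import norm

def power_one_trial(n_per_arm, delta, sigma, alpha):
    ncp = delta / (sigma * math.sqrt(2.0 / n_per_arm))   # noncentrality of the z-test
    return norm.sf(norm.ppf(1 - alpha) - ncp)

n, delta, sigma = 100, 0.4, 1.0
two_trials = power_one_trial(n, delta, sigma, 0.025) ** 2        # both must succeed
one_trial = power_one_trial(2 * n, delta, sigma, 0.025 ** 2)     # matched type I error
print(two_trials, one_trial)
```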

10.
Multiple significance testing involving multiple phenotypes is not uncommon in the context of gene association studies but has remained largely unaddressed. If no adjustment is made for the multiple tests conducted, the type I error probability will exceed the nominal (per test) alpha level. Nevertheless, many investigators do not implement such adjustments. This may, in part, be because most available methods for adjusting the alpha rate either: 1) do not take the correlation structure among the variables into account and, therefore, tend to be overly stringent; or 2) do not allow statements to be made about specific variables but only about multivariate composites of variables. In this paper we develop a simulation-based method and computer program that holds the actual alpha rate to the nominal alpha rate but takes the correlation structure into account. We show that this method is more powerful than several common alternative approaches and that this power advantage increases as the number of variables and their intercorrelations increase. The method appears robust to marked non-normality and variance heterogeneity even with unequal numbers of subjects in each group. The fact that gene association studies with biallelic loci will have (at most) three groups (i.e., AA, Aa, aa) implies by the closure principle that, after detection of a significant result for a specific variable, pairwise comparisons for that variable can be conducted without further adjustment of the alpha level. Genet. Epidemiol. 15:87–101, 1998. © 1998 Wiley-Liss, Inc.
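The general idea of a simulation-based per-test threshold that respects the correlation structure can be sketched as a minP-type adjustment: simulate correlated test statistics under the global null and take the α-quantile of the smallest p-value. This is in the same spirit as, but not identical to, the authors' program; the correlation matrix below is hypothetical.

```python
# Sketch of a simulation-based per-test significance threshold that accounts
# for the correlation among phenotypes (minP-type adjustment).

import numpy as np
from scipy.stats import norm

def simulated_per_test_alpha(corr, alpha=0.05, nsim=100_000, seed=1):
    rng = np.random.default_rng(seed)
    m = corr.shape[0]
    z = rng.multivariate_normal(np.zeros(m), corr, size=nsim)
    p_min = 2.0 * norm.sf(np.abs(z)).min(axis=1)   # smallest two-sided p per replicate
    return np.quantile(p_min, alpha)               # per-test threshold keeping FWER near alpha

corr = np.full((4, 4), 0.6)
np.fill_diagonal(corr, 1.0)
print(simulated_per_test_alpha(corr))              # larger than Bonferroni's 0.05/4
```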

11.
Between-group comparison based on the restricted mean survival time (RMST) is getting attention as an alternative to the conventional logrank/hazard ratio approach for time-to-event outcomes in randomized controlled trials (RCTs). The validity of the commonly used nonparametric inference procedure for RMST has been well supported by large-sample theory. However, we sometimes encounter cases with a small sample size in practice, where we cannot rely on the large-sample properties. Generally, the permutation approach can be useful for handling these situations in RCTs. However, a numerical issue arises when implementing permutation tests for the difference or ratio of RMST from two groups. In this article, we discuss the numerical issue and consider six permutation methods for comparing survival time distributions between two groups using RMST in the RCT setting. We conducted extensive numerical studies and assessed the type I error rates of these methods. Our numerical studies demonstrated that the inflation of the type I error rate of the asymptotic methods is not negligible when the sample size is small, and that all six permutation methods are workable solutions. Although some permutation methods became a little conservative, no remarkable inflation of the type I error rates was observed. We recommend using permutation tests instead of the asymptotic tests, especially when the sample size is less than 50 per arm.
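A basic "permute the group labels" version of such a test can be sketched as below, with the Kaplan-Meier/RMST computation written out directly; the article studies six permutation variants, of which this corresponds only to the simplest, and the data are hypothetical.

```python
# Sketch of a permutation test for the difference in restricted mean survival
# time (RMST) between two arms, using hypothetical data.

import numpy as np

def rmst(time, event, tau):
    """Area under the Kaplan-Meier curve up to the truncation time tau."""
    order = np.argsort(time)
    time, event = np.asarray(time, float)[order], np.asarray(event)[order]
    surv, last_t, area, at_risk = 1.0, 0.0, 0.0, len(time)
    for t, d in zip(time, event):
        if t > tau:
            break
        area += surv * (t - last_t)      # survival is constant between observed times
        if d:                            # an event makes the KM curve drop
            surv *= 1.0 - 1.0 / at_risk
        last_t = t
        at_risk -= 1
    return area + surv * (tau - last_t)

def rmst_permutation_test(time, event, group, tau, nperm=2000, seed=0):
    """Two-sided permutation p-value for the RMST difference (group 1 minus group 0)."""
    time, event, group = map(np.asarray, (time, event, group))
    rng = np.random.default_rng(seed)

    def diff(g):
        return (rmst(time[g == 1], event[g == 1], tau)
                - rmst(time[g == 0], event[g == 0], tau))

    observed = diff(group)
    perms = np.array([diff(rng.permutation(group)) for _ in range(nperm)])
    return observed, float(np.mean(np.abs(perms) >= abs(observed)))

# usage with a small hypothetical data set (time, event indicator, arm)
t = np.array([5, 8, 12, 3, 9, 15, 7, 2, 11, 6], float)
e = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])
g = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
print(rmst_permutation_test(t, e, g, tau=10))
```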

12.
Dmitrienko et al. (Statist. Med. 2007; 26:2465-2478) proposed a tree gatekeeping procedure for testing logically related hypotheses in hierarchically ordered families, which uses weighted Bonferroni tests for all intersection hypotheses in a closure method by Marcus et al. (Biometrika 1976; 63:655-660). An algorithm was given to assign weights to the hypotheses for every intersection. The purpose of this note is to show that any weight assignment algorithm that satisfies a set of sufficient conditions can be used in this procedure to guarantee gatekeeping and independence properties. The algorithm used in Dmitrienko et al. (Statist. Med. 2007; 26:2465-2478) may fail to meet one of the conditions, namely monotonicity of weights, which may cause it to violate the gatekeeping property. An example is given to illustrate this phenomenon. A modification of the algorithm is shown to rectify this problem.

13.
The multiplicity problem has become increasingly important in genetic studies as the capacity for high-throughput genotyping has increased. Control of the False Discovery Rate (FDR) (Benjamini and Hochberg [1995] J. R. Stat. Soc. Ser. B 57:289-300) has been adopted to address the problems of false positive control and low power inherent in high-volume genome-wide linkage and association studies. In many genetic studies, there is often a natural stratification of the m hypotheses to be tested. Given the FDR framework and the presence of such stratification, we investigate the performance of a stratified false discovery control approach (i.e., control or estimate FDR separately for each stratum) and compare it to the aggregated method (i.e., consider all hypotheses in a single stratum). Under the fixed rejection region framework (i.e., reject all hypotheses with unadjusted p-values less than a pre-specified level and then estimate FDR), we demonstrate that the aggregated FDR is a weighted average of the stratum-specific FDRs. Under the fixed FDR framework (i.e., reject as many hypotheses as possible while controlling FDR at a pre-specified level), we specify a condition necessary for the expected total number of true positives under the stratified FDR method to be equal to or greater than that obtained from the aggregated FDR method. Application to a recent Genome-Wide Association (GWA) study by Maraganore et al. ([2005] Am. J. Hum. Genet. 77:685-693) illustrates the potential advantages of control or estimation of FDR by stratum. Our analyses also show that controlling FDR at a low rate, e.g. 5% or 10%, may not be feasible for some GWA studies.
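The contrast between aggregated and stratified FDR control can be sketched with the Benjamini-Hochberg step-up procedure applied once to all p-values versus separately within each stratum; the p-values and strata below are hypothetical.

```python
# Sketch: aggregated vs stratified Benjamini-Hochberg FDR control.

import numpy as np

def benjamini_hochberg(pvalues, q=0.05):
    """Boolean rejection vector for the BH step-up procedure at FDR level q."""
    p = np.asarray(pvalues, float)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()          # largest ordered index meeting its threshold
        reject[order[: k + 1]] = True
    return reject

p = np.array([0.001, 0.02, 0.04, 0.30, 0.004, 0.03, 0.60, 0.80])
stratum = np.array([0, 0, 0, 0, 1, 1, 1, 1])    # e.g. candidate-gene vs genome-wide SNPs

aggregated = benjamini_hochberg(p)              # BH on all hypotheses together
stratified = np.zeros(len(p), dtype=bool)
for s in np.unique(stratum):                    # BH within each stratum at the same level
    stratified[stratum == s] = benjamini_hochberg(p[stratum == s])
print(aggregated)
print(stratified)
```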

14.
This article gives an overview of sample size calculations for parallel group and cross-over studies with Normal data. Sample size derivation is given for trials where the objective is to demonstrate superiority, equivalence, non-inferiority, bioequivalence, or estimation to a given precision, for different type I and II errors. It is demonstrated how the different trial objectives influence the null and alternative hypotheses of the trials and how these hypotheses influence the calculations. Sample size tables for the different types of trials and worked examples are given.
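For the parallel-group superiority and non-inferiority cases with Normal data, the familiar per-group formula n = 2σ²(z + z_power)²/δ² can be sketched as follows, with a two-sided α for superiority and a one-sided α plus margin for non-inferiority. The numbers are hypothetical, and conventions (sign of the margin, one- versus two-sided α) vary between texts.

```python
# Sketch of standard per-group sample sizes for a parallel-group trial with
# Normal data: superiority and non-inferiority versions.

import math
from scipy.stats import norm

def n_superiority(delta, sigma, alpha=0.05, power=0.9):
    """Per-group n for a two-sided superiority test of mean difference delta."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil(2 * (sigma * z / delta) ** 2)

def n_noninferiority(delta, margin, sigma, alpha=0.025, power=0.9):
    """Per-group n for a one-sided non-inferiority test with the given margin."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return math.ceil(2 * (sigma * z / (delta + margin)) ** 2)

print(n_superiority(delta=5, sigma=10))                 # hypothetical: ~85 per group
print(n_noninferiority(delta=0, margin=5, sigma=10))    # hypothetical: ~85 per group
```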

15.
Consider a parallel group trial for the comparison of an experimental treatment to a control, where the second-stage sample size may depend on the blinded primary endpoint data as well as on additional blinded data from a secondary endpoint. For the setting of normally distributed endpoints, we demonstrate that this may lead to an inflation of the type I error rate if the null hypothesis holds for the primary but not the secondary endpoint. We derive upper bounds for the inflation of the type I error rate, both for trials that employ random allocation and for those that use block randomization. We illustrate the worst-case sample size reassessment rule in a case study. For both randomization strategies, the maximum type I error rate increases with the effect size in the secondary endpoint and the correlation between endpoints. The maximum inflation increases with smaller block sizes if information on the block size is used in the reassessment rule. Based on our findings, we do not question the well-established use of blinded sample size reassessment methods with nuisance parameter estimates computed from the blinded interim data of the primary endpoint. However, we demonstrate that the type I error rate control of these methods relies on the application of specific, binding, pre-planned and fully algorithmic sample size reassessment rules and does not extend to general or unplanned sample size adjustments based on blinded data. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

16.
In this part II of the paper on adaptive extensions of a two-stage group sequential procedure (GSP) for testing primary and secondary endpoints, we focus on second-stage sample size re-estimation based on the first-stage data. First, we show that if we use the Cui–Hung–Wang (CHW) statistics at the second stage, then we can use the same primary and secondary boundaries as for the original procedure (without sample size re-estimation) and still control the familywise type I error rate. This extends their result for the single-endpoint case. We further show that the secondary boundary can be sharpened in this case by taking the unknown correlation coefficient ρ between the primary and secondary endpoints into account through the use of the confidence limit method proposed in part I of this paper. If we use the sufficient statistics instead of the CHW statistics, then we need to modify both the primary and secondary boundaries; otherwise, the error rate can get inflated. We show how to modify the boundaries of the original group sequential procedure to control the familywise error rate. We provide power comparisons between competing procedures. We illustrate the procedures with a clinical trial example. Copyright © 2012 John Wiley & Sons, Ltd.
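The CHW idea, combining the stage-wise statistics with the pre-planned weights so that the null distribution, and hence the original boundaries, is preserved after sample size re-estimation, can be sketched as follows. The statistics and planned sample sizes below are hypothetical, and the sketch shows only the combination step, not the secondary-boundary sharpening.

```python
# Sketch of a Cui-Hung-Wang-type weighted combination statistic: the
# pre-planned stage weights are used regardless of the re-estimated
# second-stage sample size.

import math

def chw_statistic(z1, z2_incremental, n1_planned, n2_planned):
    """Combine stage-wise z-statistics with the pre-planned stage weights."""
    total = n1_planned + n2_planned
    w1, w2 = n1_planned / total, n2_planned / total
    return math.sqrt(w1) * z1 + math.sqrt(w2) * z2_incremental

# hypothetical: 100 patients planned per stage; z2_incremental is computed
# from the second-stage data alone, whatever its re-estimated size turns out to be
print(chw_statistic(z1=1.8, z2_incremental=2.1, n1_planned=100, n2_planned=100))
```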

17.
The U.S. Food and Drug Administration (FDA) Modernization Act of 1997 has a Section (No. 112) entitled 'Expediting Study and Approval of Fast Track Drugs' (the Act). In 1998, the FDA issued a 'Guidance for Industry: the Fast Track Drug Development Programs' (the FTDD programmes) to meet the requirement of the Act. The purpose of the FTDD programmes is to 'facilitate the development and expedite the review of new drugs that are intended to treat serious or life-threatening conditions and that demonstrate the potential to address unmet medical needs'. Since then many health products have reached patients who suffered from AIDS, cancer, osteoporosis, and many other diseases sooner by utilizing the Fast Track Act and the FTDD programmes. In the meantime several scientific issues have also surfaced when following the FTDD programmes. In this paper we discuss the concept of two kinds of type I errors, namely the 'conditional approval' and the 'final approval' type I errors, and propose statistical methods for controlling them in a new drug submission process.

18.
This paper discusses a new class of multiple testing procedures, tree-structured gatekeeping procedures, with clinical trial applications. These procedures arise in clinical trials with hierarchically ordered multiple objectives, for example, in the context of multiple dose-control tests with logical restrictions or analysis of multiple endpoints. The proposed approach is based on the principle of closed testing and generalizes the serial and parallel gatekeeping approaches developed by Westfall and Krishen (J. Statist. Planning Infer. 2001; 99:25-41) and Dmitrienko et al. (Statist. Med. 2003; 22:2387-2400). The proposed testing methodology is illustrated using a clinical trial with multiple endpoints (primary, secondary and tertiary) and multiple objectives (superiority and non-inferiority testing) as well as a dose-finding trial with multiple endpoints.

19.
This paper presents a simple procedure for clinical trials comparing several arms with control. Demand for streamlining the evaluation of new treatments has led to phase III clinical trials with more arms than would have been used in the past. In such a setting, it is reasonable that some arms may not perform as well as an active control. We introduce a simple procedure that takes advantage of negative results in some comparisons to lessen the required strength of evidence for other comparisons. We evaluate properties analytically and use them to support claims made about multi-arm multi-stage designs. Published 2014. This article is a U.S. Government work and is in the public domain in the USA.

20.
When simultaneously testing multiple hypotheses, the usual approach in the context of confirmatory clinical trials is to control the familywise error rate (FWER), which bounds the probability of making at least one false rejection. In many trial settings, these hypotheses will additionally have a hierarchical structure that reflects the relative importance of and links between different clinical objectives. The graphical approach of Bretz et al. (2009) is a flexible and easily communicable way of controlling the FWER while respecting complex trial objectives and multiple structured hypotheses. However, the FWER can be a very stringent criterion that leads to procedures with low power, and may not be appropriate in exploratory trial settings. This motivates controlling generalized error rates, particularly when the number of hypotheses tested is no longer small. We consider the generalized familywise error rate (k-FWER), which is the probability of making k or more false rejections, as well as the tail probability of the false discovery proportion (FDP), which is the probability that the proportion of false rejections is greater than some threshold. We also consider asymptotic control of the false discovery rate, which is the expectation of the FDP. In this article, we show how to control these generalized error rates when using the graphical approach and its extensions. We demonstrate the utility of the resulting graphical procedures on three clinical trial case studies.
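As a simple reference point for k-FWER control (not one of the graphical procedures developed in the article), the generalized Bonferroni procedure of Lehmann and Romano rejects H_i whenever p_i <= k*alpha/m; the p-values below are hypothetical.

```python
# Sketch of the generalized Bonferroni procedure, which controls the k-FWER
# (probability of k or more false rejections) by enlarging the per-test
# threshold from alpha/m to k*alpha/m.

def k_fwer_bonferroni(pvalues, k=2, alpha=0.05):
    m = len(pvalues)
    return [p <= k * alpha / m for p in pvalues]

p = [0.003, 0.004, 0.02, 0.04, 0.08, 0.50]
print(k_fwer_bonferroni(p, k=2))   # more liberal than ordinary Bonferroni (k = 1)
```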
