Similar Documents (20 results)
1.
Power and sample size for DNA microarray studies (cited 10 times: 0 self-citations, 10 by others)
A microarray study aims at having a high probability of declaring genes to be differentially expressed if they are truly differentially expressed, while keeping the probability of making false declarations of expression acceptably low. Thus, in formal terms, well-designed microarray studies will have high power while controlling type I error risk. Achieving this objective is the purpose of this paper. Here, we discuss conceptual issues and present computational methods for statistical power and sample size in microarray studies, taking account of the multiple testing that is generic to these studies. The discussion encompasses choices of experimental design and replication for a study. Practical examples are used to demonstrate the methods. The examples show forcefully that replication of a microarray experiment can yield large increases in statistical power. The paper refers to cDNA arrays in the discussion and illustrations but the proposed methodology is equally applicable to expression data from oligonucleotide arrays.
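A back-of-the-envelope illustration of the kind of calculation discussed here is sketched below. It is a simplified, hypothetical example (a two-group comparison on the log2 scale, a z-test approximation with known variance, and a Bonferroni-style adjustment across genes) rather than the authors' exact method, and all function names and parameter values are invented for illustration.

```python
# Hypothetical sketch: per-gene power for a two-group microarray comparison
# with a Bonferroni-type adjustment for multiple testing. Assumes a normal
# (z-test) approximation with known standard deviation on the log2 scale.
from math import sqrt
from scipy.stats import norm

def per_gene_power(n_per_group, delta, sigma, n_genes, alpha_overall=0.05):
    """Power to detect a log2 fold change `delta` for one gene when testing
    `n_genes` genes, controlling the family-wise error rate at `alpha_overall`
    via Bonferroni."""
    alpha_gene = alpha_overall / n_genes             # per-gene significance level
    z_crit = norm.ppf(1 - alpha_gene / 2)            # two-sided critical value
    ncp = delta / (sigma * sqrt(2.0 / n_per_group))  # noncentrality of the z-statistic
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)

def replicates_needed(delta, sigma, n_genes, target_power=0.8, alpha_overall=0.05):
    """Smallest number of replicate arrays per group reaching the target power."""
    n = 2
    while per_gene_power(n, delta, sigma, n_genes, alpha_overall) < target_power:
        n += 1
    return n

if __name__ == "__main__":
    # Example: 10,000 genes, a 2-fold change (delta = 1 on the log2 scale), sigma = 0.7
    for n in (3, 5, 8, 12):
        print(n, "replicates/group -> power",
              round(per_gene_power(n, delta=1.0, sigma=0.7, n_genes=10_000), 3))
    print("replicates needed for 80% power:",
          replicates_needed(delta=1.0, sigma=0.7, n_genes=10_000))
```

Even this crude sketch reproduces the abstract's central point: once multiplicity is accounted for, adding replicate arrays raises per-gene power sharply.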

2.
Shao Y, Tseng CH. Statistics in Medicine 2007; 26(23): 4219-4237
DNA microarrays have been widely used for the purpose of simultaneously monitoring a large number of gene expression levels to identify differentially expressed genes. Statistical methods for the adjustment of multiple testing have been discussed extensively in the literature. An important further challenge is the existence of dependence among test statistics due to reasons such as gene co-regulation. To plan large-scale genomic studies, sample size determination with appropriate adjustment for both multiple testing and potential dependency among test statistics is crucial to avoid an abundance of false-positive results and/or serious lack of power. We introduce a general approach for calculating sample sizes for two-way multiple comparisons in the presence of dependence among test statistics to ensure adequate overall power when the false discovery rates are controlled. The usefulness of the proposed method is demonstrated via numerical studies using both simulated data and real data from a well-known study of leukaemia.

3.
High-throughput screening (HTS) is a large-scale hierarchical process in which a large number of chemicals are tested in multiple stages. Conventional statistical analyses of HTS studies often suffer from high testing error rates and soaring costs in large-scale settings. This article develops new methodologies for false discovery rate control and optimal design in HTS studies. We propose a two-stage procedure that determines the optimal numbers of replicates at different screening stages while simultaneously controlling the false discovery rate in the confirmatory stage subject to a constraint on the total budget. The merits of the proposed methods are illustrated using both simulated and real data. We show that, within a limited budget, the proposed screening procedure effectively controls the error rate and the proposed design leads to improved detection power.

4.
The original definitions of false discovery rate (FDR) and false non-discovery rate (FNR) can be understood as the frequentist risks of false rejections and false non-rejections, respectively, conditional on the unknown parameter, while the Bayesian posterior FDR and posterior FNR are conditioned on the data. From a Bayesian point of view, it seems natural to take into account the uncertainties in both the parameter and the data. In this spirit, we propose averaging out the frequentist risks of false rejections and false non-rejections with respect to some prior distribution of the parameters to obtain the average FDR (AFDR) and average FNR (AFNR), respectively. A linear combination of the AFDR and AFNR, called the average Bayes error rate (ABER), is considered as an overall risk. Some useful formulas for the AFDR, AFNR and ABER are developed for normal samples with hierarchical mixture priors. The idea of finding threshold values by minimizing the ABER or controlling the AFDR is illustrated using a gene expression data set. Simulation studies show that the proposed approaches are more powerful and robust than the widely used FDR method.
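The general idea can be illustrated with a minimal sketch under an assumed two-component hierarchical normal model. The paper's exact formulas for the AFDR, AFNR and ABER may differ; here ABER is simply taken as a weighted combination of AFDR and AFNR, and all names and parameter values are invented for illustration.

```python
# Hypothetical sketch of the "average" error-rate idea under a two-component
# hierarchical normal model: with prior probability pi0 a test statistic is
# null, X ~ N(0, 1); otherwise its mean is drawn from N(0, tau^2), so
# marginally X ~ N(0, 1 + tau^2).  Statistics with |X| > t are declared
# significant.  ABER is taken here as a weighted combination of AFDR and AFNR,
# which may differ from the paper's exact definition.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def error_rates(t, pi0=0.9, tau=2.0):
    p_rej_null = 2 * norm.sf(t)                        # P(|X| > t | null)
    p_rej_alt = 2 * norm.sf(t / np.sqrt(1 + tau**2))   # P(|X| > t | non-null)
    rej = pi0 * p_rej_null + (1 - pi0) * p_rej_alt
    acc = 1 - rej
    afdr = pi0 * p_rej_null / rej if rej > 0 else 0.0
    afnr = (1 - pi0) * (1 - p_rej_alt) / acc if acc > 0 else 0.0
    return afdr, afnr

def aber(t, weight=0.5, **kw):
    afdr, afnr = error_rates(t, **kw)
    return weight * afdr + (1 - weight) * afnr

if __name__ == "__main__":
    best = minimize_scalar(aber, bounds=(0.5, 6.0), method="bounded")
    t_star = best.x
    print("threshold minimizing ABER:", round(t_star, 3))
    print("AFDR, AFNR at that threshold:",
          [round(v, 4) for v in error_rates(t_star)])
```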

5.
The main role of high-throughput microarrays today is screening of relevant genes from a large pool of candidate genes. For prioritizing genes for subsequent studies, gene ranking based on the strength of the association with the phenotype is a relevant statistical output. In this article, we propose sample size calculations based on gene ranking and selection using the non-parametric Mann–Whitney–Wilcoxon statistic in microarray experiments. The non-parametric statistic is expected to make gene ranking more robust to deviations from normality and to possible scale changes when different platforms, such as polymerase chain reaction-based platforms, are used in subsequent studies of gene expression. An application to a data set from a clinical study of lymphoma is given. Copyright © 2009 John Wiley & Sons, Ltd.

6.
We propose a method for calculating power and sample size for studies involving interval-censored failure time data that only involves standard software required for fitting the appropriate parametric survival model. We use the framework of a longitudinal study where patients are assessed periodically for a response and the only resultant information available to the investigators is the failure window: the time between the last negative and first positive test results. The survival model is fit to an expanded data set using easily computed weights. We illustrate with a Weibull survival model and a two-group comparison. The investigator can specify a group difference in terms of a hazards ratio. Our simulation results demonstrate the merits of these proposed power calculations. We also explore how the number of assessments (visits), and thus the corresponding lengths of the failure intervals, affect study power. The proposed method can be easily extended to more complex study designs and a variety of survival and censoring distributions. Copyright © 2015 John Wiley & Sons, Ltd.

7.
Although sample size calculations have become an important element in the design of research projects, such methods for studies involving current status data are scarce. Here, we propose a method for calculating power and sample size for studies using current status data. This method is based on a Weibull survival model for a two-group comparison. The Weibull model allows the investigator to specify a group difference in terms of a hazards ratio or a failure time ratio. We consider exponential, Weibull and uniformly distributed censoring distributions. We base our power calculations on a parametric approach with the Wald test because it is easy for medical investigators to conceptualize and specify the required input variables. As expected, studies with current status data have substantially less power than studies with the usual right-censored failure time data. Our simulation results demonstrate the merits of these proposed power calculations. Copyright © 2009 John Wiley & Sons, Ltd.

8.
Despite our best efforts, missing outcomes are common in randomized controlled clinical trials. The National Research Council's Committee on National Statistics panel report titled The Prevention and Treatment of Missing Data in Clinical Trials noted that further research is required to assess the impact of missing data on the power of clinical trials and how to set useful target rates and acceptable rates of missing data in clinical trials. In this article, using binary responses for illustration, we establish that conclusions based on statistical analyses that include only complete cases can be seriously misleading, and that the adverse impact of missing data grows not only with increasing rates of missingness but also with increasing sample size. We illustrate how principled sensitivity analysis can be used to assess the robustness of the conclusions. Finally, we illustrate how sample sizes can be adjusted to account for expected rates of missingness. We find that when sensitivity analyses are considered as part of the primary analysis, the required adjustments to the sample size are dramatically larger than those that are traditionally used. Furthermore, in some cases, especially in large trials with small target effect sizes, it is impossible to achieve the desired power.
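For reference, the sketch below shows the traditional adjustment the abstract contrasts against: a standard two-proportion sample size inflated by the expected rate of missing outcomes. It is an illustrative calculation with invented numbers, not the authors' sensitivity-analysis-based adjustment, which the paper shows can be dramatically larger.

```python
# Minimal sketch (not the authors' method): standard two-proportion sample
# size, then the traditional inflation for an expected rate of missing
# outcomes.  The paper argues that when sensitivity analyses are part of the
# primary analysis, the required inflation can be dramatically larger.
from math import ceil, sqrt
from scipy.stats import norm

def n_per_arm_two_proportions(p1, p2, alpha=0.05, power=0.9):
    """Approximate sample size per arm for a two-sided two-proportion z-test."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

def inflate_for_missingness(n, missing_rate):
    """Traditional adjustment: keep the expected number of observed outcomes."""
    return ceil(n / (1 - missing_rate))

if __name__ == "__main__":
    n = n_per_arm_two_proportions(p1=0.60, p2=0.50)
    print("complete-data sample size per arm:", n)
    for rate in (0.05, 0.10, 0.20):
        print(f"inflated for {rate:.0%} missingness:", inflate_for_missingness(n, rate))
```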

9.
Recently, Stewart and Ruberg proposed the use of contrast tests for detecting dose-response relationships. They considered in particular bivariate contrasts for healing rates and gave several possibilities for defining adequate sets of coefficients. This paper extends their work in several directions. First, asymptotic power expressions for both single and multiple contrast tests are derived. Secondly, well-known trend tests are rewritten as multiple contrast tests, thus alleviating the inherent problem of choosing adequate contrast coefficients. Thirdly, recent results on the efficient calculation of multivariate normal probabilities replace the traditional simulation-based methods for the numerical computations. Modifications of the power formulae allow the calculation of sample sizes for given type I and II errors, the spontaneous rate, and the dose-response shape. Some numerical results of a power study for small to moderate sample sizes show that the nominal power is a reasonably good approximation to the actual power. An example from a clinical trial illustrates the practical use of the results.
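A minimal sketch of the single-contrast building block is given below, assuming a normal approximation to binomial healing rates and a linear trend contrast chosen for illustration; the multiple contrast tests in the paper additionally require multivariate normal probabilities, which are not covered here.

```python
# Hypothetical sketch: asymptotic power of a single contrast test applied to
# healing rates across dose groups (normal approximation to the binomial).
# The paper's multiple contrast tests additionally need multivariate normal
# probabilities; this sketch only covers the single-contrast power formula,
# using the variance under the alternative as a simple approximation.
import numpy as np
from scipy.stats import norm

def contrast_test_power(rates, contrast, n_per_group, alpha=0.05):
    """One-sided asymptotic power of the contrast test sum(c_i * p_hat_i) > 0."""
    p = np.asarray(rates, dtype=float)
    c = np.asarray(contrast, dtype=float)
    # standard error of the estimated contrast under the alternative
    se = np.sqrt(np.sum(c**2 * p * (1 - p) / n_per_group))
    z_crit = norm.ppf(1 - alpha)
    return norm.sf(z_crit - np.dot(c, p) / se)

if __name__ == "__main__":
    # placebo plus three doses, a monotone dose-response in healing rates,
    # and a simple linear trend contrast (coefficients sum to zero)
    rates = [0.30, 0.40, 0.50, 0.55]
    contrast = [-3, -1, 1, 3]
    for n in (20, 40, 80):
        print(n, "patients per group -> power",
              round(contrast_test_power(rates, contrast, n), 3))
```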

10.
The multiplicity problem has become increasingly important in genetic studies as the capacity for high-throughput genotyping has increased. The control of False Discovery Rate (FDR) (Benjamini and Hochberg. [1995] J. R. Stat. Soc. Ser. B 57:289-300) has been adopted to address the problems of false positive control and low power inherent in high-volume genome-wide linkage and association studies. In many genetic studies, there is often a natural stratification of the m hypotheses to be tested. Given the FDR framework and the presence of such stratification, we investigate the performance of a stratified false discovery control approach (i.e. control or estimate FDR separately for each stratum) and compare it to the aggregated method (i.e. consider all hypotheses in a single stratum). Under the fixed rejection region framework (i.e. reject all hypotheses with unadjusted p-values less than a pre-specified level and then estimate FDR), we demonstrate that the aggregated FDR is a weighted average of the stratum-specific FDRs. Under the fixed FDR framework (i.e. reject as many hypotheses as possible and meanwhile control FDR at a pre-specified level), we specify a condition necessary for the expected total number of true positives under the stratified FDR method to be equal to or greater than that obtained from the aggregated FDR method. Application to a recent Genome-Wide Association (GWA) study by Maraganore et al. ([2005] Am. J. Hum. Genet. 77:685-693) illustrates the potential advantages of control or estimation of FDR by stratum. Our analyses also show that controlling FDR at a low rate, e.g. 5% or 10%, may not be feasible for some GWA studies.
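The stratified-versus-aggregated comparison can be illustrated with a small simulation, sketched below using the Benjamini and Hochberg step-up procedure on two invented strata (an enriched candidate set and a larger genome-wide scan); it is not the authors' analysis of the Maraganore et al. data.

```python
# Illustrative sketch (simulated data, not the authors' analysis): the
# Benjamini-Hochberg (BH) procedure applied to all hypotheses at once versus
# separately within two strata that differ in their proportion of true signals.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def bh_reject(pvals, q=0.05):
    """Boolean mask of rejections for the BH step-up procedure at level q."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

def simulate_stratum(m, prop_signal, effect=3.0):
    is_signal = rng.random(m) < prop_signal
    z = rng.normal(loc=np.where(is_signal, effect, 0.0))
    return 2 * norm.sf(np.abs(z)), is_signal

# stratum 1: candidate-gene SNPs (enriched); stratum 2: the remaining scan
p1, s1 = simulate_stratum(1_000, prop_signal=0.05)
p2, s2 = simulate_stratum(50_000, prop_signal=0.001)

agg = bh_reject(np.concatenate([p1, p2]))
strat = np.concatenate([bh_reject(p1), bh_reject(p2)])
truth = np.concatenate([s1, s2])

for name, rej in (("aggregated BH", agg), ("stratified BH", strat)):
    tp = np.sum(rej & truth)
    fdp = np.sum(rej & ~truth) / max(np.sum(rej), 1)
    print(f"{name}: true positives = {tp}, false discovery proportion = {fdp:.3f}")
```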

11.
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination.
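For logistic regression with a single continuous covariate, the recipe described above can be sketched as follows; the numbers and helper names are hypothetical, and the two-proportion formula used at the end is a standard approximation rather than the authors' exact implementation.

```python
# Sketch of the equivalent two-sample recipe described above, for logistic
# regression with a single continuous covariate: two equally sized groups whose
# logits differ by (slope x 2 x SD of the covariate), with the intercept chosen
# so that the overall event probability is unchanged; the sample size then
# comes from a standard two-proportion formula.  Illustrative only.
from math import ceil, sqrt
from scipy.optimize import brentq
from scipy.special import expit
from scipy.stats import norm

def equivalent_two_groups(beta, sd_x, p_overall):
    """Event probabilities of the equivalent two-sample problem."""
    delta = beta * sd_x  # half of the logit difference between the two groups
    f = lambda c: 0.5 * (expit(c + delta) + expit(c - delta)) - p_overall
    c = brentq(f, -20, 20)   # intercept keeping the overall event rate fixed
    return expit(c - delta), expit(c + delta)

def n_per_group(p1, p2, alpha=0.05, power=0.8):
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

if __name__ == "__main__":
    # logistic slope 0.4 per unit of a covariate with SD 1.2, overall event rate 0.3
    p_lo, p_hi = equivalent_two_groups(beta=0.4, sd_x=1.2, p_overall=0.30)
    print("equivalent group probabilities:", round(p_lo, 3), round(p_hi, 3))
    print("total sample size:", 2 * n_per_group(p_lo, p_hi))
```

The same sample size then serves as an approximation for the original regression problem, which is the simplification the paper evaluates by simulation.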

12.
Microarrays are used increasingly to identify genes that are truly differentially expressed in tissues under different conditions. Planning such studies requires establishing a sample size that will ensure adequate statistical power. For microarray analyses, the false discovery rate (FDR) is considered to be an appropriate error measure, and several FDR-controlling procedures have been developed. How these procedures perform for such analyses has not been evaluated thoroughly under realistic assumptions. In order to develop a method of determining sample sizes for these procedures, it needs to be established whether they really control the FDR below the pre-specified level, so that the determined sample size indeed provides adequate power. To answer this question, we first conducted simulation studies. Our simulation results showed that these procedures do control the FDR in most situations but under-control the FDR when the proportion of positive genes is small, which is the most likely scenario in practice. Thus, these existing procedures can overestimate the power and underestimate the sample size. Accordingly, we developed a simulation-based method to provide more accurate estimates of power and sample size.
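A simulation of the kind described can be sketched as follows: estimate the realized FDR and the average power of the Benjamini–Hochberg procedure for a given number of arrays per group, and embed this in a search over sample sizes. The data-generating assumptions (independent normal genes, a fixed effect size) are invented for illustration and are simpler than those in the paper.

```python
# Illustrative simulation (not the authors' exact procedure): estimate the
# realized FDR and the average power of the Benjamini-Hochberg procedure for a
# given number of arrays per group; this can be embedded in a search over
# sample sizes to find the smallest n reaching a target average power.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(7)

def bh_reject(p, q=0.05):
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rej = np.zeros(m, dtype=bool)
    rej[order[:k]] = True
    return rej

def simulate(n_per_group, n_genes=2_000, prop_de=0.05, effect=1.0, sigma=0.7,
             q=0.05, n_sim=50):
    fdr, power = [], []
    n_de = int(n_genes * prop_de)
    for _ in range(n_sim):
        shift = np.r_[np.full(n_de, effect), np.zeros(n_genes - n_de)]
        x = rng.normal(size=(n_genes, n_per_group)) * sigma
        y = rng.normal(size=(n_genes, n_per_group)) * sigma + shift[:, None]
        p = ttest_ind(x, y, axis=1).pvalue
        rej = bh_reject(p, q)
        truth = shift != 0
        fdr.append(np.sum(rej & ~truth) / max(np.sum(rej), 1))
        power.append(np.mean(rej[truth]))
    return np.mean(fdr), np.mean(power)

if __name__ == "__main__":
    for n in (4, 8, 12):
        f, pw = simulate(n)
        print(f"{n} arrays/group: realized FDR = {f:.3f}, average power = {pw:.3f}")
```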

13.
Correct selection of prognostic biomarkers among multiple candidates is becoming increasingly challenging as the dimensionality of biological data becomes higher. Therefore, minimizing the false discovery rate (FDR) is of primary importance, while a low false negative rate (FNR) is a complementary measure. The lasso is a popular selection method in Cox regression, but its results depend heavily on the penalty parameter λ. Usually, λ is chosen using the maximum cross-validated log-likelihood (max-cvl). However, this method often has a very high FDR. We review methods for a more conservative choice of λ. We propose an empirical extension of the cvl by adding a penalization term that trades off goodness-of-fit against the parsimony of the model, leading to the selection of fewer biomarkers and, as we show, to a reduction of the FDR without a large increase in the FNR. We conducted a simulation study considering null and moderately sparse alternative scenarios and compared our approach with the standard lasso and 10 other competitors: Akaike information criterion (AIC), corrected AIC, Bayesian information criterion (BIC), extended BIC, Hannan and Quinn information criterion (HQIC), risk information criterion (RIC), one-standard-error rule, adaptive lasso, stability selection, and percentile lasso. Our extension achieved the best compromise across all the scenarios between a reduction of the FDR and a limited rise in the FNR, followed by the AIC, the RIC, and the adaptive lasso, which performed well in some settings. We illustrate the methods using gene expression data from 523 breast cancer patients. In conclusion, we propose to apply our extension to the lasso whenever a stringent FDR with a limited FNR is targeted. Copyright © 2016 John Wiley & Sons, Ltd.

14.
An improved method of sample size calculation for the one-sample log-rank test is provided. The one-sample log-rank test may be the method of choice if the survival curve of a single treatment group is to be compared with that of a historic control. Such settings arise, for example, in clinical phase-II trials if the response to a new treatment is measured by a survival endpoint. Present sample size formulas for the one-sample log-rank test are based on the number of events to be observed; that is, in order to achieve approximately the desired power for the allocated significance level and effect size, the trial is stopped as soon as a certain critical number of events is reached. We propose a new stopping criterion to be followed. Both approaches are shown to be asymptotically equivalent. For small sample sizes, though, a simulation study indicates that the new criterion might be preferred when planning a corresponding trial. In our simulations, the trial is usually underpowered and the targeted significance level is not fully exploited if the traditional stopping criterion based on the number of events is used, whereas a trial based on the new stopping criterion maintains power with the type-I error rate still controlled. Copyright © 2014 John Wiley & Sons, Ltd.

15.
Chu H, Nie L, Cole SR. Statistics in Medicine 2006; 25(15): 2647-2657
Often in randomized clinical trials and observational cohort studies, a non-negative continuously distributed response variable is measured in treatment and control groups. In the presence of true zeros for the response variable, a two-part zero-inflated log-normal model (which assumes that the data have a probability mass at zero and a continuous response for values greater than zero) is usually recommended. However, in some environmental health and human immunodeficiency virus (HIV) studies, quantitative assays for metabolites of toxicants, or quantitative HIV RNA measurements, are subject to left-censoring due to values falling below the limit of detection (LD). Here, a zero-inflated log-normal mixture model is often suggested, since true zeros are indistinguishable from left-censored values due to the LD. When the probabilities of true zeros in the two groups are not restricted to be equal, the information contributed by values falling below the LD is used only to estimate the probability of true zeros in the context of mixture distributions. We derived the required sample size to assess the effect of a treatment in the context of mixture models with equal and unequal variances based on the left-truncated log-normal distribution. Methods for the calculation of statistical power are also presented. We calculate the required sample size and power for a recent study estimating the effect of oltipraz on reducing urinary levels of the hydroxylated metabolite aflatoxin M1 (AFM1) in a randomized, placebo-controlled, double-blind phase IIa chemoprevention trial in Qidong, China. A Monte Carlo simulation study is conducted to investigate the performance of the proposed methods.

16.
This paper presents a simple Bayesian approach to sample size determination in clinical trials. It is required that the trial should be large enough to ensure that the data collected will provide convincing evidence either that an experimental treatment is better than a control or that it fails to improve upon the control by some clinically relevant difference. The method resembles standard frequentist formulations of the problem, and indeed in certain circumstances involving 'non-informative' prior information it leads to identical answers. In particular, unlike many Bayesian approaches to sample size determination, use is made of an alternative hypothesis that an experimental treatment is better than a control treatment by some specified magnitude. The approach is introduced in the context of testing whether a single stream of binary observations is consistent with a given success rate p0. Next, the case of comparing two independent streams of normally distributed responses is considered, first under the assumption that their common variance is known and then for unknown variance. Finally, the more general situation in which a large sample is to be collected and analysed according to the asymptotic properties of the score statistic is explored.

17.
Various methods have been described for re-estimating the final sample size in a clinical trial based on an interim assessment of the treatment effect. Many re-weight the observations after re-sizing so as to control the pursuant inflation in the type I error probability alpha. Lan and Trost (Estimation of parameters and sample size re-estimation. Proceedings of the American Statistical Association Biopharmaceutical Section 1997; 48-51) proposed a simple procedure based on conditional power calculated under the current trend in the data (CPT). The study is terminated for futility if CPT ≤ CL, continued unchanged if CPT ≥ CU, or re-sized by a factor m to yield CPT = CU if CL < CPT < CU, where CL and CU are pre-specified probability levels. The overall level alpha can be preserved since the reduction due to stopping for futility can balance the inflation due to sample size re-estimation, thus permitting any form of final analysis with no re-weighting. Herein the statistical properties of this approach are described, including an evaluation of the probabilities of stopping for futility or re-sizing, the distribution of the re-sizing factor m, and the unconditional type I and II error probabilities alpha and beta. Since futility stopping does not allow a type I error but commits a type II error, as the probability of stopping for futility increases, alpha decreases and beta increases. An iterative procedure is described for the choice of the critical test value and the futility stopping boundary so as to ensure that the specified alpha and beta are obtained. However, inflation in beta is controlled by reducing the probability of futility stopping, which in turn dramatically increases the possible re-sizing factor m. The procedure is also generalized to limit the maximum sample size inflation factor, such as to m_max = 4. However, doing so then allows a non-trivial fraction of studies to be re-sized at this level while still having low conditional power. These properties also apply to other methods for sample size re-estimation with a provision for stopping for futility. Sample size re-estimation procedures should be used with caution, and the impact on the overall type II error probability should be assessed.
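A sketch of the rule described above, under a standard Brownian-motion approximation for a one-sided z-test, is given below. The notation (t for the information fraction, CL, CU, m) follows the abstract, but the formulas and the numerical illustration are assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of the re-sizing rule described above: conditional power
# under the current trend (CPT) for a one-sided z-test at information fraction
# t, and the factor m by which the remaining sample must be multiplied so that
# CPT equals CU, assuming the conventional unweighted final test is kept.
# Standard Brownian-motion approximation; not necessarily the authors' formulas.
from scipy.stats import norm
from scipy.optimize import brentq

def cpt(z_interim, t, m=1.0, alpha=0.025):
    """Conditional power under the current trend after re-sizing the
    second stage by factor m (m = 1 means no change)."""
    b = z_interim * (t ** 0.5)          # B-value at the interim analysis
    drift = z_interim / (t ** 0.5)      # drift estimated from the current trend
    info2 = m * (1 - t)                 # second-stage information fraction
    z_crit = norm.ppf(1 - alpha)
    num = z_crit * (t + info2) ** 0.5 - b - drift * info2
    return norm.sf(num / info2 ** 0.5)

def resize_factor(z_interim, t, cu=0.9, alpha=0.025, m_max=4.0):
    """Smallest m with CPT = cu, capped at m_max (as in the generalization)."""
    if cpt(z_interim, t, m=1.0, alpha=alpha) >= cu:
        return 1.0
    if cpt(z_interim, t, m=m_max, alpha=alpha) < cu:
        return m_max
    return brentq(lambda m: cpt(z_interim, t, m, alpha) - cu, 1.0, m_max)

if __name__ == "__main__":
    cl, cu = 0.10, 0.90
    for z1 in (0.5, 1.0, 1.5, 2.0):
        c = cpt(z1, t=0.5)
        action = ("stop for futility" if c <= cl else
                  "continue unchanged" if c >= cu else
                  f"re-size by m = {resize_factor(z1, t=0.5):.2f}")
        print(f"interim z = {z1:.1f}: CPT = {c:.3f} -> {action}")
```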

18.
Current analysis of event-related potentials (ERP) data is usually based on the a priori selection of channels and time windows of interest for studying the differences between experimental conditions in the spatio-temporal domain. In this work we put forward a new strategy designed for situations in which there is no a priori information about 'when' and 'where' these differences appear in the spatio-temporal domain, which requires simultaneously testing numerous hypotheses and thereby increases the risk of false positives. This issue is known as the problem of multiple comparisons and has been managed with approaches such as the permutation test and methods that control the false discovery rate (FDR). Although the former has been applied previously, to our knowledge the FDR methods have not yet been introduced into ERP data analysis. Here we compare the performance (on simulated and real data) of the permutation test and two FDR methods (Benjamini and Hochberg (BH) and local-fdr, by Efron). All these methods have been shown to be valid for dealing with the problem of multiple comparisons in the ERP analysis, avoiding the ad hoc selection of channels and/or time windows. FDR methods are a good alternative to the common and computationally more expensive permutation test. The BH method for independent tests gave the best overall performance regarding the balance between type I and type II errors. The local-fdr method is preferable for high-dimensional (multichannel) problems where most of the tests conform to the empirical null hypothesis. Differences among the methods according to assumptions, null distributions and dimensionality of the problem are also discussed. Copyright © 2009 John Wiley & Sons, Ltd.

19.
We present a general framework for sample size calculation in survival studies based on comparing two or more survival distributions using any one of a class of tests including the logrank test. Incorporated within this framework are the possible presence of non-uniform staggered patient entry, non-proportional hazards, loss to follow-up and treatment changes, including cross-over between treatment arms. The framework is very general in nature and is based on using piecewise exponential distributions to model the survival distributions. We illustrate the use of the approach and explore its validity using simulation studies. These studies have shown that not adjusting for loss to follow-up, non-proportional hazards or cross-over can lead to significant alterations in power or, equivalently, a marked effect on sample size. The approach has been implemented in the freely available program ART (for Stata). Our investigations suggest that ART is the first software to allow incorporation of all these elements. Further extensions to the methodology, such as non-local alternatives for the logrank test, are also considered.
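As a point of comparison with the general piecewise-exponential framework, the sketch below covers only the simplest special case (proportional hazards, exponential survival, uniform accrual, no loss to follow-up or cross-over), using Schoenfeld's events formula; all parameter values are invented for illustration.

```python
# A much simpler special case than the framework described above (proportional
# hazards, exponential survival, uniform accrual, no loss to follow-up or
# cross-over): Schoenfeld's formula for the required number of events for the
# logrank test, converted to a sample size via the expected event probability.
from math import ceil, exp, log
from scipy.stats import norm

def events_required(hazard_ratio, alpha=0.05, power=0.9, alloc=0.5):
    """Schoenfeld's approximation to the number of events needed."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return ceil(z ** 2 / (alloc * (1 - alloc) * log(hazard_ratio) ** 2))

def prob_event_exponential(hazard, accrual, follow_up):
    """P(event) under exponential survival, uniform accrual over `accrual`
    time units and `follow_up` further units of follow-up (Simpson's rule)."""
    surv = lambda t: exp(-hazard * t)
    return 1 - (surv(follow_up) + 4 * surv(follow_up + accrual / 2)
                + surv(follow_up + accrual)) / 6

def total_sample_size(h_control, hazard_ratio, accrual, follow_up, **kw):
    d = events_required(hazard_ratio, **kw)
    p_event = 0.5 * (prob_event_exponential(h_control, accrual, follow_up)
                     + prob_event_exponential(h_control * hazard_ratio,
                                              accrual, follow_up))
    return d, ceil(d / p_event)

if __name__ == "__main__":
    # control median survival of 12 months (hazard = log 2 / 12), HR = 0.7,
    # 24 months of accrual and 12 months of additional follow-up
    d, n = total_sample_size(h_control=log(2) / 12, hazard_ratio=0.7,
                             accrual=24, follow_up=12)
    print("events required:", d, "| total patients:", n)
```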

20.
Graf AC, Bauer P. Statistics in Medicine 2011; 30(14): 1637-1647
We calculate the maximum type 1 error rate of the pre-planned conventional fixed sample size test for comparing the means of independent normal distributions (with common known variance) that can arise when the sample size and the allocation rate to the treatment arms can be modified in an interim analysis. It is assumed that the experimenter fully exploits knowledge of the unblinded interim estimates of the treatment effects in order to maximize the conditional type 1 error rate. The 'worst-case' strategies require knowledge of the unknown common treatment effect under the null hypothesis. Although this is a rather hypothetical scenario, it may be approached in practice when using a standard control treatment for which precise estimates are available from historical data. The maximum inflation of the type 1 error rate is substantially larger than that derived by Proschan and Hunsberger (Biometrics 1995; 51:1315-1324) for design modifications applying balanced samples before and after the interim analysis. Corresponding upper limits for the maximum type 1 error rate are calculated for a number of situations arising from practical considerations (e.g. restricting the maximum sample size, not allowing the sample size to decrease, allowing only an increase in the sample size of the experimental treatment). The application is discussed for a motivating example.
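The flavour of the worst-case computation can be conveyed by a sketch restricted to balanced allocation (closer to the Proschan and Hunsberger setting than to this paper's, which additionally allows the allocation rate to change): under the null hypothesis, the experimenter observes the interim z-value and picks the second-stage size, up to a cap, that maximizes the conditional rejection probability of the conventional final test; integrating that maximum over the interim distribution gives the maximum type 1 error rate. Everything below is an illustrative assumption rather than the authors' derivation.

```python
# Hypothetical sketch related to the computation described above, restricted
# to balanced allocation.  Under H0, an experimenter who sees the interim
# z-value may pick any second-stage information fraction r (up to a cap) to
# maximize the conditional rejection probability of the conventional final
# z-test at one-sided level alpha; integrating that maximum over the
# distribution of the interim z-value gives the maximum type 1 error rate.
import numpy as np
from scipy.stats import norm

def conditional_rejection(z1, t, r, alpha=0.025):
    """P(reject | interim z1) under H0 when the second stage contributes
    information fraction r and the conventional unweighted final test is used."""
    b = z1 * np.sqrt(t)                      # B-value at the interim look
    z_crit = norm.ppf(1 - alpha)
    return norm.sf((z_crit * np.sqrt(t + r) - b) / np.sqrt(r))

def max_type1_error(t=0.5, r_max=10.0, alpha=0.025, n_grid=2_000):
    z1 = np.linspace(-6.0, 6.0, n_grid)      # grid over the interim z-value
    r = np.geomspace(1e-4, r_max, 400)       # candidate second-stage sizes
    crp = conditional_rejection(z1[:, None], t, r[None, :], alpha)
    worst = crp.max(axis=1)                  # adversarial choice of r
    dz = z1[1] - z1[0]                       # Riemann sum against the N(0,1) density
    return float(np.sum(worst * norm.pdf(z1)) * dz)

if __name__ == "__main__":
    for t in (0.25, 0.50, 0.75):
        print(f"interim at information fraction {t}: "
              f"maximum type 1 error = {max_type1_error(t):.4f} vs nominal 0.025")
```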

