Similar Articles
20 similar articles retrieved.
1.
The original definitions of false discovery rate (FDR) and false non-discovery rate (FNR) can be understood as the frequentist risks of false rejections and false non-rejections, respectively, conditional on the unknown parameter, while the Bayesian posterior FDR and posterior FNR are conditioned on the data. From a Bayesian point of view, it seems natural to take into account the uncertainties in both the parameter and the data. In this spirit, we propose averaging out the frequentist risks of false rejections and false non-rejections with respect to some prior distribution of the parameters to obtain the average FDR (AFDR) and average FNR (AFNR), respectively. A linear combination of the AFDR and AFNR, called the average Bayes error rate (ABER), is considered as an overall risk. Some useful formulas for the AFDR, AFNR and ABER are developed for normal samples with hierarchical mixture priors. The idea of finding threshold values by minimizing the ABER or controlling the AFDR is illustrated using a gene expression data set. Simulation studies show that the proposed approaches are more powerful and robust than the widely used FDR method.
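In symbols (a sketch; the abstract specifies only "a linear combination", so the single mixing weight $\lambda$ below is an assumption):

\[
\mathrm{AFDR} = \mathbb{E}_{\theta \sim \pi}\!\left[\mathrm{FDR}(\theta)\right], \qquad
\mathrm{AFNR} = \mathbb{E}_{\theta \sim \pi}\!\left[\mathrm{FNR}(\theta)\right], \qquad
\mathrm{ABER} = \lambda\,\mathrm{AFDR} + (1-\lambda)\,\mathrm{AFNR}, \quad \lambda \in [0,1],
\]

where $\pi$ is the prior over the parameter $\theta$; thresholds are then chosen to minimize $\mathrm{ABER}$ or to keep $\mathrm{AFDR}$ below a target level.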

2.
Objectives: Procedures for controlling the false positive rate when performing many hypothesis tests are commonplace in health and medical studies. Such procedures, most notably the Bonferroni adjustment, suffer from the problem that error rate control cannot be localized to individual tests, and that these procedures do not distinguish between exploratory and/or data-driven testing vs. hypothesis-driven testing. Instead, procedures derived from limiting false discovery rates may be a more appealing method to control error rates in multiple tests. Study Design and Setting: Controlling the false positive rate can lead to philosophical inconsistencies that can negatively impact the practice of reporting statistically significant findings. We demonstrate that the false discovery rate approach can overcome these inconsistencies and illustrate its benefit through an application to two recent health studies. Results: The false discovery rate approach is more powerful than methods like the Bonferroni procedure that control false positive rates. Controlling the false discovery rate in a study that arguably consisted of scientifically driven hypotheses found nearly as many significant results as without any adjustment, whereas the Bonferroni procedure found no significant results. Conclusion: Although still unfamiliar to many health researchers, the use of false discovery rate control in the context of multiple testing can provide a solid basis for drawing conclusions about statistical significance.
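As a minimal sketch of the two adjustments being contrasted (illustrative code, not from the study; the toy data and significance level are assumptions):

import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject H_i when p_i <= alpha/m; controls the family-wise error rate."""
    p = np.asarray(pvals)
    return p <= alpha / p.size

def benjamini_hochberg(pvals, alpha=0.05):
    """BH step-up: reject the k smallest p-values, where
    k = max{ i : p_(i) <= i*alpha/m }; controls the FDR at level alpha."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest index satisfying the step-up condition
        reject[order[:k + 1]] = True
    return reject

# Toy example: 90 null p-values plus 10 strong signals.
rng = np.random.default_rng(0)
pvals = np.concatenate([rng.uniform(size=90), rng.uniform(0, 1e-4, size=10)])
print("Bonferroni rejections:", bonferroni(pvals).sum())
print("BH rejections:", benjamini_hochberg(pvals).sum())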

3.
Tong T, Zhao H. Statistics in Medicine 2008; 27(11): 1960-1972.
One major goal in microarray studies is to identify genes having different expression levels across different classes/conditions. In order to achieve this goal, a study needs to have an adequate sample size to ensure the desired power. Owing to the importance of this topic, a number of approaches to sample size calculation have been developed. However, due to the cost and/or experimental difficulties in obtaining sufficient biological materials, it might be difficult to attain the required sample size. In this article, we address more practical questions for assessing power and false discovery rate (FDR) for a fixed sample size. The relationships between power, sample size and FDR are explored. We also conduct simulations and a real data study to evaluate the proposed findings.

4.
The genetic basis of multiple phenotypes such as gene expression, metabolite levels, or imaging features is often investigated by testing a large collection of hypotheses, probing the existence of association between each of the traits and hundreds of thousands of genotyped variants. Appropriate multiplicity adjustment is crucial to guarantee replicability of findings, and the false discovery rate (FDR) is frequently adopted as a measure of global error. In the interest of interpretability, results are often summarized so that reporting focuses on variants discovered to be associated to some phenotypes. We show that applying FDR-controlling procedures on the entire collection of hypotheses fails to control the rate of false discovery of associated variants as well as the expected value of the average proportion of false discovery of phenotypes influenced by such variants. We propose a simple hierarchical testing procedure that allows control of both these error rates and provides a more reliable basis for the identification of variants with functional effects. We demonstrate the utility of this approach through simulation studies comparing various error rates and measures of power for genetic association studies of multiple traits. Finally, we apply the proposed method to identify genetic variants that impact flowering phenotypes in Arabidopsis thaliana, expanding the set of discoveries.
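One concrete instance of such hierarchical testing (a sketch under assumptions; the paper's exact procedure and level adjustments may differ, and it reuses the benjamini_hochberg helper from the sketch under item 2): combine each variant's phenotype p-values into a variant-level p-value with Simes' rule, select variants by BH, then test phenotypes only within the selected variants at a level scaled by the selection fraction.

import numpy as np

def simes(pvec):
    """Simes combination: a valid p-value for the intersection null
    'this variant is associated with none of the phenotypes'."""
    p = np.sort(pvec)
    return np.min(p.size * p / np.arange(1, p.size + 1))

def hierarchical_bh(P, q=0.05):
    """P: (n_variants, n_phenotypes) matrix of p-values.
    Stage 1: BH across variant-level Simes p-values.
    Stage 2: BH across phenotypes within each selected variant, at the
    reduced level q * (#selected / #variants) (an assumed adjustment)."""
    variant_p = np.apply_along_axis(simes, 1, P)
    selected = benjamini_hochberg(variant_p, q)
    q2 = q * selected.sum() / P.shape[0]
    return {i: np.nonzero(benjamini_hochberg(P[i], q2))[0]
            for i in np.nonzero(selected)[0]}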

5.
The multiplicity problem has become increasingly important in genetic studies as the capacity for high-throughput genotyping has increased. The control of False Discovery Rate (FDR) (Benjamini and Hochberg. [1995] J. R. Stat. Soc. Ser. B 57:289-300) has been adopted to address the problems of false positive control and low power inherent in high-volume genome-wide linkage and association studies. In many genetic studies, there is often a natural stratification of the m hypotheses to be tested. Given the FDR framework and the presence of such stratification, we investigate the performance of a stratified false discovery control approach (i.e. control or estimate FDR separately for each stratum) and compare it to the aggregated method (i.e. consider all hypotheses in a single stratum). Under the fixed rejection region framework (i.e. reject all hypotheses with unadjusted p-values less than a pre-specified level and then estimate FDR), we demonstrate that the aggregated FDR is a weighted average of the stratum-specific FDRs. Under the fixed FDR framework (i.e. reject as many hypotheses as possible and meanwhile control FDR at a pre-specified level), we specify a condition necessary for the expected total number of true positives under the stratified FDR method to be equal to or greater than that obtained from the aggregated FDR method. Application to a recent Genome-Wide Association (GWA) study by Maraganore et al. ([2005] Am. J. Hum. Genet. 77:685-693) illustrates the potential advantages of control or estimation of FDR by stratum. Our analyses also show that controlling FDR at a low rate, e.g. 5% or 10%, may not be feasible for some GWA studies.
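Under the fixed rejection region framework, the stated relation can be written compactly (a sketch; the precise form of the weights is an assumption consistent with "weighted average"):

\[
\mathrm{FDR}_{\mathrm{agg}} = \sum_{s} w_s \,\mathrm{FDR}_s, \qquad
w_s = \frac{\mathbb{E}[R_s]}{\sum_t \mathbb{E}[R_t]},
\]

where $R_s$ is the number of rejections in stratum $s$, so strata that reject more hypotheses contribute more to the aggregated FDR.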

6.
Comparative analyses of safety/tolerability data from a typical phase III randomized clinical trial generate multiple p-values associated with adverse experiences (AEs) across several body systems. A common approach is to 'flag' any AE with a p-value less than or equal to 0.05, ignoring the multiplicity problem. Despite the fact that this approach can result in excessive false discoveries (false positives), many researchers avoid a multiplicity adjustment to curtail the risk of missing true safety signals. We propose a new flagging mechanism that significantly lowers the false discovery rate (FDR) without materially compromising the power for detecting true signals, relative to the common no-adjustment approach. Our simple two-step procedure is an enhancement of the Mehrotra-Heyse-Tukey approach that leverages the natural grouping of AEs by body systems. We use simulations to show that, on the basis of FDR and power, our procedure is an attractive alternative to the following: (i) the no-adjustment approach; (ii) a one-step FDR approach that ignores the grouping of AEs by body systems; and (iii) a recently proposed two-step FDR approach for much larger-scale settings such as genome-wide association studies. We use three clinical trial examples for illustration.
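A schematic of a grouped two-step flagging rule in this spirit (illustrative only; the actual Mehrotra-Heyse-Tukey enhancement has details this sketch omits, the levels q1 and q2 are assumptions, and benjamini_hochberg is the helper from item 2): first screen body systems using each system's smallest p-value, then adjust only within the flagged systems.

import numpy as np

def two_step_flag(pvals_by_system, q1=0.10, q2=0.05):
    """pvals_by_system: dict {body_system: array of AE p-values}.
    Step 1: BH across body systems, each represented by its min p-value.
    Step 2: BH across AEs within the flagged systems only."""
    systems = list(pvals_by_system)
    rep = np.array([pvals_by_system[s].min() for s in systems])
    flagged = benjamini_hochberg(rep, q1)
    return {s: np.nonzero(benjamini_hochberg(pvals_by_system[s], q2))[0]
            for s, keep in zip(systems, flagged) if keep}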

7.
One of the main roles of omics-based association studies with high-throughput technologies is to screen relevant molecular features, such as genetic variants, genes, and proteins, out of a large pool of candidate features on the basis of their associations with the phenotype of interest. Typically, screened features are subject to validation studies using more established or conventional assays, where the number of evaluable features is relatively limited, so that only a fixed number of features may be measurable by these assays. Such a limitation necessitates narrowing a feature set down to a fixed size, following an initial screening analysis via multiple testing with adjustment for multiplicity. We propose a two-stage screening approach to control the false discovery rate (FDR) for a feature set of fixed size that is subject to validation studies, rather than for the feature set from the initial screening analysis. Out of the feature set selected in the first stage with a relaxed FDR level, a fraction of features with the most statistical significance is selected first. The remaining features are selected on the basis of biological consideration only, without regard to any statistical information, which allows the FDR level for the finally selected fixed-size feature set to be evaluated. We discuss the improvement in power achieved by the proposed two-stage screening approach. Simulation experiments based on parametric models and real microarray datasets demonstrated a substantial increase in the number of features screened for biological consideration compared with the standard screening approach, allowing more extensive and in-depth biological investigations in omics association studies.

8.
Multiple comparisons of microarray data
Objective: To introduce the false discovery rate (FDR) and related control methods as applied to multiple comparisons of microarray data. Methods: Four FDR-controlling procedures (BH, BL, BY and ALSU) were used to compare the expression of 3226 genes between two groups of breast cancer patients. Results: All four procedures controlled the FDR below 0.05 within their respective ranges of applicability, with power ranked, from highest to lowest, as ALSU > BH > BY > BL. The ALSU procedure, which incorporates an estimate of m0 (the number of true null hypotheses), is the most reasonable choice: it improved power while still keeping false positives well controlled. Conclusion: FDR control must be considered in multiple comparisons of microarray data, alongside the goal of improving power. In multiple comparisons, controlling the FDR yields higher power than controlling the family-wise error rate (FWER) and is more practical.
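The ALSU (adaptive linear step-up) idea can be sketched as follows: estimate the number of true null hypotheses m0 and run the BH step-up at the relaxed level q*m/m0. The m0 estimator below is the Storey-type choice, an assumption; the procedure compared in this abstract may use a different estimator, and benjamini_hochberg is the helper from item 2.

import numpy as np

def adaptive_bh(pvals, q=0.05, lam=0.5):
    """Adaptive BH: plug a Storey-type estimate of m0 into the step-up,
    which relaxes the thresholds to p_(i) <= i*q/m0_hat."""
    p = np.asarray(pvals)
    m = p.size
    m0_hat = min(m, (np.sum(p > lam) + 1) / (1 - lam))
    return benjamini_hochberg(p, q * m / m0_hat)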

9.
The relative dose response (RDR) test has been used as a functional measure of whole-body stores of vitamin A in humans. We have examined the reproducibility of the RDR procedure in a population of Guatemalan adult subjects who would be expected to show a moderate prevalence of hypovitaminosis A. Fifty-one subjects were administered a standard RDR test, and the plasma samples were analyzed for retinol, tocopherol, retinol binding protein (RBP) and prealbumin (PAL). Thirty-four of the subjects underwent repeat RDR tests 7 d later. Plasma levels in fasted subjects were as follows: retinol, 1.35 ± 0.30 μmol/L; RBP, 37.8 ± 7.7 mg/L; PAL, 187.0 ± 39.0 mg/L; and tocopherol, 16.6 ± 6.2 μmol/L. RDRs ranged from -35.2% to +63.1%, with a mean of 2.6 ± 10.4%. Overall, we observed poor within-subject reproducibility of the RDR procedure whether expressed numerically or by diagnostic classification. Moreover, in contrast to previous studies in children, we observed fewer positive RDR tests than would be expected for the population studied. Nevertheless, the mean RDR was inversely proportional to fasting retinol levels, thus confirming the validity of the biological basis of the RDR procedure in humans. Because of high intra-individual variability with this test, investigators should be cautious when using the RDR procedure in serial studies to monitor the efficacy of therapeutic interventions or subject compliance to dietary regimens.

10.
A central issue in genome-wide association (GWA) studies is assessing statistical significance while adjusting for multiple hypothesis testing. An equally important question is the statistical efficiency of the GWA design as compared to the traditional sequential approach, in which genome-wide linkage analysis is followed by region-wise association mapping. Nevertheless, GWA is becoming more popular due in part to cost efficiency: commercially available 1M chips are nearly as inexpensive as a custom-designed 10K chip. It is becoming apparent, however, that most of the ongoing GWA studies with 2,000-5,000 samples are in fact underpowered. As a means to improve power, we emphasize the importance of utilizing prior information, such as results of previous linkage studies, via stratified false discovery rate (FDR) control. The essence of stratified FDR control is to prioritize the genome and maintain power to interrogate candidate regions within the GWA study. These candidate regions can be defined as, but are by no means limited to, linkage-peak regions. Furthermore, we theoretically unify the stratified FDR approach and the weighted P-value method, and we show that stratified FDR can be formulated as a robust version of weighted FDR. Finally, we demonstrate the utility of the methods in two GWA datasets: Type 2 diabetes (FUSION) and an on-going study of long-term diabetic complications (DCCT/EDIC). The methods are implemented as a user-friendly software package, SFDR. The same stratification framework can be readily applied to other types of studies, for example, using GWA results to improve the power of sequencing data analyses. Genet. Epidemiol. 34:107-118, 2010.
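The weighted P-value device referred to here has a compact form (a sketch; the normalization follows the common convention, which is an assumption): given prior weights $w_i > 0$ with $\frac{1}{m}\sum_{i=1}^{m} w_i = 1$, apply the BH procedure to the reweighted p-values $p_i / w_i$. Stratified FDR control then corresponds to weights that are piecewise constant across strata, with up-weighted strata (e.g., linkage-peak regions) easier to reject in.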

11.
It is increasingly recognized that multiple genetic variants, within the same or different genes, combine to affect liability for many common diseases. Indeed, the variants may interact among themselves and with environmental factors. Thus realistic genetic/statistical models can include an extremely large number of parameters, and it is by no means obvious how to find the variants contributing to liability. For models of multiple candidate genes and their interactions, we prove that statistical inference can be based on controlling the false discovery rate (FDR), defined as the expected proportion of false rejections among all rejections. Controlling the FDR automatically controls the overall error rate in the special case that all the null hypotheses are true. So do more standard methods such as Bonferroni correction. However, when some null hypotheses are false, the goals of Bonferroni and FDR differ, and FDR will have better power. Model selection procedures, such as forward stepwise regression, are often used to choose important predictors for complex models. By analysis of simulations of such models, we compare a computationally efficient form of forward stepwise regression against the FDR methods. We show that model selection includes numerous genetic variants having no impact on the trait, whereas FDR maintains a false-positive rate very close to the nominal rate. With good control over false positives and better power than Bonferroni, the FDR-based methods we introduce present a viable means of evaluating complex, multivariate genetic models. Naturally, as for any method seeking to explore complex genetic models, the power of the methods is limited by sample size and model complexity.
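In the now-standard notation, with $V$ false rejections among $R$ total rejections,

\[
\mathrm{FDR} = \mathbb{E}\!\left[\frac{V}{\max(R, 1)}\right],
\]

which equals $\Pr(V \ge 1)$, the family-wise error rate, in the special case that all null hypotheses are true, matching the observation above that FDR control and Bonferroni-type control coincide there.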

12.
Current analysis of event-related potentials (ERP) data is usually based on the a priori selection of channels and time windows of interest for studying the differences between experimental conditions in the spatio-temporal domain. In this work we put forward a new strategy designed for situations when there is no a priori information about 'when' and 'where' these differences appear in the spatio-temporal domain, which requires simultaneously testing numerous hypotheses and thereby increases the risk of false positives. This issue is known as the problem of multiple comparisons and has been managed with methods such as the permutation test and methods that control the false discovery rate (FDR). Although the former has been applied previously, to our knowledge the FDR methods have not been introduced in ERP data analysis. Here we compare the performance (on simulated and real data) of the permutation test and two FDR methods (Benjamini and Hochberg (BH), and local-fdr, by Efron). All these methods have been shown to be valid for dealing with the problem of multiple comparisons in ERP analysis, avoiding the ad hoc selection of channels and/or time windows. FDR methods are a good alternative to the common and computationally more expensive permutation test. The BH method for independent tests gave the best overall performance regarding the balance between type I and type II errors. The local-fdr method is preferable for high-dimensional (multichannel) problems where most of the tests conform to the empirical null hypothesis. Differences among the methods according to assumptions, null distributions and dimensionality of the problem are also discussed.

13.
Correct selection of prognostic biomarkers among multiple candidates is becoming increasingly challenging as the dimensionality of biological data becomes higher. Therefore, minimizing the false discovery rate (FDR) is of primary importance, while a low false negative rate (FNR) is a complementary measure. The lasso is a popular selection method in Cox regression, but its results depend heavily on the penalty parameter λ. Usually, λ is chosen using the maximum cross-validated log-likelihood (max-cvl). However, this method often has a very high FDR. We review methods for a more conservative choice of λ. We propose an empirical extension of the cvl by adding a penalization term, which trades off between the goodness-of-fit and the parsimony of the model, leading to the selection of fewer biomarkers and, as we show, to a reduction of the FDR without a large increase in the FNR. We conducted a simulation study considering null and moderately sparse alternative scenarios and compared our approach with the standard lasso and 10 other competitors: Akaike information criterion (AIC), corrected AIC, Bayesian information criterion (BIC), extended BIC, Hannan and Quinn information criterion (HQIC), risk information criterion (RIC), one-standard-error rule, adaptive lasso, stability selection, and percentile lasso. Our extension achieved the best compromise across all the scenarios between a reduction of the FDR and a limited rise of the FNR, followed by the AIC, the RIC, and the adaptive lasso, which performed well in some settings. We illustrate the methods using gene expression data of 523 breast cancer patients. In conclusion, we propose applying our extension to the lasso whenever a stringent FDR with a limited FNR is targeted.
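The tuning rule can be sketched generically: choose the λ that maximizes the cross-validated log-likelihood minus a parsimony penalty on the number of selected coefficients. The code below transplants the idea to a Gaussian lasso with scikit-learn as an analogy; it is not the authors' Cox implementation, and the penalty weight c, the λ grid, and the simulated data are all assumptions.

import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = 1.0                      # sparse truth: 3 active covariates
y = X @ beta + rng.normal(size=n)

def penalized_cvl(lam, c=1.0, n_splits=10):
    """Cross-validated Gaussian log-likelihood minus c * (model size)."""
    ll = 0.0
    for tr, te in KFold(n_splits, shuffle=True, random_state=0).split(X):
        fit = Lasso(alpha=lam).fit(X[tr], y[tr])
        sigma2 = np.mean((y[tr] - fit.predict(X[tr])) ** 2)
        ll += -0.5 * np.sum((y[te] - fit.predict(X[te])) ** 2) / sigma2 \
              - 0.5 * len(te) * np.log(sigma2)
    df = np.sum(Lasso(alpha=lam).fit(X, y).coef_ != 0)   # selected biomarkers
    return ll - c * df

lambdas = np.logspace(-2, 0, 20)
best = max(lambdas, key=penalized_cvl)
print("lambda chosen by penalized cvl:", best)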

14.
With recent advances in genomewide microarray technologies, whole-genome association (WGA) studies have aimed at identifying susceptibility genes for complex human diseases using hundreds of thousands of single nucleotide polymorphisms (SNPs) genotyped at the same time. In this context, and to take into account multiple testing, false discovery rate (FDR)-based strategies are now used frequently. However, a critical aspect of these strategies is that they are applied to a collection or family of hypotheses and thus depend critically on which hypotheses make up that family. We investigated how modifying the family of hypotheses to be tested affects the performance of FDR-based procedures in WGA studies. We showed that FDR-based procedures performed more poorly when SNPs with a high prior probability of being associated were excluded. Results of simulation studies mimicking WGA studies according to three scenarios are reported, and show the extent to which SNP elimination (family contraction) prior to the analysis impairs the performance of FDR-based procedures. To illustrate this situation, we used the data from a recent WGA study on type-1 diabetes (Clayton et al. [2005] Nat. Genet. 37:1243-1246) and report the results obtained when SNPs located inside the human leukocyte antigen region were either excluded or retained. Based on our findings, excluding markers with a high prior probability of being associated cannot be recommended for the analysis of WGA data with FDR-based strategies.

15.
Pharmacovigilance spontaneous reporting systems are primarily devoted to early detection of the adverse reactions of marketed drugs. They maintain large spontaneous reporting databases (SRD) for which several automatic signalling methods have been developed. A common limitation of these methods lies in the fact that they do not provide an auto-evaluation of the generated signals, so that alert thresholds are chosen arbitrarily. In this paper, we propose to revisit the Gamma Poisson Shrinkage (GPS) model and the Bayesian Confidence Propagation Neural Network (BCPNN) model in the Bayesian general decision framework. This results in a new signal ranking procedure based on the posterior probability of the null hypothesis of interest, and makes it possible to derive, with a non-mixture modelling approach, Bayesian estimators of the false discovery rate (FDR), false negative rate, sensitivity and specificity. An original data generation process that can be suited to the features of the SRD under scrutiny is proposed and applied to the French SRD to perform a large simulation study. Results indicate better performance according to the FDR for the proposed ranking procedure in comparison with the current ones for the GPS model. They also reveal identical performance according to the four operating characteristics for the proposed ranking procedure with the BCPNN and GPS models, but better estimates when using the GPS model. Finally, the proposed procedure is applied to the French data.
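The ranking-plus-estimation step admits a simple closed form (a sketch consistent with the decision framework described here; the notation is ours): rank drug-event pairs by the posterior probability of the null hypothesis, and for a signalled set $S$ estimate

\[
\widehat{\mathrm{FDR}}(S) = \frac{1}{|S|} \sum_{i \in S} \Pr\!\left(H_{0i} \mid \text{data}\right),
\]

the average posterior null probability among the signalled pairs; the false negative rate, sensitivity and specificity have analogous expressions built from the complementary posterior probabilities.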

16.
Microarrays are used increasingly to identify genes that are truly differentially expressed in tissues under different conditions. Planning such studies requires establishing a sample size that will ensure adequate statistical power. For microarray analyses, the false discovery rate (FDR) is considered to be an appropriate error measure, and several FDR-controlling procedures have been developed. How these procedures perform for such analyses has not been evaluated thoroughly under realistic assumptions. In order to develop a method of determining sample sizes for these procedures, it needs to be established whether they really control the FDR below the pre-specified level, so that the determined sample size indeed provides adequate power. To answer this question, we first conducted simulation studies. Our simulation results showed that these procedures do control the FDR in most situations but under-control it when the proportion of positive genes is small, which is the most likely scenario in practice. Thus, these existing procedures can overestimate the power and underestimate the sample size. Accordingly, we developed a simulation-based method to provide more accurate estimates of power and sample size.
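A minimal version of such a simulation check (illustrative; the effect size, proportion of positive genes, group size and replicate count are assumptions, and benjamini_hochberg is the helper from item 2): simulate two-group normal data, apply BH, and record the realized FDR and average power.

import numpy as np
from scipy import stats

def simulate_fdr_power(m=2000, pi1=0.05, n=10, delta=1.0, q=0.05, reps=200):
    rng = np.random.default_rng(2)
    m1 = int(m * pi1)                       # number of truly positive genes
    fdr, power = [], []
    for _ in range(reps):
        shift = np.r_[np.full(m1, delta), np.zeros(m - m1)]
        x = rng.normal(size=(m, n))                      # group 1
        y = rng.normal(size=(m, n)) + shift[:, None]     # group 2
        rej = benjamini_hochberg(stats.ttest_ind(x, y, axis=1).pvalue, q)
        fdr.append(rej[m1:].sum() / max(rej.sum(), 1))   # realized FDP
        power.append(rej[:m1].mean())
    return np.mean(fdr), np.mean(power)

print("realized FDR and average power:", simulate_fdr_power())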

17.
Analyzing safety data from clinical trials to detect safety signals worth further examination involves testing multiple hypotheses, one for each observed adverse event (AE) type. There is a certain hierarchical structure among these hypotheses due to the classification of the AEs into system organ classes, and the AEs are also likely correlated. Many approaches have been proposed to identify safety signals under the multiple testing framework while seeking control of the false discovery rate (FDR). FDR control concerns the expectation of the false discovery proportion (FDP); in practice, control of the actual random variable FDP can be more relevant and has recently drawn much attention. In this paper, we propose a two-stage procedure for safety signal detection with direct control of the FDP, through a permutation-based approach for screening groups of AEs and a permutation-based construction of simultaneous upper bounds for the FDP. Our simulation studies showed that this new approach controls the FDP. We demonstrate our approach using data sets derived from a drug clinical trial.

18.
19.
The methodology and results of several Vietnamese studies on the possible health effects of exposure to herbicides among the Vietnamese during the Second Indochina War are reviewed. The results of the studies appear to link either paternal or maternal exposure to herbicides to unfavorable outcomes of pregnancy. There is some evidence to suggest that the injury to reproduction diminishes over time. Two studies found statistically significant odds ratios of 4.6 and 12.0 for hydatidiform moles after exposure. One case-control study found a statistically significant odds ratio of 5.2 for liver cancer. Elevated odds ratios were also found for major externally detectable birth defects. Many of the detailed findings are in agreement with the results of animal experiments. Unfortunately, the Vietnamese do not have the resources to fully examine the health effects of phenoxy herbicides. It is our hope that recognition of the importance of the Vietnamese studies will lead to further work in this area.

20.
We investigated the time course and the reproducibility of the relative-dose-response (RDR) test for assessing vitamin A status in older adults. The maximum plasma retinol response to 480 retinol equivalents (RE) of retinyl palmitate in abnormal responders occurred at 6 or 7 h after dosing, compared with the 5-h sampling interval recommended by others for younger adults and children. With respect to reproducibility, the diagnostic concordance of two RDR tests administered at a 7-d interval in 14 elderly subjects was 71%; in 29% of tests, one test was abnormal and the other normal. Linear regression of the two RDR values in these 14 subjects gave a correlation coefficient of -0.08. We conclude that the procedure for the RDR should be modified when applied to persons older than 60 y of age, and that multiple repetitions of the test are needed to provide a stable indication of vitamin A stores in an elderly individual.
