Abstract: | Identifying the pathways that are significantly impacted in a given condition is a crucial step in understanding the underlying biological phenomena. All approaches currently available for this purpose calculate a P-value that aims to quantify the significance of the involvement of each pathway in the given phenotype. These P-values were previously thought to be independent. Here we show that this is not the case, and that many pathways can considerably affect each other's P-values through a “crosstalk” phenomenon. Although it is intuitive that various pathways could influence each other, the presence and extent of this phenomenon have not been rigorously studied and, most importantly, there is no currently available technique able to quantify the amount of such crosstalk. Here, we show that all three major categories of pathway analysis methods (enrichment analysis, functional class scoring, and topology-based methods) are severely influenced by crosstalk phenomena. Using real pathways and data, we show that in some cases pathways with significant P-values are not biologically meaningful, and that some biologically meaningful pathways with nonsignificant P-values become statistically significant when the crosstalk effects of other pathways are removed. We describe a technique able to detect, quantify, and correct crosstalk effects, as well as identify independent functional modules. We assessed this novel approach on data from four experiments involving three phenotypes and two species. This method is expected to allow a better understanding of individual experiment results, as well as a more refined definition of the existing signaling pathways for specific phenotypes.
The correct identification of the signaling and metabolic pathways involved in a given phenotype is a crucial step in the interpretation of high-throughput genomic experiments. Most approaches currently available for this purpose treat the pathways as independent. 
In fact, pathways can affect each other's P-values through a phenomenon we refer to as crosstalk. This crosstalk may be due to the regulatory interactions among different pathways or to the gene overlap among pathways. In this work, we will use the term crosstalk to refer to the effect that pathways exercise on each other due to the presence of overlapping genes. Although it is intuitive that various pathways could influence each other, especially when they share genes, the presence and extent of this phenomenon have not been rigorously studied and, most importantly, there is no currently available technique able to quantify the amount of such crosstalk. There are three major categories of methods that aim to identify significant pathways: enrichment analysis (e.g., Fisher's exact test–hypergeometric) (Tavazoie et al. 1999; Draghici et al. 2003); functional class scoring (e.g., GSEA) (Mootha et al. 2003; Subramanian et al. 2005); and topology-based methods (e.g., impact analysis) (Draghici et al. 2007; Tarca et al. 2009). Another classification of gene set analysis methods is based on the definition of the null hypothesis and divides the methods into competitive and self-contained (Goeman and Bühlmann 2007; Nam and Kim 2008). In this work, we focus on competitive methods, and in particular on Fisher's exact test, although the problems identified likely also apply to self-contained methods.
Here we show that the results of all these methods are affected by crosstalk effects and that this phenomenon is related to the structure of the pathways. We propose the first approach that can (1) detect crosstalk when it exists, (2) quantify its magnitude, (3) correct for it, resulting in a more meaningful ranking among pathways in a specific biological condition, and (4) identify novel functional modules that can play an independent role and have different functions than the pathway they are currently located on. 
This method is expected to allow a better understanding of individual experiment results, as well as a more refined definition of the existing signaling pathways for specific phenotypes. |
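To make the overlap-driven crosstalk described above concrete, the following minimal Python sketch runs a one-sided hypergeometric enrichment test (the over-representation form of Fisher's exact test) on two toy pathways that share genes. Everything in it is an illustrative assumption: the gene identifiers, the pathway definitions, the list of differentially expressed (DE) genes, and the helper name `hypergeom_pvalue` are hypothetical. In particular, simply dropping the shared genes from one pathway is a crude stand-in used only to expose the effect; it is not the correction technique proposed in this work.

```python
from math import comb


def hypergeom_pvalue(universe_size, pathway_size, de_total, de_in_pathway):
    """One-sided enrichment P-value: P(X >= de_in_pathway) for a
    hypergeometric X (drawing de_total genes from the universe)."""
    upper = min(pathway_size, de_total)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, de_total - k)
        for k in range(de_in_pathway, upper + 1)
    )
    return tail / comb(universe_size, de_total)


# Hypothetical toy data: 100 measured genes, labeled 0..99.
universe = set(range(100))
pathway_a = set(range(0, 20))    # genes 0..19
pathway_b = set(range(10, 30))   # genes 10..29, shares genes 10..19 with A
de_genes = {2, 4, 6, 8, 10, 11, 12, 13, 14, 15}  # assumed DE genes, all in A

# Standard enrichment test, treating the two pathways as independent.
p_a = hypergeom_pvalue(len(universe), len(pathway_a), len(de_genes),
                       len(de_genes & pathway_a))
p_b = hypergeom_pvalue(len(universe), len(pathway_b), len(de_genes),
                       len(de_genes & pathway_b))

# Crude illustration only: re-test B using the genes it does not share with A.
b_only = pathway_b - pathway_a
p_b_only = hypergeom_pvalue(len(universe), len(b_only), len(de_genes),
                            len(de_genes & b_only))

print(f"A: {p_a:.2e}  B: {p_b:.2e}  B without shared genes: {p_b_only:.2f}")
```

In this toy setting, pathway B appears significant only because of the DE genes it shares with pathway A; once those genes are set aside, no evidence for B remains. This is the kind of dependence between pathway P-values that the text refers to as crosstalk.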