Multi-stage filtering for improving confidence level and determining dominant clusters in clustering algorithms of gene expression data |
| |
Authors: | Shahreen Kasim, Safaai Deris, Razib M. Othman |
| |
Affiliation: | 1. Software Multimedia Center, Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Malaysia; 2. Laboratory of Computational Intelligence and Biotechnology, Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Skudai, Malaysia; 3. Artificial Intelligence and Bioinformatics Research Group, Faculty of Computing, Universiti Teknologi Malaysia, 81310 UTM Skudai, Malaysia |
| |
Abstract: | A drastic improvement in the analysis of gene expression has led to new discoveries in bioinformatics research. Fuzzy clustering algorithms are widely used to analyse gene expression data. However, the results of these algorithms can lead to confusing hypotheses about the dominant function of genes of interest. In addition, current fuzzy clustering algorithms do not thoroughly analyse genes with low membership values. We therefore present a novel computational framework called the “multi-stage filtering-Clustering Functional Annotation” (msf-CluFA) for clustering gene expression data. The framework consists of four components: fuzzy c-means clustering (msf-CluFA-0), achieving dominant cluster (msf-CluFA-1), improving confidence level (msf-CluFA-2), and the combination of msf-CluFA-0, msf-CluFA-1 and msf-CluFA-2 (msf-CluFA-3). By employing double filtering in msf-CluFA-1 and apriori algorithms in msf-CluFA-2, the framework is capable of determining the dominant clusters and improving the confidence level of genes with lower membership values, so that unknown genes can be predicted. |
| |
Keywords: | Confidence level; Dominant cluster; Fuzzy clustering; Gene expression |
This article is indexed in ScienceDirect and other databases. |
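
As a rough illustration of the fuzzy c-means step named in the abstract, the sketch below implements the textbook algorithm in plain Python and then picks, for each gene, the cluster with the highest membership. It is not the msf-CluFA implementation: the function names, the toy expression profiles, and the 0.5 membership threshold are all assumptions made for this example, and the double filtering and apriori stages are not shown because the abstract does not describe their details.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means. Returns (centers, memberships).

    memberships[i][j] is the degree to which point i belongs to cluster j;
    every row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Each center is the mean of all points weighted by u**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Recompute memberships from Euclidean distances to the centers.
        new_u = []
        for p in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            new_u.append(
                [1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                           for k in range(c))
                 for j in range(c)]
            )

        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u


def dominant_clusters(memberships, threshold=0.5):
    """Hypothetical helper: (cluster index or None, top membership) per gene.

    A gene whose highest membership is below `threshold` gets None, marking
    it as a low-membership gene that would need further analysis.
    """
    result = []
    for row in memberships:
        best = max(range(len(row)), key=row.__getitem__)
        result.append((best if row[best] >= threshold else None, row[best]))
    return result


# Toy expression profiles (made-up values): two well-separated groups.
profiles = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, u = fuzzy_c_means(profiles, c=2)
labels = dominant_clusters(u, threshold=0.5)
```

Genes for which `dominant_clusters` returns `None` correspond to the low-membership cases that the abstract says existing fuzzy clustering algorithms leave under-analysed. How msf-CluFA actually treats them is described in the full paper, not here.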
|