A data structure and function classification based method to evaluate clustering models for gene expression data
Cite this article: YI Dong, YANG Meng-su, HUANG Ming-hui, LI Hui-zhi, WANG Wen-chang. A data structure and function classification based method to evaluate clustering models for gene expression data[J]. Journal of Medical Colleges of PLA (China), 2002, 17(4): 312-317
Authors: YI Dong, YANG Meng-su, HUANG Ming-hui, LI Hui-zhi, WANG Wen-chang
Affiliations: [1] Department of Medical Statistics, Third Military Medical University, Chongqing 400038, China [2] Applied Research Centre for Genomics Technology, Department of Biology and Chemistry, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China [3] Department of Electronic Technology, Southwest University of Politics and Law Science, Chongqing 400031, China
Abstract: Objective: To establish a systematic framework for selecting the best clustering algorithm and to provide an evaluation method for clustering analyses of gene expression data. Methods: Clustering analyses of gene expression data were evaluated with 2 approaches, one based on data structure (internal information) and one on function classification (external information). First, to assess the predictive power of clustering algorithms, entropy was introduced to measure the consistency between the clustering results from different algorithms and known, validated functional classifications. Second, a modified figure of merit (adjust-FOM) was used as an internal assessment method: a clustering algorithm was applied to the data from all experimental conditions but one, and the remaining condition was used to assess the predictive power of the resulting clusters. The method was applied to 3 gene expression data sets (2 from Iyer's Serum Data Sets and 1 from Ferea's Saccharomyces cerevisiae Data Set). Results: A method based on entropy and figure of merit (FOM) was proposed and used to examine the results obtained by 6 different algorithms on the 3 data sets. SOM and fuzzy clustering methods showed the highest clustering ability. Conclusion: An entropy-based method for evaluating clustering analyses is proposed for the first time. Evaluations of the same data set yield different results when different function classifications are used. According to the adjust-FOM and Entropy-FOM curves, SOM and fuzzy clustering methods show the highest clustering ability on the 3 data sets.
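
The two assessments described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so the code below assumes the common textbook forms: a size-weighted average of the class entropy within each cluster for the external measure, and a leave-one-condition-out figure of merit divided by sqrt((n - k) / n) for the internal measure. The function names, the `cluster_fn` callback, and the toy `split_by_mean` clusterer are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

import numpy as np


def clustering_entropy(cluster_labels, class_labels):
    """External measure (assumed form): size-weighted mean entropy, in bits,
    of the known functional classes inside each cluster. 0 means every
    cluster contains genes from a single class."""
    n = len(cluster_labels)
    by_cluster = {}
    for cluster, cls in zip(cluster_labels, class_labels):
        by_cluster.setdefault(cluster, []).append(cls)
    total = 0.0
    for members in by_cluster.values():
        size = len(members)
        h = -sum((c / size) * math.log2(c / size)
                 for c in Counter(members).values())
        total += (size / n) * h
    return total


def adjusted_fom(data, cluster_fn, k):
    """Internal measure (assumed form): cluster on all conditions but one,
    score the within-cluster spread on the held-out condition, sum over
    conditions, then divide by sqrt((n - k) / n)."""
    data = np.asarray(data, dtype=float)
    n_genes, n_conditions = data.shape
    if not 0 < k < n_genes:
        raise ValueError("k must be between 1 and n_genes - 1")
    total = 0.0
    for e in range(n_conditions):
        training = np.delete(data, e, axis=1)  # drop condition e
        labels = np.asarray(cluster_fn(training, k))
        held_out = data[:, e]
        sq_err = 0.0
        for c in np.unique(labels):
            values = held_out[labels == c]
            sq_err += np.sum((values - values.mean()) ** 2)
        total += math.sqrt(sq_err / n_genes)
    return total / math.sqrt((n_genes - k) / n_genes)


def split_by_mean(training, k):
    """Hypothetical stand-in clusterer: rank genes by mean expression and
    cut the ranking into k roughly equal groups."""
    ranks = np.argsort(np.argsort(training.mean(axis=1)))
    return ranks * k // len(ranks)


if __name__ == "__main__":
    expression = [[0, 0], [2, 2], [10, 10], [12, 12]]  # genes x conditions
    print(clustering_entropy([0, 0, 1, 1], ["a", "a", "b", "b"]))
    print(adjusted_fom(expression, split_by_mean, k=2))
```

Under these assumptions, lower values are better for both measures: low entropy means clusters agree with the functional classification, and a low adjusted FOM means clusters built without a condition still predict expression under it. Any real clustering routine (SOM, fuzzy clustering, and so on) could be passed in place of `split_by_mean` as long as it returns one label per gene.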

Keywords: data structure  data function  classification  model  gene expression

Keywords: gene expression  evaluation of clustering  adjust-FOM  entropy
This article is indexed in VIP, Wanfang Data, and other databases.