Robust prediction of protein subcellular localization combining PCA and WSVMs
Authors: Tian Jiang (a), Gu Hong (b), Liu Wenqi (b), Gao Chiyang (b)
Institution: (a) China Software Testing Center, China Center for Information Industry Development, Beijing 100048, China; (b) School of Control Science and Engineering, Dalian University of Technology, Dalian 116023, China
Abstract: Automated prediction of protein subcellular localization is an important tool for genome annotation and drug discovery, and Support Vector Machines (SVMs) can solve this problem effectively in a supervised manner. However, datasets obtained from real experiments are likely to contain outliers or noise, which can degrade generalization ability and classification accuracy. To address this problem, we adopt two strategies that reduce the effect of outliers. First, we design a method based on Weighted SVMs (WSVMs): different weights are assigned to different data points, so that the training algorithm learns the decision boundary according to the relative importance of each point. Second, we analyse the influence of Principal Component Analysis (PCA) on WSVM classification and propose a hybrid classifier that combines the merits of PCA and WSVM. After dimension reduction is performed on the datasets, the kernel-based possibilistic c-means algorithm can generate more suitable weights for training, because PCA transforms the data into a new coordinate system whose directions of largest variance are strongly affected by the outliers. Experiments on benchmark datasets show promising results, which confirm the effectiveness of the proposed method in terms of prediction accuracy.
Keywords: Subcellular compartment; Support vector machines; Outlier mining; Principal component analysis
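
The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single cluster per class, a Gaussian kernel, and fixed hypothetical parameters (`sigma`, `m`, `eta`), and the function name `kpcm_weights` is invented for illustration. It computes possibilistic typicality values for the points of one class, so that points far from the bulk of the class receive smaller weights.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kpcm_weights(points, sigma=1.0, m=2.0, eta=1.0, n_iter=20):
    """Typicality weights for one class (single-cluster kernel PCM sketch).

    sigma, m and eta are illustrative defaults, not values from the paper.
    """
    dim = len(points[0])
    # Start from the ordinary mean of the class.
    center = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    weights = [1.0] * len(points)
    for _ in range(n_iter):
        k = [gaussian_kernel(p, center, sigma) for p in points]
        # For a Gaussian kernel the kernel-induced squared distance is 2 * (1 - K).
        weights = [
            1.0 / (1.0 + (2.0 * (1.0 - ki) / eta) ** (1.0 / (m - 1.0)))
            for ki in k
        ]
        # Re-estimate the center; points with low typicality contribute little.
        coeff = [(w ** m) * ki for w, ki in zip(weights, k)]
        total = sum(coeff)
        center = [
            sum(c * p[d] for c, p in zip(coeff, points)) / total
            for d in range(dim)
        ]
    return weights


if __name__ == "__main__":
    # Four clustered points and one obvious outlier (toy data, not from the paper).
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
    for point, weight in zip(data, kpcm_weights(data)):
        print(point, round(weight, 3))
```

In a full pipeline under these assumptions, the data would first be projected with PCA, the weights would be computed per class on the reduced data, and they would then be passed to a weighted SVM trainer, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`.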
This article is indexed in ScienceDirect, PubMed, and other databases.