A divide-and-combine method for large scale nonparallel support vector machines |
| |
Affiliation: | 1. Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; 2. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 101408, China; 3. Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China; 4. College of Information Science and Technology, University of Nebraska at Omaha, Omaha, NE 68182, USA |
| |
Abstract: | Nonparallel Support Vector Machine (NPSVM), which is more flexible and generalizes better than the standard SVM, is widely used for classification. Although solvers and toolboxes such as SMO and libsvm can be applied to NPSVM, it is hard to scale to millions of samples. In this paper, we propose a divide-and-combine method for large scale nonparallel support vector machines (DCNPSVM). In the division step, DCNPSVM divides the samples into smaller sub-samples so that the resulting smaller subproblems can be solved independently. We show theoretically and experimentally that the objective function value, solutions, and support vectors obtained by DCNPSVM are close to those of the whole NPSVM problem. In the combination step, the sub-solutions are combined and used as the initial iterate for solving the whole problem by global coordinate descent, which then converges quickly. To balance accuracy and efficiency, we adopt a multi-level structure, which outperforms state-of-the-art methods. Moreover, DCNPSVM can handle imbalanced problems efficiently by tuning its parameters. Experimental results on many large data sets show the effectiveness of our method in terms of memory usage, classification accuracy, and running time. |
| |
Keywords: | Support vector machine; Nonparallel support vector machine; Large scale; Clustering; Divide; Combine |
This article is indexed in ScienceDirect and other databases. |
|
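The divide-and-combine procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' DCNPSVM: as a stand-in for the NPSVM dual it uses the dual of a standard bias-free linear L1-loss SVM, it assumes k-means as the division rule (the keywords mention clustering, but the exact rule is not given here), and it uses a single level rather than the multi-level structure. All function names and parameters (`dual_cd`, `kmeans_labels`, `divide_and_combine`, `sub_epochs`, `global_epochs`) are hypothetical.

```python
import numpy as np

def dual_cd(X, y, C, alpha, n_epochs, rng):
    """Dual coordinate descent for a bias-free linear L1-loss SVM,
    started from the given alpha (stand-in for the NPSVM solver)."""
    alpha = alpha.copy()
    w = (alpha * y) @ X                      # primal vector implied by alpha
    sq_norms = np.einsum("ij,ij->i", X, X)   # diagonal of the Gram matrix
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):
            if sq_norms[i] == 0.0:
                continue
            grad = y[i] * (w @ X[i]) - 1.0   # partial derivative of the dual
            new_ai = np.clip(alpha[i] - grad / sq_norms[i], 0.0, C)
            w += (new_ai - alpha[i]) * y[i] * X[i]
            alpha[i] = new_ai
    return alpha

def dual_objective(X, y, alpha):
    """Dual objective 0.5 * ||sum_i alpha_i y_i x_i||^2 - sum_i alpha_i."""
    w = (alpha * y) @ X
    return 0.5 * (w @ w) - alpha.sum()

def kmeans_labels(X, k, rng, n_iter=10):
    """A few Lloyd iterations; assumed here as the division rule."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def divide_and_combine(X, y, C=1.0, k=4, sub_epochs=20, global_epochs=5, seed=0):
    rng = np.random.default_rng(seed)
    # Division step: solve each cluster's subproblem independently.
    labels = kmeans_labels(X, k, rng)
    alpha0 = np.zeros(len(y))
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size:
            alpha0[idx] = dual_cd(X[idx], y[idx], C, np.zeros(idx.size),
                                  sub_epochs, rng)
    # Combination step: concatenated sub-solutions warm-start the global solve.
    alpha = dual_cd(X, y, C, alpha0, global_epochs, rng)
    return alpha0, alpha
```

Calling `divide_and_combine(X, y)` with labels in {-1, +1} returns the combined sub-solutions and the globally refined solution. Because each coordinate update minimizes the dual exactly along one coordinate within the box constraints, the refinement can only keep or lower the dual objective of the warm start.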