
DC algorithm for extended robust support vector machine. (English) Zbl 1474.68251

Summary: Nonconvex variants of support vector machines (SVMs) have been developed for various purposes. For example, robust SVMs attain robustness to outliers by using a nonconvex loss function, while the extended \(\nu\)-SVM (E\(\nu\)-SVM) extends the admissible range of the hyperparameter by introducing a nonconvex constraint. Here, we consider the extended robust support vector machine (ER-SVM), a robust variant of E\(\nu\)-SVM that combines the two types of nonconvexity from robust SVMs and E\(\nu\)-SVM. Because of these two nonconvexities, our existing algorithm for ER-SVM must be split into two cases depending on whether or not the hyperparameter value lies in the extended range, and it solves the nonconvex problem in the extended range only heuristically.
In this letter, we propose a new, efficient algorithm for ER-SVM. The algorithm handles both types of nonconvexity, never requires more computation than either E\(\nu\)-SVM or the robust SVM, and finds a critical point of ER-SVM. Furthermore, we show that ER-SVM includes the existing robust SVMs as special cases. Numerical experiments confirm the effectiveness of integrating the two nonconvexities.
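The DC-algorithm idea underlying this line of work (DCA: Le Thi & Pham Dinh, 2005; equivalently the concave-convex procedure of Yuille & Rangarajan, 2003) writes the nonconvex objective as a difference of two convex functions and repeatedly linearizes the concave part, so each iteration solves a convex subproblem. The sketch below illustrates this for a robust SVM with a truncated (ramp) hinge loss, which is one of the robust SVMs discussed here; it is not the paper's ER-SVM algorithm, the subproblem solver is plain subgradient descent, and all parameter values and the toy data are arbitrary choices for illustration.

```python
import numpy as np

def cccp_ramp_svm(X, y, lam=0.01, s=-1.0, outer=15, inner=300, lr=0.1):
    """DCA/CCCP sketch for a truncated-hinge (ramp) loss SVM.

    The ramp loss is written as a DC decomposition
        T(z) = H_1(z) - H_s(z),   H_a(z) = max(0, a - z),
    and the concave part -H_s is linearized at the current iterate:
    its subgradient only activates on points with margin below s,
    which is what caps the influence of outliers.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(outer):
        # linearize the concave part: active set of H_s at current (w, b)
        beta = (y * (X @ w + b) < s).astype(float)
        # convex subproblem: hinge loss + linear term, via subgradient descent
        for _ in range(inner):
            m = y * (X @ w + b)
            alpha = (m < 1.0).astype(float)   # hinge-active points
            coef = (beta - alpha) * y         # per-sample margin weight
            w -= lr * (lam * w + (coef @ X) / n)
            b -= lr * (coef.sum() / n)
    return w, b

# toy demo: two Gaussian blobs with a few flipped labels acting as outliers
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 1.0, (40, 2)), rng.normal(-2.0, 1.0, (40, 2))])
y = np.r_[np.ones(40), -np.ones(40)]
y[:4] = -1.0  # label-noise outliers in the positive blob
w, b = cccp_ramp_svm(X, y)
acc = (np.sign(X @ w + b) == y).mean()
```

Because the loss is capped at 1 - s, the flipped points stop pulling on the hyperplane once their margins fall below s, so the learned boundary stays close to that of the clean data; with a convex hinge loss the same outliers would drag the hyperplane toward them.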

MSC:

68T05 Learning and adaptive systems in artificial intelligence
90C26 Nonconvex programming, global optimization
90C59 Approximation methods and heuristics in mathematical programming

Software:

GitHub; LIBSVM; UCI-ml
Full Text: DOI

References:

[1] Artzner, P., Delbaen, F., Eber, J., & Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9, 203-228. · Zbl 0980.91042
[2] Blake, C., & Merz, C. (1998). UCI repository of machine learning databases.
[3] Chang, C., & Lin, C. (2001a). LIBSVM: A library for support vector machines (Technical report). Department of Computer Science, National Taiwan University.
[4] Chang, C., & Lin, C. (2001b). Training \(\nu\)-support vector classifiers: Theory and algorithms. Neural Computation, 13(9), 2119-2147. · Zbl 0993.68080
[5] Collobert, R., Sinz, F., Weston, J., & Bottou, L. (2006). Trading convexity for scalability. In Proceedings of the International Conference on Machine Learning (pp. 129-136). New York: ACM.
[6] Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273-297. · Zbl 0831.68098
[7] Crisp, D., & Burges, C. (2000). A geometric interpretation of \(\nu\)-SVM classifiers. In S. A. Solla, T. K. Leen, & K. Müller (Eds.), Neural information processing systems, 12 (pp. 244-250). Cambridge, MA: MIT Press.
[8] Fujiwara, S., Takeda, A., & Kanamori, T. (2016). Python codes for extended robust support vector machine. https://github.com/sfujiwara/er-svm · Zbl 1474.68251
[9] Horst, R., & Tuy, H. (1996). Global optimization: Deterministic approaches (3rd ed.). Berlin: Springer-Verlag. · Zbl 0867.90105
[10] Le Thi, H., & Pham Dinh, T. (2005). The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Annals of Operations Research, 133(1-4), 23-46. · Zbl 1116.90122
[11] Perez-Cruz, F., Weston, J., Hermann, D. J. L., & Schölkopf, B. (2003). Extension of the \(\nu\)-SVM range for classification. In J. A. K. Suykens, I. Basu, S. Micchelli, & J. Vandewalle (Eds.), Advances in learning theory: Methods, models and applications (pp. 179-196). Amsterdam: IOS Press.
[12] Pham Dinh, T. (1988). Duality in DC (difference of convex functions) optimization. Subgradient methods. In K. W. Hoffmann, J. Zoue, J. B. Hiriart-Urruty, & C. Lemarechal (Eds.), Trends in mathematical optimization (pp. 277-293). Basel: Birkhäuser. · Zbl 0634.49009
[13] Pham Dinh, T., & Le Thi, H. (1997). Convex analysis approach to D.C. programming: Theory, algorithms and applications. Acta Mathematica Vietnamica, 22(1), 289-355. · Zbl 0895.90152
[14] Pham Dinh, T., Le Thi, H., & Muu, L. D. (1999). Exact penalty in D.C. programming. Vietnam Journal of Mathematics, 27(2), 169-178. · Zbl 1006.90062
[15] Platt, J. (1998). Fast training of support vector machines using sequential minimal optimization. Cambridge, MA: MIT Press.
[16] Rifkin, R., Pontil, M., & Verri, A. (1999). A note on support vector machine degeneracy. In O. Watanabe & T. Yokomori (Eds.), Algorithmic learning theory (pp. 252-263). Berlin: Springer-Verlag. · Zbl 0949.68079
[17] Rockafellar, R. (1970). Convex analysis. Princeton: Princeton University Press. · Zbl 0193.18401
[18] Rockafellar, R., & Uryasev, S. (2002). Conditional value-at-risk for general loss distributions. Journal of Banking and Finance, 26(7), 1443-1472.
[19] Schölkopf, B., Smola, A., Williamson, R., & Bartlett, P. (2000). New support vector algorithms. Neural Computation, 12(5), 1207-1245.
[20] Shen, X., Tseng, G., Zhang, X., & Wong, W. (2003). On \(\psi\)-learning. Journal of the American Statistical Association, 98(463), 724-734. · Zbl 1052.62095
[21] Smola, A., Vishwanathan, S., & Hofmann, T. (2005). Kernel methods for missing variables. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (pp. 325-332). Society for Artificial Intelligence and Statistics.
[22] Sriperumbudur, B., & Lanckriet, G. (2012). A proof of convergence of the concave-convex procedure using Zangwill's theory. Neural Computation, 24(6), 1391-1407. · Zbl 1254.90180
[23] Takeda, A., Fujiwara, S., & Kanamori, T. (2014). Extended robust support vector machine based on financial risk minimization. Neural Computation, 26(11), 2541-2569. · Zbl 1415.68180
[24] Takeda, A., & Sugiyama, M. (2008). \(\nu\)-support vector machine as conditional value-at-risk minimization. In Proceedings of the International Conference on Machine Learning (pp. 1056-1063). New York: ACM.
[25] Tsyurmasto, P., Uryasev, S., & Gotoh, J. (2013). Support vector classification with positive homogeneous risk functionals (Technical report).
[26] Wozabal, D. (2012). Value-at-risk optimization using the difference of convex algorithm. OR Spectrum, 34, 861-883. · Zbl 1282.91313
[27] Wu, Y., & Liu, Y. (2007). Robust truncated hinge loss support vector machines. Journal of the American Statistical Association, 102(479), 974-983. · Zbl 1469.62293
[28] Xu, H., Caramanis, C., & Mannor, S. (2009). Robustness and regularization of support vector machines. Journal of Machine Learning Research, 10, 1485-1510. · Zbl 1235.68209
[29] Xu, L., Crammer, K., & Schuurmans, D. (2006). Robust support vector machine training via convex outlier ablation. In Proceedings of the 21st National Conference on Artificial Intelligence (pp. 536-542). Cambridge, MA: AAAI Press.
[30] Yu, Y., Yang, M., Xu, L., White, M., & Schuurmans, D. (2010). Relaxed clipping: A global training method for robust regression and classification. In J. K. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, & A. Culotta (Eds.), Neural information processing systems, 23 (pp. 2532-2540). Cambridge, MA: MIT Press.
[31] Yuille, A., & Rangarajan, A. (2003). The concave-convex procedure. Neural Computation, 15(4), 915-936. · Zbl 1022.68112
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.