
A stable hyperparameter selection for the Gaussian RBF kernel for discrimination. (English) Zbl 07260239

Summary: Kernel-based classification methods, for example, support vector machines, map the data into a higher-dimensional space via a kernel function. In practice, choosing the value of the hyperparameter in the kernel function is crucial for good performance. We propose a method for selecting the hyperparameter of the Gaussian radial basis function (RBF) kernel based on the geometry of the embedded feature space. The method is independent of the choice of discrimination algorithm and is computationally efficient. Its classification performance is competitive with that of existing methods, including cross-validation. Using simulated and real-data examples, we show that the proposed method is stable under sampling variability.
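The paper's specific geometry-based selection criterion is not reproduced in this summary. As a minimal sketch of the setting only, the following shows the Gaussian RBF kernel together with the widely used median-distance heuristic, a simple geometry-based default for the bandwidth; the function names and the heuristic itself are illustrative and are not the authors' proposed method.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    """Gaussian RBF kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def median_heuristic_sigma(X):
    """Set sigma to the median pairwise distance between observations.

    This is a common geometry-based default, not the selection rule of the
    paper under review.
    """
    n = X.shape[0]
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    iu = np.triu_indices(n, k=1)  # distinct pairs only, each counted once
    return np.sqrt(np.median(sq_dists[iu]))

# Usage: pick sigma from the data geometry, then build the kernel matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
sigma = median_heuristic_sigma(X)
K = rbf_kernel(X, X, sigma)
```

The resulting kernel matrix can then be handed to any kernel classifier (SVM, kernel Fisher discriminant analysis, etc.), which is consistent with the summary's point that the selection step is independent of the discrimination algorithm.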

MSC:

62-XX Statistics
68-XX Computer science
Full Text: DOI

References:

[1] M. Aizerman, E. Braverman, and L. Rozonoer, Theoretical foundations of the potential function method in pattern recognition learning. Autom Rem Contr, 25 (1964), 821-837. · Zbl 0151.24701
[2] J. Mercer, Functions of positive and negative type and their connection with the theory of integral equations, Philos Trans R Soc Lond A 209 (1909), 415-446. · JFM 40.0408.02
[3] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (Adaptive Computation and Machine Learning), Cambridge, MA, The MIT Press, 2001.
[4] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, New York, NY, Cambridge University Press, 2004. · Zbl 0994.68074
[5] T. Hofmann, B. Schölkopf, and A. J. Smola, Kernel methods in machine learning, Ann Stat, 36 (2008), 1171-1220. · Zbl 1151.30007
[6] V. Vapnik, Statistical Learning Theory, New York, NY, Wiley, 1998. · Zbl 0935.62007
[7] S. Mika, G. Rätsch, J. Weston, B. Schölkopf, and K.-R. Müller, Fisher discriminant analysis with kernels, In Neural Networks for Signal Processing IX, Y.-H. Hu, J. Larsen, E. Wilson, and S. Douglas, eds. IEEE, 1999, 41-48.
[8] W. Wang, Z. Xu, W. Lu, and X. Zhang. Determination of the spread parameter in the Gaussian kernel for classification and regression, Neurocomputing 55 (2003), 643-663.
[9] M. P. Brown, W. N. Grundy, D. Lin, N. Cristianini, C. W. Sugnet, T. S. Furey, M. Ares Jr, and D. Haussler, Knowledge-based analysis of microarray expression data by using support vector machines, Proc Natl Acad Sci, USA 97 (2000), 262-267.
[10] T. Joachims, Estimating the generalization performance of a SVM efficiently. Proceedings of the International Conference on Machine Learning, San Francisco, CA, Morgan Kaufman, 2000.
[11] G. Wahba, Y. Lin, and H. Zhang, Generalized approximate cross validation for support vector machines, or, another way to look at margin-like quantities, In Advances in Large Margin Classifiers, A. Smola, P. Bartlett, B. Schölkopf and D. Schuurmans, eds., Cambridge, MA, MIT Press, 2000, 297-309.
[12] O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, Choosing kernel parameters for support vector machines, Mach Learn 46 (2002), 131-160. · Zbl 0998.68101
[13] S. S. Keerthi, Efficient tuning of SVM hyperparameters using radius/margin bound and iterative algorithms, IEEE Trans Neural Netw 13(5) (2002), 1225-1229.
[14] K. Duan, S. S. Keerthi, and A. N. Poo, Evaluation of simple performance measures for tuning SVM parameters, Neurocomputing 51 (2003), 41-59.
[15] G. Rätsch. Benchmark datasets. Available: http://ida.first.fraunhofer.de/projects/bench/benchmarks.htm. [Last accessed 1999].
[16] S. S. Keerthi and C. J. Lin, Asymptotic behaviours of support vector machines with Gaussian kernel, Neural Comput 15 (2003), 1667-1689. · Zbl 1086.68569
[17] B. U. Park and J. S. Marron, Comparison of data-driven bandwidth selectors, J Am Stat Assoc 85(409) (1990), 66-72.
[18] P. Hall and K. H. Kang, Bandwidth choice for nonparametric classification, Ann Stat 33(1) (2005), 284-306. · Zbl 1064.62075
[19] F.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.