×

Supervised dimensionality reduction via sequential semidefinite programming. (English) Zbl 1162.68630

Summary: Many dimensionality reduction problems end up with a trace quotient formulation. Since it is difficult to directly solve the trace quotient problem, traditionally the trace quotient cost function is replaced by an approximation such that the generalized eigenvalue decomposition can be applied. In contrast, we directly optimize the trace quotient in this work. It is reformulated as a quasi-linear semidefinite optimization problem, which can be solved globally and efficiently using standard off-the-shelf semidefinite programming solvers. Also this optimization strategy allows one to enforce additional constraints (for example, sparseness constraints) on the projection matrix. We apply this optimization framework to a novel dimensionality reduction algorithm. The performance of the proposed algorithm is demonstrated in experiments by several UCI machine learning benchmark examples, USPS handwritten digits as well as ORL and Yale face data.

MSC:

68T10 Pattern recognition, speech recognition
68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI

References:

[1] Yan, S.; Tang, X., Trace quotient problems revisited, (Proceedings of the European Conference on Computer Vision (2006), Springer: Springer Berlin), 232-244
[2] Cai, D.; He, X.; Han, J.; Zhang, H.-J., Orthogonal Laplacianfaces for face recognition, IEEE Trans. Image Process., 15, 11, 3608-3614 (2006)
[3] G. Hua, P.A. Viola, S.M. Drucker, Face recognition using discriminatively trained orthogonal rank one tensor projections, in: Proceedings of the IEEE Conference Computer Vision and Pattern Recognition, Minneapolis, MN, 2007.; G. Hua, P.A. Viola, S.M. Drucker, Face recognition using discriminatively trained orthogonal rank one tensor projections, in: Proceedings of the IEEE Conference Computer Vision and Pattern Recognition, Minneapolis, MN, 2007.
[4] J. Ye, T. Xiong, Null space versus orthogonal linear discriminant analysis, in: Proceedings of the International Conference on Machine Learning, Pittsburgh, PA, 2006, pp. 1073-1080.; J. Ye, T. Xiong, Null space versus orthogonal linear discriminant analysis, in: Proceedings of the International Conference on Machine Learning, Pittsburgh, PA, 2006, pp. 1073-1080.
[5] Yang, J.; Yang, J.-Y.; Zhang, D., What’s wrong with fisher criterion?, Pattern Recognition, 35, 11, 2665-2668 (2002) · Zbl 1006.68924
[6] Boyd, S.; Vandenberghe, L., Convex Optimization (2004), Cambridge University Press: Cambridge University Press Cambridge · Zbl 1058.90049
[7] A. Ben-Tal, A.S. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, SIAM, Philadelphia, PA, USA, 2001.; A. Ben-Tal, A.S. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, SIAM, Philadelphia, PA, USA, 2001. · Zbl 0986.90032
[8] Weinberger, K. Q.; Saul, L. K., Unsupervised learning of image manifolds by semidefinite programming, Int. J. Comput. Vision, 70, 1, 77-90 (2006)
[9] Keuchel, J.; Schnörr, C.; Schellewald, C.; Cremers, D., Binary partitioning, perceptual grouping, and restoration with semidefinite programming, IEEE Trans. Pattern Anal. Mach. Intell., 25, 11, 1364-1379 (2003)
[10] Lanckriet, G. R.G.; Cristianini, N.; Bartlett, P.; Ghaoui, L. E.; Jordan, M. I., Learning the kernel matrix with semidefinite programming, J. Mach. Learn. Res., 5, 27-72 (2004) · Zbl 1222.68241
[11] Overton, M. L.; Womersley, R. S., On the sum of the largest eigenvalues of a symmetric matrix, SIAM J. Matrix Anal. Appl., 13, 1, 41-45 (1992) · Zbl 0747.15005
[12] Overton, M. L.; Womersley, R. S., Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices, Math. Program., 62, 1-3, 321-357 (1993) · Zbl 0806.90114
[13] S. Agarwal, M. Chandraker, F. Kahl, S. Belongie, D.J. Kriegman, Practical global optimization for multiview geometry, in: Proceedings of the European Conference on Computer Vision, vol. 1, Graz, Austria, 2006, pp. 592-605.; S. Agarwal, M. Chandraker, F. Kahl, S. Belongie, D.J. Kriegman, Practical global optimization for multiview geometry, in: Proceedings of the European Conference on Computer Vision, vol. 1, Graz, Austria, 2006, pp. 592-605.
[14] Borchers, B., CSDP, a C library for semidefinite programming, Optim. Methods Software, 11, 1, 613-623 (1999) · Zbl 0973.90524
[15] Sturm, J. F., Using SeDuMi 1.02, a matlab toolbox for optimization over symmetric cones (updated for version 1.05), Optim. Methods Software, 11-12, 625-653 (1999) · Zbl 0973.90526
[16] Schaible, S., Fractional programming. II on Dinkelbach’s algorithm, Manage. Sci., 22, 8, 868-873 (1976) · Zbl 0346.90052
[17] Yan, S.; Xu, D.; Zhang, B.; Zhang, H.; Yang, Q.; Lin, S., Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Trans. Pattern Anal. Mach. Intell., 29, 1, 40-51 (2007)
[18] R. Gilad-Bachrach, A. Navot, N. Tishby, Margin based feature selection-theory and algorithms, in: Proceedings of the International Conference on Machine Learning, Banff, Alberta, Canada, 2004.; R. Gilad-Bachrach, A. Navot, N. Tishby, Margin based feature selection-theory and algorithms, in: Proceedings of the International Conference on Machine Learning, Banff, Alberta, Canada, 2004. · Zbl 1078.68120
[19] R. Fransens, J. De Prins, L. Van Gool, SVM-based nonparametric discriminant analysis, an application to face detection, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 2, Nice, France, 2003, pp. 1289-1296.; R. Fransens, J. De Prins, L. Van Gool, SVM-based nonparametric discriminant analysis, an application to face detection, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 2, Nice, France, 2003, pp. 1289-1296.
[20] Li, H.; Jiang, T.; Zhang, K., Efficient and robust feature extraction by maximum margin criterion, IEEE Trans. Neural Network, 17, 1, 157-165 (2006)
[21] Xing, E.; Ng, A.; Jordan, M.; Russell, S., Distance metric learning with, application to clustering with side-information, (Proceedings on Advances in Neural Information Processing System (2002), MIT Press: MIT Press Cambridge, MA)
[22] K.Q. Weinberger, J. Blitzer, L.K. Saul, Distance metric learning for large margin nearest neighbor classification, in: Proceedings on Advances in Neural Information Processing System, 2005.; K.Q. Weinberger, J. Blitzer, L.K. Saul, Distance metric learning for large margin nearest neighbor classification, in: Proceedings on Advances in Neural Information Processing System, 2005.
[23] A. Globerson, S. Roweis, Metric learning by collapsing classes, in: Proceedings on Advances in Neural Information Processing System, 2005.; A. Globerson, S. Roweis, Metric learning by collapsing classes, in: Proceedings on Advances in Neural Information Processing System, 2005.
[24] Goldberger, J.; Roweis, S.; Hinton, G.; Salakhutdinov, R., Neighbourhood component analysis, (Proceedings on Advances in Neural Information Processing System (2004), MIT Press: MIT Press Cambridge, MA)
[25] Belkin, M.; Niyogi, P., Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput., 15, 6, 1373-1396 (2003) · Zbl 1085.68119
[26] D. Newman, S. Hettich, C. Blake, C. Merz, UCI repository of machine learning databases (1998).; D. Newman, S. Hettich, C. Blake, C. Merz, UCI repository of machine learning databases (1998).
[27] Jin, Z.; Yang, J.-Y.; Hu, Z.-S.; Lou, Z., Face recognition based on the uncorrelated discriminant transformation, Pattern Recognition, 34, 7, 1405-1416 (2001) · Zbl 0978.68118
[28] M. Heiler, C. Schnorr, Learning non-negative sparse image codes by convex programming, in: Proceedings of the IEEE International Conference on Computer Vision, Beijing, China, 2005, pp. 1667-1674.; M. Heiler, C. Schnorr, Learning non-negative sparse image codes by convex programming, in: Proceedings of the IEEE International Conference on Computer Vision, Beijing, China, 2005, pp. 1667-1674.
[29] M. Heiler, C. Schnörr, Controlling sparseness in non-negative tensor factorization, in: Proceedings of the European Conference on Computer Vision, vol. 1, Graz, Austria, 2006, pp. 56-67.; M. Heiler, C. Schnörr, Controlling sparseness in non-negative tensor factorization, in: Proceedings of the European Conference on Computer Vision, vol. 1, Graz, Austria, 2006, pp. 56-67.
[30] Hoyer, P. O., Non-negative matrix factorization with sparseness constraints, J. Mach. Learn. Res., 5, 1457-1469 (2004) · Zbl 1222.68218
[31] d’Aspremont, A.; Ghaoui, L. E.; Jordan, M. I.; Lanckriet, G. R.G., A direct formulation for sparse PCA using semidefinite programming, SIAM Rev., 49, 434-448 (2007) · Zbl 1128.90050
[32] Cadima, J.; Jolliffe, I., Loadings and correlations in the interpretation of principal components, Appl. Statist., 22, 203-214 (1995)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.