Trace optimization and eigenproblems in dimension reduction methods. (English) Zbl 1249.65075

Summary: This paper gives an overview of the eigenvalue problems encountered in areas of data mining related to dimension reduction. Given high-dimensional input data, the goal of dimension reduction is to map the data to a low-dimensional space such that certain properties of the original data are preserved. Optimizing these properties for the reduced data can typically be posed as a trace optimization problem that leads to an eigenvalue problem. There is a rich variety of such problems, and the goal of this paper is to unravel the relationships between them and to discuss effective solution techniques. First, we distinguish between projective methods, which determine an explicit linear mapping from the high-dimensional space to the low-dimensional space, and nonlinear methods, where the mapping between the two is nonlinear and implicit. We then show that all the eigenvalue problems solved in the context of explicit linear projections can be viewed as projected analogues of the nonlinear or implicit projections. We also discuss kernels as a means of unifying linear and nonlinear methods and revisit some of the equivalences between methods established in this way. Finally, we provide illustrative examples that showcase the behavior and the particular characteristics of the various dimension reduction techniques on real-world data sets.
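To make the core pattern concrete, here is a minimal NumPy sketch of the generic trace optimization the summary refers to: maximize tr(Vᵀ A V) over orthonormal V, which is solved by the top-d eigenvectors of the symmetric matrix A. This is an illustrative sketch, not the paper's code; taking A to be a sample covariance matrix recovers PCA as a special case, and all variable names here are assumptions chosen for the example.

```python
import numpy as np

# Generic trace maximization: max tr(V^T A V) subject to V^T V = I.
# For symmetric A, the optimum is spanned by the top-d eigenvectors of A.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))       # 200 samples, 10 features (synthetic data)
Xc = X - X.mean(axis=0)              # center the data
A = (Xc.T @ Xc) / (len(Xc) - 1)      # sample covariance: symmetric PSD

d = 2                                # target (reduced) dimension
evals, evecs = np.linalg.eigh(A)     # eigenvalues in ascending order
V = evecs[:, -d:]                    # top-d eigenvectors maximize the trace

Y = Xc @ V                           # explicit linear projection to R^d
print(Y.shape)                       # (200, 2)
```

The optimal objective value tr(Vᵀ A V) equals the sum of the d largest eigenvalues of A, which is why these dimension reduction criteria reduce to (partial) symmetric eigenvalue problems.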

MSC:

65F15 Numerical computation of eigenvalues and eigenvectors of matrices
90C90 Applications of mathematical programming

Software:

AR face; DIFFRAC
