
Nonlinear dimensionality reduction using a temporal coherence principle. (English) Zbl 1242.68228

Summary: The temporal coherence principle is an attractive, biologically inspired learning rule for extracting slowly varying features from quickly varying input data. In this paper we develop a new nonlinear neighborhood preserving (NNP) technique that uses the temporal coherence principle to find an optimal low-dimensional representation of the original high-dimensional data. NNP is based on a nonlinear expansion of the original input data, such as polynomials of a given degree. It can be solved as an eigenvalue problem without gradient descent and is guaranteed to find the global optimum. NNP can be viewed as a nonlinear dimensionality reduction framework that handles both time series and data sets without an obvious temporal structure. For these different situations we introduce three NNP algorithms, named NNP-1, NNP-2, and NNP-3. The objective function of NNP-1 is equal to that of slow feature analysis (SFA), and it works well for time series such as image sequences. NNP-2 artificially constructs time series consisting of neighboring points for data sets without a clear temporal structure, such as image data. NNP-3 is proposed for classification tasks; it minimizes the distances between neighboring points in the embedding space while simultaneously keeping the remaining points as far apart as possible. Furthermore, the kernel extension of NNP is also discussed in this paper. The proposed algorithms perform very well on several image sequences and image data sets compared with other methods. We also perform the classification task on the MNIST handwritten digit database using the supervised NNP algorithms. The experimental results demonstrate that NNP is an effective technique for nonlinear dimensionality reduction tasks.
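The slowness objective shared by NNP-1 and SFA, as described above, can be sketched as follows: expand the input nonlinearly (e.g., with degree-2 polynomials), then find projections whose outputs vary as slowly as possible over time, which reduces to a generalized eigenvalue problem. The NumPy/SciPy sketch below is illustrative only; the function names, the degree-2 expansion, and the regularization term are assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.linalg import eigh


def quadratic_expand(X):
    # Degree-2 polynomial expansion of each row: [x_1..x_d, x_i * x_j for i <= j].
    n, d = X.shape
    cross = np.einsum('ni,nj->nij', X, X)
    iu = np.triu_indices(d)
    return np.hstack([X, cross[:, iu[0], iu[1]]])


def slow_features(X, n_out=2):
    """Extract the n_out slowest features of time series X (rows ordered in time)."""
    H = quadratic_expand(X)
    H = H - H.mean(axis=0)            # center the expanded signals
    dH = np.diff(H, axis=0)           # temporal differences
    A = dH.T @ dH / len(dH)           # slowness (difference-covariance) matrix
    B = H.T @ H / len(H)              # covariance of the expanded signals
    # Slowest directions = smallest generalized eigenvalues of A w = lambda B w;
    # a tiny ridge keeps B numerically positive definite (an assumption here).
    evals, evecs = eigh(A, B + 1e-8 * np.eye(B.shape[0]))
    W = evecs[:, :n_out]
    return H @ W, evals[:n_out]
```

Because the optimum comes from an eigendecomposition rather than gradient descent, the solution is globally optimal for the chosen expansion, matching the property noted in the summary.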

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T10 Pattern recognition, speech recognition

Software:

MNIST; Bubbles; R; sfa

References:

[1] O. Arandjelovic, G. Shakhnarovich, J. Fisher, R. Cipolla, T. Darrell, Face recognition with image sets using manifold density divergence, in: IEEE Conference on Computer Vision and Pattern Recognition, vol. 15, 2005, pp. 581-588.
[2] Balasubramanian, M.; Schwartz, E. L., The isomap algorithm and topological stability, Science, 295, 5552, 9 (2002)
[3] Belkin, M.; Niyogi, P., Laplacian Eigenmaps and spectral techniques for embedding and clustering, Advances in Neural Information Processing Systems, 14, 585-591 (2001)
[4] Belkin, M.; Niyogi, P., Laplacian Eigenmaps for dimensionality reduction and data representation, Neural Computation, 15, 6, 1373-1396 (2003) · Zbl 1085.68119
[5] Belkin, M.; Niyogi, P.; Sindhwani, V., Manifold regularization: a geometric framework for learning from labeled and unlabeled examples, Journal of Machine Learning Research, 7, 2399-2434 (2006) · Zbl 1222.68144
[6] P. Berkes, Temporal Slowness as an Unsupervised Learning Principle: Self-organization of Complex-cell Receptive Fields and Application to Pattern Recognition, Ph.D. Thesis, Institute for Theoretical Biology, Humboldt University, Berlin, 2005.
[7] Berkes, P., Pattern recognition with slow feature analysis, Cognitive Sciences EPrint Archive (CogPrint), 4104 (2005)
[8] Chelidze, D.; Zhou, W., Smooth orthogonal decomposition based modal analysis, Journal of Sound and Vibration, 292, 3-5, 461-473 (2006)
[9] Chelidze, D.; Liu, M., Reconstructing slow-time dynamics from fast-time measurements, Philosophical Transactions of the Royal Society A, 366, 729-745 (2008) · Zbl 1153.74327
[10] Chelidze, D.; Cusumano, J. P., Phase space warping: nonlinear time series analysis for slowly drifting systems, Philosophical Transactions of the Royal Society A, 364, 2495-2513 (2006) · Zbl 1152.37346
[11] Coifman, R. R.; Lafon, S.; Lee, A. B.; Maggioni, M.; Nadler, B.; Warner, F.; Zucker, S. W., Geometric diffusions as a tool for harmonic analysis and structure definition of data. Part I: Diffusion maps, Proceedings of the National Academy of Sciences, 102, 21, 7426-7431 (2005) · Zbl 1405.42043
[12] Demartines, P.; Hérault, J., Curvilinear component analysis: a self-organizing neural network for nonlinear mapping of data sets, IEEE Transactions on Neural Networks, 8, 1, 148-154 (1997)
[13] D.L. Donoho, C. Grimes, Hessian Eigenmaps: New Locally Linear Embedding Techniques for High-dimensional Data, Technical Report TR-2003-08, Department of Statistics, Stanford University, 2003. · Zbl 1130.62337
[14] Einhäuser, W.; Hipp, J.; Eggert, J.; Körner, E.; König, P., Learning viewpoint invariant object representation using a temporal coherence principle, Biological Cybernetics, 93, 79-90 (2005) · Zbl 1123.91365
[15] Földiák, P., Learning invariance from transformation sequences, Neural Computation, 3, 194-200 (1991)
[16] Geng, X.; Zhan, D. C.; Zhou, Z. H., Supervised nonlinear dimensionality reduction for visualization and classification, IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, 35, 6, 1098-1107 (2005)
[17] A. Guérin-Dugué, P. Teissier, G. Delso-Gafaro, J. Hérault, Curvilinear component analysis for high-dimensional data representation: II. Examples of introducing additional mapping constraints for specific applications, in: International Work-Conference on Artificial and Natural Neural Networks, Alicante, Spain, 1999.
[18] He, X. F.; Niyogi, P., Locality preserving projections, Advances in Neural Information Processing Systems, 16 (2003)
[19] Hinton, G., Connectionist learning procedures, Artificial Intelligence, 40, 185-234 (1989)
[20] Hurri, J.; Hyvärinen, A., Simple-cell-like receptive fields maximize temporal coherence in natural video, Neural Computation, 15, 3, 663-691 (2003) · Zbl 1046.92015
[21] Hyvärinen, A.; Hurri, J.; Väyrynen, J., Bubbles: a unifying framework for low-level statistical properties of natural image sequences, Journal of the Optical Society of America A, 20, 7, 1237-1252 (2003)
[22] C. Kayser, W. Einhäuser, O. Dümmer, P. König, K. Körding, Extracting slow subspaces from natural videos leads to complex cells, in: International Conference on Artificial Neural Networks, Vienna, Austria, 2001, pp. 1075-1080. · Zbl 1001.68682
[23] Keysers, D.; Deselaers, T.; Gollan, C.; Ney, H., Deformation models for image recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 8, 1422-1435 (2007)
[24] Kohonen, T., Self-organized formation of topologically correct feature maps, Biological Cybernetics, 43, 59-69 (1982) · Zbl 0466.92002
[25] Körding, K.; Kayser, C.; Einhäuser, W.; König, P., How are complex cell properties adapted to the statistics of natural scenes?, Journal of Neurophysiology, 91, 1, 206-212 (2004)
[26] Lauer, F.; Suen, C. Y.; Bloch, G., A trainable feature extractor for handwritten digit recognition, Pattern Recognition, 40, 6, 1816-1824 (2007) · Zbl 1114.68515
[27] LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P., Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86, 11, 2278-2324 (1998)
[28] Li, J. B.; Pan, J. S.; Chu, S. C., Kernel class-wise locality preserving projection, Information Sciences, 178, 7, 1825-1835 (2008) · Zbl 1133.68398
[29] De Maesschalck, R.; Jouan-Rimbaud, D.; Massart, D. L., The Mahalanobis distance, Chemometrics and Intelligent Laboratory Systems, 50, 1, 1-18 (2000)
[30] McFarland, J. L.; Fuchs, A. F., Discharge patterns in nucleus prepositus hypoglossi and adjacent medial vestibular nucleus during horizontal eye movement in behaving macaques, Journal of Neurophysiology, 68, 7, 319-332 (1992)
[31] Mitchison, G., Removing time variation with the anti-Hebbian differential synapse, Neural Computation, 3, 312-320 (1991)
[32] A. Mojsilovic, B.E. Rogowitz, Capturing image semantics with low-level descriptors, in: International Conference on Image Processing, I, Thessaloniki, Greece, 2001, pp. 18-21.
[33] Nadler, B.; Lafon, S.; Coifman, R. R.; Kevrekidis, I. G., Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators, Advances in Neural Information Processing Systems, 18 (2005)
[34] Ranzato, M.; Poultney, C. S.; Chopra, S.; LeCun, Y., Efficient learning of sparse representations with an energy-based model, Advances in Neural Information Processing Systems, 19 (2006)
[35] Roweis, S. T.; Saul, L. K., Nonlinear dimensionality reduction by locally linear embedding, Science, 290, 12, 2323-2326 (2000)
[36] S.A. Sarcia, G. Cantone, V.R. Basili, Adopting curvilinear component analysis to improve software cost estimation accuracy: model, application strategy, and an experimental verification, in: Evaluation and Assessment in Software Engineering, University of Bari, Italy, 2008.
[37] Schölkopf, B.; Smola, A. J.; Müller, K.-R., Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10, 5, 1299-1319 (1998)
[38] Seung, H. S.; Lee, D. D., The manifold ways of perception, Science, 290, 12, 2268-2269 (2000)
[39] P.Y. Simard, D. Steinkraus, J.C. Platt, Best practices for convolutional neural networks applied to visual document analysis, in: International Conference on Document Analysis and Recognition, vol. 2, 2003, pp. 958-962.
[40] Song, Y. Q.; Nie, F. P.; Zhang, C. S.; Xiang, S. M., A unified framework for semi-supervised dimensionality reduction, Pattern Recognition, 41, 9, 2789-2799 (2008) · Zbl 1154.68501
[41] Stone, J. V.; Bray, A. J., A learning rule for extracting spatio-temporal invariances, Network: Computation in Neural Systems, 6, 3, 429-436 (1995) · Zbl 0828.92008
[42] Taube, J. S., Head direction cells and the neurophysiological basis for a sense of direction, Progress in Neurobiology, 55, 3, 225-256 (1998)
[43] Teissier, P.; Guérin-Dugué, A.; Schwartz, J. L., Models for audiovisual fusion in a noisy-vowel recognition task, Journal of VLSI Signal Processing, 20, 25-44 (1998)
[44] Tenenbaum, J. B.; de Silva, V.; Langford, J. C., A global geometric framework for nonlinear dimensionality reduction, Science, 290, 12, 2319-2323 (2000)
[45] Vathy-Fogarassy, A.; Abonyi, J., Local and global mappings of topology representing networks, Information Sciences, 179, 21, 3791-3803 (2009)
[46] Wang, J.; Zhang, Z. Y.; Zha, H. Y., Adaptive manifold learning, Advances in Neural Information Processing Systems, 17 (2005)
[47] K.Q. Weinberger, F. Sha, L.K. Saul, Learning a kernel matrix for nonlinear dimensionality reduction, in: International Conference on Machine Learning, 2004.
[48] Wiskott, L.; Sejnowski, T., Slow feature analysis: unsupervised learning of invariances, Neural Computation, 14, 4, 715-770 (2002) · Zbl 0994.68591
[49] Yang, J.; Zhang, D.; Yang, J. Y.; Niu, B., Globally maximizing, locally minimizing: unsupervised discriminant projection with applications to face and palm biometrics, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 4, 650-664 (2007)
[50] Yan, S. C.; Xu, D.; Zhang, B. Y.; Zhang, H. J.; Yang, Q. A.; Lin, S., Graph embedding and extensions: a general framework for dimensionality reduction, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 1, 40-51 (2007)
[51] Yin, H. J.; Huang, W. L., Adaptive nonlinear manifolds and their applications to pattern recognition, Information Sciences, 180, 14, 2649-2662 (2010) · Zbl 1205.68354
[52] Zhang, Z. Y.; Zha, H. Y., Principal manifolds and nonlinear dimensionality reduction via tangent space alignment, SIAM Journal on Scientific Computing, 26, 1, 313-338 (2004) · Zbl 1077.65042