Document Zbl 1273.62134

Sasaki, Hiroaki; Gutmann, Michael U.; Shouno, Hayaru; Hyvärinen, Aapo

Correlated topographic analysis: estimating an ordering of correlated components. (English) Zbl 1273.62134

Mach. Learn. 92, No. 2-3, 285-317 (2013).

Summary: This paper describes a novel method, which we call correlated topographic analysis (CTA), to estimate non-Gaussian components and their ordering (topography). The method is inspired by a central motivation of recent variants of independent component analysis (ICA), namely, to make use of the residual statistical dependency which ICA cannot remove. We assume that components nearby on the topographic arrangement have both linear and energy correlations, while far-away components are statistically independent. We use these dependencies to fix the ordering of the components. We start by proposing the generative model for the components. Then, we derive an approximation of the likelihood based on the model. Furthermore, since gradient methods tend to get stuck in local optima, we propose a three-step optimization method which dramatically improves topographic estimation. Using simulated data, we show that CTA estimates an ordering of the components and generalizes a previous method in terms of topography estimation. Finally, to demonstrate that CTA is widely applicable, we learn topographic representations for three kinds of real data: natural images, outputs of simulated complex cells and text data.
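The two kinds of dependency named in the summary can be illustrated with a small numerical sketch. It assumes the usual reading of "energy correlation" as the correlation between squared components. The toy data (a shared scale variable, mixing weights 0.6 and 0.8, an independent Laplace component) and the helper name `linear_and_energy_correlation` are illustrative assumptions, not the paper's generative model or its estimation procedure.

```python
import numpy as np


def linear_and_energy_correlation(s):
    """Return (linear, energy) correlation matrices for components stored in the rows of s.

    'Energy' correlation is taken here to mean the correlation of squared components.
    """
    linear = np.corrcoef(s)
    energy = np.corrcoef(s ** 2)
    return linear, energy


rng = np.random.default_rng(0)
n = 50_000

# Toy data, assumed for illustration only (not the CTA generative model):
# components 0 and 1 share a scale variable and a common Gaussian source,
# so they are both linearly and energy correlated; component 2 is independent.
z = rng.standard_normal((2, n))
shared_scale = rng.uniform(0.5, 1.5, size=n)
s0 = shared_scale * z[0]
s1 = shared_scale * (0.6 * z[0] + 0.8 * z[1])
s2 = rng.laplace(size=n)
s = np.vstack([s0, s1, s2])

linear, energy = linear_and_energy_correlation(s)
print("linear correlation, neighbours (0,1):", round(float(linear[0, 1]), 3))
print("energy correlation, neighbours (0,1):", round(float(energy[0, 1]), 3))
print("linear correlation, distant (0,2):   ", round(float(linear[0, 2]), 3))
print("energy correlation, distant (0,2):   ", round(float(energy[0, 2]), 3))
```

In this toy setting the neighbouring pair shows clearly nonzero values for both measures, while the pair involving the independent component stays near zero. This is the pattern the summary describes for nearby versus far-away components.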

Cited in 1 Document

MSC:

62H20	Measures of association (correlation, canonical correlation, etc.)
62H25	Factor analysis and principal components; correspondence analysis
68T50	Natural language processing

Keywords:

independent component analysis; topographic representation; natural image statistics; higher-order features; natural language processing

Software:

Python; WordNet; NLTK



References:

[1]	Amari, S., Cichocki, A., & Yang, H. H. (1996). A new learning algorithm for blind signal separation. No. 8, 757-763.
[2]	Andrews, D. F., & Mallows, C. L. (1974). Scale mixtures of normal distributions. Journal of the Royal Statistical Society. Series B (Methodological), 36(1), 99-102. · Zbl 0282.62017
[3]	Bach, F. R., & Jordan, M. I. (2003). Beyond independent components: trees and clusters. Journal of Machine Learning Research, 4, 1205-1233. · Zbl 1061.62095
[4]	Bell, A. J., & Sejnowski, T. J. (1997). The “independent components” of natural scenes are edge filters. Vision Research, 37(23), 3327-3338. · doi:10.1016/S0042-6989(97)00121-1
[5]	Bellman, R. E. (1957). Dynamic programming. Princeton: Princeton University Press. · Zbl 0077.13605
[6]	Bellman, R. E., & Dreyfus, S. E. (1962). Applied dynamic programming. Princeton: Princeton University Press. · Zbl 0106.34901
[7]	Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python. Sebastopol: O’Reilly Media. · Zbl 1187.68630
[8]	Coen-Cagli, R., Dayan, P., & Schwartz, O. (2012). Cortical surround interactions and perceptual salience via natural scene statistics. PLoS Computational Biology, 8(3), e1002405. · doi:10.1371/journal.pcbi.1002405
[9]	Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36(3), 287-314. · Zbl 0791.62004 · doi:10.1016/0165-1684(94)90029-9
[10]	Fellbaum, C. (1998). WordNet: an electronic lexical database. Cambridge: MIT. · Zbl 0913.68054
[11]	Gómez-Herrero, G., Atienza, M., Egiazarian, K., & Cantero, J. L. (2008). Measuring directional coupling between EEG sources. NeuroImage, 43(3), 497-508. · doi:10.1016/j.neuroimage.2008.07.032
[12]	Gutmann, M. U., & Hyvärinen, A. (2012). Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. Journal of Machine Learning Research, 13, 307-361. · Zbl 1283.62064
[13]	Held, M., & Karp, R. M. (1962). A dynamic programming approach to sequencing problems. Journal of the Society for Industrial and Applied Mathematics, 10(1), 196-210. · Zbl 0106.14103 · doi:10.1137/0110015
[14]	Honkela, T., Hyvärinen, A., & Väyrynen, J. J. (2010). WordICA: emergence of linguistic representations for words by independent component analysis. Natural Language Engineering, 16(3), 277-308. · doi:10.1017/S1351324910000057
[15]	Hoyer, P. O., & Hyvärinen, A. (2002). A multi-layer sparse coding network learns contour coding from natural images. Vision Research, 42(12), 1593-1605. · doi:10.1016/S0042-6989(02)00017-2
[16]	Hyvärinen, A. (2006). Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6, 695-708. · Zbl 1222.62051
[17]	Hyvärinen, A., & Hoyer, P. O. (2001). A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vision Research, 41(18), 2413-2423. · doi:10.1016/S0042-6989(01)00114-6
[18]	Hyvärinen, A., & Oja, E. (2000). Independent component analysis: algorithms and applications. Neural Networks, 13(4-5), 411-430. · doi:10.1016/S0893-6080(00)00026-5
[19]	Hyvärinen, A., Hoyer, P. O., & Inki, M. (2001). Topographic independent component analysis. Neural Computation, 13(7), 1527-1558. · Zbl 1009.62049 · doi:10.1162/089976601750264992
[20]	Hyvärinen, A., Gutmann, M., & Hoyer, P. O. (2005). Statistical model of natural stimuli predicts edge-like pooling of spatial frequency channels in V2. BMC Neuroscience, 6, 12. · doi:10.1186/1471-2202-6-12
[21]	Hyvärinen, A., Hurri, J., & Hoyer, P. O. (2009). Natural image statistics: a probabilistic approach to early computational vision. Berlin: Springer. · Zbl 1178.68622
[22]	Isserlis, L. (1918). On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables. Biometrika, 12(1/2), 134-139. · doi:10.2307/2331932
[23]	Karklin, Y., & Lewicki, M. S. (2005). A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals. Neural Computation, 17(2), 397-423. · Zbl 1092.93614 · doi:10.1162/0899766053011474
[24]	Kavukcuoglu, K., Ranzato, M. A., Fergus, R., & LeCun, Y. (2009). Learning invariant features through topographic filter maps. New York, 1605-1612. · doi:10.1109/CVPR.2009.5206545
[25]	Kolenda, T., Hansen, L. K., & Sigurdsson, S. (2000). Independent components in text. Berlin, 229-250.
[26]	Mairal, J., Jenatton, R., Obozinski, G., & Bach, F. (2011). Convex and network flow optimization for structured sparsity. Journal of Machine Learning Research, 12, 2681-2720. · Zbl 1280.68179
[27]	Michalowicz, J. V., Nichols, J. M., Bucholtz, F., & Olson, C. C. (2009). An Isserlis’ theorem for mixed Gaussian variables: application to the auto-bispectral density. Journal of Statistical Physics, 136(1), 89-102. · Zbl 1179.60047 · doi:10.1007/s10955-009-9768-3
[28]	Miller, G. A. (1995). WordNet: a lexical database for English. Communications of the ACM, 38(11), 39-41. · doi:10.1145/219717.219748
[29]	Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607-609. · doi:10.1038/381607a0
[30]	Osindero, S., Welling, M., & Hinton, G. E. (2006). Topographic product models applied to natural scene statistics. Neural Computation, 18(2), 381-414. · Zbl 1095.68648 · doi:10.1162/089976606775093936
[31]	Rasmussen, C. E. (2006). Conjugate gradient algorithm, version 2006-09-08.
[32]	Simoncelli, E. P. (1999). Modeling the joint statistics of images in the wavelet domain. No. 3813, 188-195.
[33]	Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society. Series B. Statistical Methodology, 67(1), 91-108. · Zbl 1060.62049 · doi:10.1111/j.1467-9868.2005.00490.x
[34]	Vigário, R., Särelä, J., Jousmäki, V., Hämäläinen, M., & Oja, E. (2000). Independent component approach to the analysis of EEG and MEG recordings. IEEE Transactions on Biomedical Engineering, 47(5), 589-593. · doi:10.1109/10.841330
[35]	Zoran, D., & Weiss, Y. (2009). The “tree-dependent components” of natural images are edge filters. No. 22, 2340-2348. · Zbl 1211.05147

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced with data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.