doi 10.1371/journal.pcbi.1003963

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

Authors: Charles F. Cadieu, Ha Hong, Daniel L. K. Yamins, Nicolas Pinto, Diego Ardila, Ethan A. Solomon, Najib J. Majaj, James J. DiCarlo

Abstract: The primate visual system achieves remarkable visual object recognition performance even in brief presentations and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. A major difficulty in producing such a comparison accurately has been the lack of a unifying metric that accounts for experimental limitations such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations such as the complexity of the decoding classifier and the number of classifier training examples. In this work we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of "kernel analysis" that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds.

Submitted 12 June, 2014; originally announced June 2014.

Comments: 35 pages, 12 figures, extends and expands upon arXiv:1301.3530
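
The abstract above refers to measures of representational similarity to IT without spelling them out in this listing. One common way to compare two representations is to build a dissimilarity matrix over images for each and rank-correlate the two matrices. The sketch below illustrates that general idea only; it is not necessarily the procedure used in the paper. All function names are hypothetical and the data are synthetic.

```python
import numpy as np

def rdm(features):
    """Dissimilarity matrix: 1 - Pearson correlation between the feature
    vectors of every pair of images (rows = images, columns = units)."""
    return 1.0 - np.corrcoef(features)

def upper_triangle(matrix):
    """Off-diagonal upper-triangular entries as a flat vector."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def spearman(a, b):
    """Spearman rank correlation. Assumes no ties, which is the generic
    case for real-valued data like the toy example below."""
    rank_a = np.argsort(np.argsort(a)).astype(float)
    rank_b = np.argsort(np.argsort(b)).astype(float)
    return np.corrcoef(rank_a, rank_b)[0, 1]

def rdm_similarity(features_a, features_b):
    """Rank correlation between the dissimilarity structure of two representations."""
    return spearman(upper_triangle(rdm(features_a)),
                    upper_triangle(rdm(features_b)))

# Synthetic stand-ins (assumptions, not real recordings or model outputs):
# a "model" and a "neural" representation driven by the same latent factors,
# plus an unrelated representation as a control.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 5))                      # 40 images, 5 factors
model = latent @ rng.normal(size=(5, 30))
neural = latent @ rng.normal(size=(5, 20)) + 0.1 * rng.normal(size=(40, 20))
unrelated = rng.normal(size=(40, 20))

sim_related = rdm_similarity(model, neural)
sim_unrelated = rdm_similarity(model, unrelated)
print(sim_related, sim_unrelated)
```

In this toy setting the representations that share latent structure should score higher than the unrelated control.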

arXiv:1301.3530

The Neural Representation Benchmark and its Evaluation on Brain and Machine

Authors: Charles F. Cadieu, Ha Hong, Dan Yamins, Nicolas Pinto, Najib J. Majaj, James J. DiCarlo

Abstract: A key requirement for the development of effective learning representations is their evaluation and comparison to representations we know to be effective. In natural sensory domains, the community has viewed the brain as a source of inspiration and as an implicit benchmark for success. However, it has not been possible to test representational learning algorithms directly against the representations contained in neural systems. Here, we propose a new benchmark for visual representations on which we have directly tested the neural representation in multiple visual cortical areas in macaque (utilizing data from [Majaj et al., 2012]), and on which any computer vision algorithm that produces a feature space can be tested. The benchmark measures the effectiveness of the neural or machine representation by computing the classification loss on the ordered eigendecomposition of a kernel matrix [Montavon et al., 2011]. In our analysis we find that the neural representation in visual area IT is superior to visual area V4. In our analysis of representational learning algorithms, we find that three-layer models approach the representational performance of V4 and the algorithm in [Le et al., 2012] surpasses the performance of V4. Impressively, we find that a recent supervised algorithm [Krizhevsky et al., 2012] achieves performance comparable to that of IT for an intermediate level of image variation difficulty, and surpasses IT at a higher difficulty level. We believe this result represents a major milestone: it is the first learning algorithm we have found that exceeds our current estimate of IT representation performance. We hope that this benchmark will assist the community in matching the representational performance of visual cortex and will serve as an initial rallying point for further correspondence between representations derived in brains and machines.

Submitted 25 January, 2013; v1 submitted 15 January, 2013; originally announced January 2013.

Comments: The v1 version contained incorrectly computed kernel analysis curves and KA-AUC values for V4, IT, and the HT-L3 models. They have been corrected in this version
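
The benchmark described above computes a loss on the ordered eigendecomposition of a kernel matrix. A minimal sketch of that idea follows, under several assumptions not stated in the abstract: a Gaussian kernel with a single fixed width (set by a median-distance heuristic), kernel centering, and squared error on one-hot labels standing in for the classification loss. Function names and the toy data are hypothetical.

```python
import numpy as np

def squared_distances(x):
    """Pairwise squared Euclidean distances between rows of x."""
    sq = np.sum(x ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)

def kernel_analysis_curve(features, labels, sigma, max_components):
    """Residual label error after projecting onto the leading d kernel
    principal components, for d = 1..max_components."""
    n = features.shape[0]
    kernel = np.exp(-squared_distances(features) / (2.0 * sigma ** 2))
    centering = np.eye(n) - np.ones((n, n)) / n
    centered = centering @ kernel @ centering
    eigvals, eigvecs = np.linalg.eigh(centered)        # ascending order
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]    # reorder: largest first
    classes = np.unique(labels)
    targets = (labels[:, None] == classes[None, :]).astype(float)
    targets -= targets.mean(axis=0)                    # centered one-hot labels
    errors = []
    for d in range(1, max_components + 1):
        basis = eigvecs[:, :d]
        residual = targets - basis @ (basis.T @ targets)
        errors.append(np.mean(residual ** 2))
    return np.array(errors)

def median_distance(x):
    """Median pairwise distance, used here as an assumed kernel width."""
    d2 = squared_distances(x)
    return np.sqrt(np.median(d2[np.triu_indices(len(x), k=1)]))

# Toy comparison: a feature space that separates two classes versus one
# that carries no label information.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 30)
informative = rng.normal(size=(60, 10))
informative[:, 0] += 3.0 * (2 * labels - 1)            # class-dependent shift
uninformative = rng.normal(size=(60, 10))

curve_info = kernel_analysis_curve(informative, labels,
                                   median_distance(informative), 10)
curve_noise = kernel_analysis_curve(uninformative, labels,
                                    median_distance(uninformative), 10)
print(curve_info[:3], curve_noise[:3])
```

A representation that makes the labels easy to read out reaches low error with few components, so its curve drops faster. Because the projection subspaces are nested, each curve is non-increasing in the number of components.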

arXiv:1011.4058

Modeling Image Structure with Factorized Phase-Coupled Boltzmann Machines

Authors: Charles F. Cadieu, Kilian Koepsell

Abstract: We describe a model for capturing the statistical structure of local amplitude and local spatial phase in natural images. The model is based on a recently developed, factorized third-order Boltzmann machine that was shown to be effective at capturing higher-order structure in images by modeling dependencies among squared filter outputs (Ranzato and Hinton, 2010). Here, we extend this model to $L_p$-spherically symmetric subspaces. In order to model local amplitude and phase structure in images, we focus on the case of two dimensional subspaces, and the $L_2$-norm. When trained on natural images the model learns subspaces resembling quadrature-pair Gabor filters. We then introduce an additional set of hidden units that model the dependencies among subspace phases. These hidden units form a combinatorial mixture of phase coupling distributions, concentrated in the sum and difference of phase pairs. When adapted to natural images, these distributions capture local spatial phase structure in natural images.

Submitted 17 November, 2010; originally announced November 2010.

Comments: 11 pages, 6 figures
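
The abstract above mentions two-dimensional subspaces that resemble quadrature-pair Gabor filters, from which a local amplitude and phase can be read off. The sketch below illustrates only that readout with hand-built filters; it does not implement the Boltzmann machine or its learning. The filter parameters, function names, and test grating are assumptions made for illustration.

```python
import numpy as np

def pixel_grid(size):
    """Integer pixel coordinates centered on zero (size should be odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    return x, y

def gabor_quadrature_pair(size, wavelength, theta, sigma):
    """Even (cosine) and odd (sine) Gabor filters sharing one Gaussian envelope."""
    x, y = pixel_grid(size)
    along = x * np.cos(theta) + y * np.sin(theta)      # axis of the carrier
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = 2.0 * np.pi * along / wavelength
    return envelope * np.cos(carrier), envelope * np.sin(carrier)

def amplitude_and_phase(patch, even, odd):
    """L2 norm and angle of the patch's projection onto the 2-D subspace."""
    a = np.sum(patch * even)
    b = np.sum(patch * odd)
    return np.hypot(a, b), np.arctan2(b, a)

def grating(size, wavelength, theta, phase):
    """Sinusoidal test image with a known phase offset."""
    x, y = pixel_grid(size)
    along = x * np.cos(theta) + y * np.sin(theta)
    return np.cos(2.0 * np.pi * along / wavelength - phase)

even, odd = gabor_quadrature_pair(size=31, wavelength=6.0, theta=0.0, sigma=4.0)
amp_a, phase_a = amplitude_and_phase(grating(31, 6.0, 0.0, 0.7), even, odd)
amp_b, phase_b = amplitude_and_phase(grating(31, 6.0, 0.0, 2.0), even, odd)
print(amp_a, phase_a, amp_b, phase_b)
```

Shifting the grating changes the recovered phase while leaving the amplitude nearly unchanged, which is the sense in which the pair separates amplitude from phase.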

arXiv:0906.3844

Phase coupling estimation from multivariate phase statistics

Authors: Charles F. Cadieu, Kilian Koepsell

Abstract: Coupled oscillators are prevalent throughout the physical world. Dynamical system formulations of weakly coupled oscillator systems have proven effective at capturing the properties of real-world systems. However, these formulations usually deal with the 'forward problem': simulating a system from known coupling parameters. Here we provide a solution to the 'inverse problem': determining the coupling parameters from measurements. Starting from the dynamic equations of a system of coupled phase oscillators, given by a nonlinear Langevin equation, we derive the corresponding equilibrium distribution. This formulation leads us to the maximum entropy distribution that captures pair-wise phase relationships. To solve the inverse problem for this distribution, we derive a closed form solution for estimating the phase coupling parameters from observed phase statistics. Through simulations, we show that the algorithm performs well in high dimensions (d=100) and in cases with limited data (as few as 100 samples per dimension). Because the distribution serves as the unique maximum entropy solution for pairwise phase statistics, the distribution and estimation technique can be broadly applied to phase coupling estimation in any system of phase oscillators.

Submitted 21 June, 2009; originally announced June 2009.

Comments: revtex, 4 pages, 3 figures
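
To make the 'forward problem' in the abstract above concrete, the sketch below simulates a small system of noisy coupled phase oscillators with an Euler-Maruyama step and then measures pairwise phase statistics from the trajectory. The specific drift term (sine coupling with zero preferred offset and no intrinsic frequencies), the parameter values, and the function names are assumptions; the abstract does not give the paper's equations, and the closed-form estimator for the inverse problem is not reproduced here.

```python
import numpy as np

def simulate_coupled_phases(kappa, noise_strength, dt, n_steps, rng):
    """Euler-Maruyama integration of
    d(theta_i) = -sum_j kappa_ij * sin(theta_i - theta_j) dt + sqrt(2 D) dW_i."""
    n = kappa.shape[0]
    theta = rng.uniform(-np.pi, np.pi, size=n)
    trajectory = np.empty((n_steps, n))
    step_noise = np.sqrt(2.0 * noise_strength * dt)
    for t in range(n_steps):
        diff = theta[:, None] - theta[None, :]
        drift = -np.sum(kappa * np.sin(diff), axis=1)
        theta = theta + dt * drift + step_noise * rng.normal(size=n)
        trajectory[t] = theta
    return trajectory

def pairwise_phase_statistics(trajectory):
    """Entry (i, j) is the time average of exp(1j * (theta_i - theta_j))."""
    z = np.exp(1j * trajectory)
    return z.T @ z.conj() / trajectory.shape[0]

# Three oscillators: 0 and 1 are coupled, 2 is independent of both.
kappa = np.array([[0.0, 2.0, 0.0],
                  [2.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
rng = np.random.default_rng(0)
trajectory = simulate_coupled_phases(kappa, noise_strength=0.5,
                                     dt=0.01, n_steps=20000, rng=rng)
locking = np.abs(pairwise_phase_statistics(trajectory))
print(np.round(locking, 2))
```

The coupled pair should show a phase-difference statistic with magnitude well above that of the uncoupled pairs. Statistics of this kind are the input an inverse method would work from.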

arXiv:0809.4291

A multivariate phase distribution and its estimation

Authors: Charles F. Cadieu, Kilian Koepsell

Abstract: Circular variables such as phase or orientation have received considerable attention throughout the scientific and engineering communities and have recently been quite prominent in the field of neuroscience. While many analytic techniques have used phase as an effective representation, there has been little work on techniques that capture the joint statistics of multiple phase variables. In this paper we introduce a distribution that captures empirically observed pair-wise phase relationships. Importantly, we have developed a computationally efficient and accurate technique for estimating the parameters of this distribution from data. We show that the algorithm performs well in high dimensions (d=100), and in cases with limited data (as few as 100 samples per dimension). We also demonstrate how this technique can be applied to electrocorticography (ECoG) recordings to investigate the coupling of brain areas during different behavioral states. This distribution and estimation technique can be broadly applied to any setting that produces multiple circular variables.

Submitted 21 June, 2009; v1 submitted 24 September, 2008; originally announced September 2008.

Comments: 9 pages, 5 figures, minor change in conventions and minor errors corrected
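
The abstract above does not state the functional form of the distribution. As an illustration, the sketch below assumes a standard exponential-family form for pairwise phase relationships, with unnormalized log-density $\sum_{i<j} \kappa_{ij} \cos(\theta_i - \theta_j - \mu_{ij})$, and checks numerically that it equals a Hermitian quadratic form in the unit-circle variables $z_i = e^{i\theta_i}$. The sign convention, parameter names, and function names are assumptions and may differ from the paper's.

```python
import numpy as np

def log_density_pairwise(theta, kappa, mu):
    """Unnormalized log-density as an explicit sum over pairs i < j.
    kappa: symmetric coupling strengths; mu: antisymmetric preferred offsets."""
    n = len(theta)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += kappa[i, j] * np.cos(theta[i] - theta[j] - mu[i, j])
    return total

def log_density_hermitian(theta, kappa, mu):
    """The same quantity written as 0.5 * z^H K z with K_ij = kappa_ij * exp(1j * mu_ij)."""
    z = np.exp(1j * theta)
    coupling = kappa * np.exp(1j * mu)      # Hermitian with zero diagonal
    return 0.5 * np.real(z.conj() @ coupling @ z)

# Random parameters that satisfy the symmetry requirements.
rng = np.random.default_rng(0)
n = 4
upper = np.triu(rng.uniform(0.0, 2.0, size=(n, n)), k=1)
kappa = upper + upper.T                     # symmetric, zero diagonal
offsets = np.triu(rng.uniform(-np.pi, np.pi, size=(n, n)), k=1)
mu = offsets - offsets.T                    # antisymmetric
theta = rng.uniform(-np.pi, np.pi, size=n)

value_sum = log_density_pairwise(theta, kappa, mu)
value_quad = log_density_hermitian(theta, kappa, mu)
print(value_sum, value_quad)
```

The quadratic form is convenient because the coupling strengths and preferred offsets collapse into a single complex matrix, which keeps evaluation vectorized as the number of phase variables grows.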

Showing 1–5 of 5 results for author: Cadieu, C F