Document Zbl 1521.92015

Sanchez-Garcia, Melani; Chauhan, Tushar; Cottereau, Benoit R.; Beyeler, Michael

Efficient multi-scale representation of visual objects using a biologically plausible spike-latency code and winner-take-all inhibition. (English) Zbl 1521.92015

Biol. Cybern. 117, No. 1-2, 95-111 (2023).

Summary: Deep neural networks have surpassed human performance in key visual challenges such as object recognition, but require a large amount of energy, computation, and memory. In contrast, spiking neural networks (SNNs) have the potential to improve both the efficiency and biological plausibility of object recognition systems. Here we present an SNN model that uses spike-latency coding and winner-take-all inhibition (WTA-I) to efficiently represent visual stimuli using multi-scale parallel processing. Mimicking neuronal response properties in early visual cortex, images were preprocessed with three different spatial frequency (SF) channels, before they were fed to a layer of spiking neurons whose synaptic weights were updated using spike-timing-dependent plasticity. We investigate how the quality of the represented objects changes under different SF bands and WTA-I schemes. We demonstrate that a network of 200 spiking neurons tuned to three SFs can efficiently represent objects with as few as 15 spikes per neuron. Studying how core object recognition may be implemented using biologically plausible learning rules in SNNs may not only further our understanding of the brain, but also lead to novel and efficient artificial vision systems.
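
Illustration (not part of the original record): the summary names three mechanisms, a spike-latency code, winner-take-all inhibition, and spike-timing-dependent plasticity. The following minimal Python sketch shows one common way these pieces can fit together. It is not the authors' implementation. The function names, the linear intensity-to-latency mapping, the non-leaky integration, the hard (single-winner) inhibition, and the multiplicative weight update with rates `a_plus` and `a_minus` are all assumptions chosen for brevity, and the spatial-frequency preprocessing is omitted entirely.

```python
import math


def latency_encode(intensities, t_max=1.0):
    """Map intensities in [0, 1] to spike times: stronger input fires earlier.

    Assumed linear mapping; zero intensity produces no spike (math.inf).
    """
    return [t_max * (1.0 - x) if x > 0 else math.inf for x in intensities]


def first_to_fire(spike_times, weights, threshold):
    """Integrate input spikes in temporal order and return (winner, t_post).

    weights[j][i] is the synapse from input i to neuron j. The first neuron
    whose potential reaches the threshold wins; under hard winner-take-all
    inhibition every other neuron is silenced, so integration stops there.
    Returns (None, math.inf) if no neuron reaches the threshold.
    """
    order = sorted((t, i) for i, t in enumerate(spike_times) if t != math.inf)
    potentials = [0.0] * len(weights)
    for t, i in order:
        for j, w_j in enumerate(weights):
            potentials[j] += w_j[i]
        best = max(range(len(weights)), key=lambda j: potentials[j])
        if potentials[best] >= threshold:
            return best, t
    return None, math.inf


def stdp_update(w, spike_times, t_post, a_plus=0.1, a_minus=0.05):
    """Simplified STDP for the winning neuron only.

    Inputs that fired at or before the postsynaptic spike are potentiated,
    all others are depressed. The factor w * (1 - w) keeps weights in (0, 1).
    """
    return [
        w_i + (a_plus if t <= t_post else -a_minus) * w_i * (1.0 - w_i)
        for w_i, t in zip(w, spike_times)
    ]


# Hypothetical toy input: four "pixels", two neurons.
times = latency_encode([1.0, 0.8, 0.0, 0.1])
weights = [[0.5, 0.5, 0.5, 0.5], [0.9, 0.9, 0.1, 0.1]]
winner, t_post = first_to_fire(times, weights, threshold=1.5)
if winner is not None:
    weights[winner] = stdp_update(weights[winner], times, t_post)
```

In this toy run the second neuron reaches threshold after the two earliest input spikes, so only its weights are updated: the synapses from the two early inputs grow and the other two shrink. Softer inhibition schemes, which the summary indicates the paper compares, would let more than one neuron fire per stimulus.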

MSC:

92B20

Neural networks for/in biological studies, artificial life and related topics

Keywords:

spiking neural networks; spike-timing-dependent-plasticity; multi-scale processing; spike-latency code; winner-take-all inhibition

Software:

DeepID3; CIFAR; Fashion-MNIST; MNIST


