Gradient-free variational learning with conditional mixture networks

Authors: Conor Heins, Hao Wu, Dimitrije Markovic, Alexander Tschantz, Jeff Beck, Christopher Buckley

Abstract: Balancing computational efficiency with robust predictive performance is crucial in supervised learning, especially for critical applications. Standard deep learning models, while accurate and scalable, often lack probabilistic features like calibrated predictions and uncertainty quantification. Bayesian methods address these issues but can be computationally expensive as model and data complexity increase. Previous work shows that fast variational methods can reduce the compute requirements of Bayesian methods by eliminating the need for gradient computation or sampling, but are often limited to simple models. We demonstrate that conditional mixture networks (CMNs), a probabilistic variant of the mixture-of-experts (MoE) model, are suitable for fast, gradient-free inference and can solve complex classification tasks. CMNs employ linear experts and a softmax gating network. By exploiting conditional conjugacy and Pólya-Gamma augmentation, we furnish Gaussian likelihoods for the weights of both the linear experts and the gating network. This enables efficient variational updates using coordinate ascent variational inference (CAVI), avoiding traditional gradient-based optimization. We validate this approach by training two-layer CMNs on standard benchmarks from the UCI repository. Our method, CAVI-CMN, achieves competitive and often superior predictive accuracy compared to maximum likelihood estimation (MLE) with backpropagation, while maintaining competitive runtime and full posterior distributions over all model parameters. Moreover, as input size or the number of experts increases, computation time scales competitively with MLE and other gradient-based solutions like black-box variational inference (BBVI), making CAVI-CMN a promising tool for deep, fast, and gradient-free Bayesian networks.

Submitted 29 August, 2024; originally announced August 2024.

Comments: 16 pages main text (3 figures), including references. 9 pages supplementary material (5 figures)

arXiv:2407.20292 [pdf]

From pixels to planning: scale-free active inference

Authors: Karl Friston, Conor Heins, Tim Verbelen, Lancelot Da Costa, Tommaso Salvatori, Dimitrije Markovic, Alexander Tschantz, Magnus Koudahl, Christopher Buckley, Thomas Parr

Abstract: This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalising generative models (RGM) can be regarded as discrete homologues of deep convolutional neural networks or continuous state-space models in generalised coordinates of motion. By construction, these scale-invariant models can be used to learn compositionality over space and time, furnishing models of paths or orbits; i.e., events of increasing temporal depth and itinerancy. This technical note illustrates the automatic discovery, learning and deployment of RGMs using a series of applications. We start with image classification and then consider the compression and generation of movies and music. Finally, we apply the same variational principles to the learning of Atari-like games.

Submitted 27 July, 2024; originally announced July 2024.

Comments: 64 pages, 28 figures

MSC Class: 92 ACM Class: F.1.1

arXiv:2407.13083 [pdf, other]

Modeling and Driving Human Body Soundfields through Acoustic Primitives

Authors: Chao Huang, Dejan Markovic, Chenliang Xu, Alexander Richard

Abstract: While rendering and animation of photorealistic 3D human body models have matured and reached an impressive quality over the past years, modeling the spatial audio associated with such full body models has been largely ignored so far. In this work, we present a framework that allows for high-quality spatial audio generation, capable of rendering the full 3D soundfield generated by a human body, including speech, footsteps, hand-body interactions, and others. Given a basic audio-visual representation of the body in form of 3D body pose and audio from a head-mounted microphone, we demonstrate that we can render the full acoustic scene at any point in 3D space efficiently and accurately. To enable near-field and realtime rendering of sound, we borrow the idea of volumetric primitives from graphical neural rendering and transfer them into the acoustic domain. Our acoustic primitives result in an order of magnitude smaller soundfield representations and overcome deficiencies in near-field rendering compared to previous approaches.

Submitted 20 July, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

Comments: ECCV 2024. Project Page: https://wikichao.github.io/Acoustic-Primitives/

arXiv:2406.03372 [pdf, other]

Training of Physical Neural Networks

Authors: Ali Momeni, Babak Rahmani, Benjamin Scellier, Logan G. Wright, Peter L. McMahon, Clara C. Wanjura, Yuhang Li, Anas Skalli, Natalia G. Berloff, Tatsuhiro Onodera, Ilker Oguz, Francesco Morichetti, Philipp del Hougne, Manuel Le Gallo, Abu Sebastian, Azalia Mirhoseini, Cheng Zhang, Danijela Marković, Daniel Brunner, Christophe Moser, Sylvain Gigan, Florian Marquardt, Aydogan Ozcan, Julie Grollier, Andrea J. Liu, et al. (3 additional authors not shown)

Abstract: Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated important opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also have them perform inference locally and privately on edge devices, such as smartphones or sensors? Research over the past few years has shown that the answer to all these questions is likely "yes, with enough research": PNNs could one day radically change what is possible and practical for AI systems. To do this will however require rethinking both how AI models work, and how they are trained - primarily by considering the problems through the constraints of the underlying hardware physics. To train PNNs at large scale, many methods including backpropagation-based and backpropagation-free approaches are now being explored. These methods have various trade-offs, and so far no method has been shown to scale to the same scale and performance as the backpropagation algorithm widely used in deep learning today. However, this is rapidly changing, and a diverse ecosystem of training techniques provides clues for how PNNs may one day be utilized to create both more efficient realizations of current-scale AI models, and to enable unprecedented-scale models.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 29 pages, 4 figures

arXiv:2402.14460 [pdf, ps, other]

Reframing the Expected Free Energy: Four Formulations and a Unification

Authors: Théophile Champion, Howard Bowman, Dimitrije Marković, Marek Grześ

Abstract: Active inference is a leading theory of perception, learning and decision making, which can be applied to neuroscience, robotics, psychology, and machine learning. Active inference is based on the expected free energy, which is mostly justified by the intuitive plausibility of its formulations, e.g., the risk plus ambiguity and information gain / pragmatic value formulations. This paper seeks to formalize the problem of deriving these formulations from a single root expected free energy definition, i.e., the unification problem. Then, we study two settings, each one having its own root expected free energy definition. In the first setting, no justification for the expected free energy has been proposed to date, but all the formulations can be recovered from it. However, in this setting, the agent cannot have arbitrary prior preferences over observations. Indeed, only a limited class of prior preferences over observations is compatible with the likelihood mapping of the generative model. In the second setting, a justification of the root expected free energy definition is known, but this setting only accounts for two formulations, i.e., the risk over states plus ambiguity and entropy plus expected energy formulations.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 17 pages, 2 figures

arXiv:2312.11186 [pdf, ps, other]

An epistemic logic for modeling decisions in the context of incomplete knowledge

Authors: Đorđe Marković, Simon Vandevelde, Linde Vanbesien, Joost Vennekens, Marc Denecker

Abstract: Substantial efforts have been made in developing various Decision Modeling formalisms, both from industry and academia. A challenging problem is that of expressing decision knowledge in the context of incomplete knowledge. In such contexts, decisions depend on what is known or not known. We argue that none of the existing formalisms for modeling decisions are capable of correctly capturing the epistemic nature of such decisions, inevitably causing issues in situations of uncertainty. This paper presents a new language for modeling decisions with incomplete knowledge. It combines three principles: stratification, autoepistemic logic, and definitions. A knowledge base in this language is a hierarchy of epistemic theories, where each component theory may epistemically reason on the knowledge in lower theories, and decisions are made using definitions with epistemic conditions.

Submitted 18 December, 2023; originally announced December 2023.

Comments: 9 pages, 3 figures, to be published as a poster version in the ACM/SIGAPP conference

arXiv:2311.10300 [pdf, other]

Supervised structure learning

Authors: Karl J. Friston, Lancelot Da Costa, Alexander Tschantz, Alex Kiefer, Tommaso Salvatori, Victorita Neacsu, Magnus Koudahl, Conor Heins, Noor Sajid, Dimitrije Markovic, Thomas Parr, Tim Verbelen, Christopher L Buckley

Abstract: This paper concerns structure learning or discovery of discrete generative models. It focuses on Bayesian model selection and the assimilation of training data or content, with a special emphasis on the order in which data are ingested. A key move - in the ensuing schemes - is to place priors on the selection of models, based upon expected free energy. In this setting, expected free energy reduces to a constrained mutual information, where the constraints inherit from priors over outcomes (i.e., preferred outcomes). The resulting scheme is first used to perform image classification on the MNIST dataset to illustrate the basic idea, and then tested on a more challenging problem of discovering models with dynamics, using a simple sprite-based visual disentanglement paradigm and the Tower of Hanoi (cf., blocks world) problem. In these examples, generative models are constructed autodidactically to recover (i.e., disentangle) the factorial structure of latent states - and their characteristic paths or dynamics.

Submitted 16 November, 2023; originally announced November 2023.

arXiv:2311.06285 [pdf, other]

Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio

Authors: Xudong Xu, Dejan Markovic, Jacob Sandakly, Todd Keebler, Steven Krenn, Alexander Richard

Abstract: While 3D human body modeling has received much attention in computer vision, modeling the acoustic equivalent, i.e. modeling 3D spatial audio produced by body motion and speech, has fallen short in the community. To close this gap, we present a model that can generate accurate 3D spatial audio for full human bodies. The system consumes, as input, audio signals from headset microphones and body pose, and produces, as output, a 3D sound field surrounding the transmitter's body, from which spatial audio can be rendered at any arbitrary position in the 3D space. We collect a first-of-its-kind multimodal dataset of human bodies, recorded with multiple cameras and a spherical array of 345 microphones. In an empirical evaluation, we demonstrate that our model can produce accurate body-induced sound fields when trained with a suitable loss. Dataset and code are available online.

Submitted 1 November, 2023; originally announced November 2023.

Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2309.12095 [pdf, other]

doi 10.1109/ACCESS.2024.3417219

Bayesian sparsification for deep neural networks with Bayesian model reduction

Authors: Dimitrije Marković, Karl J. Friston, Stefan J. Kiebel

Abstract: Deep learning's immense capabilities are often constrained by the complexity of its models, leading to an increasing demand for effective sparsification techniques. Bayesian sparsification for deep learning emerges as a crucial approach, facilitating the design of models that are both computationally efficient and competitive in terms of performance across various deep learning applications. The state-of-the-art -- in Bayesian sparsification of deep neural networks -- combines structural shrinkage priors on model weights with an approximate inference scheme based on stochastic variational inference. However, model inversion of the full generative model is exceptionally computationally demanding, especially when compared to standard deep learning of point estimates. In this context, we advocate for the use of Bayesian model reduction (BMR) as a more efficient alternative for pruning of model weights. As a generalization of the Savage-Dickey ratio, BMR allows a post-hoc elimination of redundant model weights based on the posterior estimates under a straightforward (non-hierarchical) generative model. Our comparative study highlights the advantages of the BMR method relative to established approaches based on hierarchical horseshoe priors over model weights. We illustrate the potential of BMR across various deep learning architectures, from classical networks like LeNet to modern frameworks such as Vision Transformers and MLP-Mixers.

Submitted 27 October, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

MSC Class: 68T07

arXiv:2305.18321 [pdf, other]

Training an Ising Machine with Equilibrium Propagation

Authors: Jérémie Laydevant, Danijela Markovic, Julie Grollier

Abstract: Ising machines, which are hardware implementations of the Ising model of coupled spins, have been influential in the development of unsupervised learning algorithms at the origins of Artificial Intelligence (AI). However, their application to AI has been limited due to the complexities in matching supervised training methods with Ising machine physics, even though these methods are essential for achieving high accuracy. In this study, we demonstrate a novel approach to train Ising machines in a supervised way through the Equilibrium Propagation algorithm, achieving comparable results to software-based implementations. We employ the quantum annealing procedure of the D-Wave Ising machine to train a fully-connected neural network on the MNIST dataset. Furthermore, we demonstrate that the machine's connectivity supports convolution operations, enabling the training of a compact convolutional network with minimal spins per neuron. Our findings establish Ising machines as a promising trainable hardware platform for AI, with the potential to enhance machine learning applications.

Submitted 22 May, 2023; originally announced May 2023.

arXiv:2211.03659 [pdf]

Multilayer spintronic neural networks with radio-frequency connections

Authors: Andrew Ross, Nathan Leroux, Arnaud de Riz, Danijela Marković, Dédalo Sanz-Hernández, Juan Trastoy, Paolo Bortolotti, Damien Querlioz, Leandro Martins, Luana Benetti, Marcel S. Claro, Pedro Anacleto, Alejandro Schulman, Thierry Taris, Jean-Baptiste Begueret, Sylvain Saïghi, Alex S. Jenkins, Ricardo Ferreira, Adrien F. Vincent, Alice Mizrahi, Julie Grollier

Abstract: Spintronic nano-synapses and nano-neurons perform complex cognitive computations with high accuracy thanks to their rich, reproducible and controllable magnetization dynamics. These dynamical nanodevices could transform artificial intelligence hardware, provided that they implement state-of-the-art deep neural networks. However, there is today no scalable way to connect them in multilayers. Here we show that the flagship nano-components of spintronics, magnetic tunnel junctions, can be connected into multilayer neural networks where they implement both synapses and neurons thanks to their magnetization dynamics, and communicate by processing, transmitting and receiving radio frequency (RF) signals. We build a hardware spintronic neural network composed of nine magnetic tunnel junctions connected in two layers, and show that it natively classifies nonlinearly-separable RF inputs with an accuracy of 97.7%. Using physical simulations, we demonstrate that a large network of nanoscale junctions can achieve state-of-the-art identification of drones from their RF transmissions, without digitization, and consuming only a few milliwatts, which is a gain of more than four orders of magnitude in power consumption compared to currently used techniques. This study lays the foundation for deep, dynamical, spintronic neural networks.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2211.01131 [pdf]

Classification of multi-frequency RF signals by extreme learning, using magnetic tunnel junctions as neurons and synapses

Authors: Nathan Leroux, Danijela Marković, Dédalo Sanz-Hernández, Juan Trastoy, Paolo Bortolotti, Alejandro Schulman, Luana Benetti, Alex Jenkins, Ricardo Ferreira, Julie Grollier, Alice Mizrahi

Abstract: Extracting information from radiofrequency (RF) signals using artificial neural networks at low energy cost is a critical need for a wide range of applications from radars to health. These RF inputs are composed of multiple frequencies. Here we show that magnetic tunnel junctions can process analogue RF inputs with multiple frequencies in parallel and perform synaptic operations. Using a backpropagation-free method called extreme learning, we classify noisy images encoded by RF signals, using experimental data from magnetic tunnel junctions functioning as both synapses and neurons. We achieve the same accuracy as an equivalent software neural network. These results are a key step for embedded radiofrequency artificial intelligence.

Submitted 20 April, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

Comments: 9 pages, 5 figures

arXiv:2209.04473 [pdf, other]

Reconstructing the Dynamic Directivity of Unconstrained Speech

Authors: Camille Noufi, Dejan Markovic, Peter Dodds

Abstract: This article presents a method for estimating and reconstructing the spatial energy distribution pattern of natural speech, which is crucial for achieving realistic vocal presence in virtual communication settings. The method comprises two stages. First, recordings of speech captured by a real, static microphone array are used to create an egocentric virtual array that tracks the movement of the speaker over time. This virtual array is used to measure and encode the high-resolution directivity pattern of the speech signal as it evolves dynamically with natural speech and movement. In the second stage, the encoded directivity representation is utilized to train a machine learning model that can estimate the full, dynamic directivity pattern given a limited set of speech signals, such as those recorded using the microphones on a head-mounted display. Our results show that neural networks can accurately estimate the full directivity pattern of natural, unconstrained speech from limited information. The proposed method for estimating and reconstructing the spatial energy distribution pattern of natural speech, along with the evaluation of various machine learning models and training paradigms, provides an important contribution to the development of realistic vocal presence in virtual communication settings.

Submitted 5 September, 2023; v1 submitted 9 September, 2022; originally announced September 2022.

Comments: In proceedings of I3DA 2023 - The 2023 International Conference on Immersive and 3D Audio. DOI coming soon

arXiv:2207.03697 [pdf, other]

End-to-End Binaural Speech Synthesis

Authors: Wen Chin Huang, Dejan Markovic, Alexander Richard, Israel Dejene Gebru, Anjali Menon

Abstract: In this work, we present an end-to-end binaural speech synthesis system that combines a low-bitrate audio codec with a powerful binaural decoder that is capable of accurate speech binauralization while faithfully reconstructing environmental factors like ambient noise or reverb. The network is a modified vector-quantized variational autoencoder, trained with several carefully designed objectives, including an adversarial loss. We evaluate the proposed system on an internal binaural dataset with objective metrics and a perceptual study. Results show that the proposed approach matches the ground truth data more closely than previous methods. In particular, we demonstrate the capability of the adversarial loss in capturing environment effects needed to create an authentic auditory scene.

Submitted 8 July, 2022; originally announced July 2022.

Comments: Accepted to INTERSPEECH 2022. Demo link: https://unilight.github.io/Publication-Demos/publications/e2e-binaural-synthesis

arXiv:2206.15423 [pdf, other]

Implicit Neural Spatial Filtering for Multichannel Source Separation in the Waveform Domain

Authors: Dejan Markovic, Alexandre Defossez, Alexander Richard

Abstract: We present a single-stage causal waveform-to-waveform multichannel model that can separate moving sound sources based on their broad spatial locations in a dynamic acoustic scene. We divide the scene into two spatial regions containing, respectively, the target and the interfering sound sources. The model is trained end-to-end and performs spatial processing implicitly, without any components based on traditional processing or use of hand-crafted spatial features. We evaluate the proposed model on a real-world dataset and show that the model matches the performance of an oracle beamformer followed by a state-of-the-art single-channel enhancement network.

Submitted 30 June, 2022; originally announced June 2022.

Comments: Interspeech 2022

arXiv:2203.17263 [pdf, other]

Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis

Authors: Karren Yang, Dejan Markovic, Steven Krenn, Vasu Agrawal, Alexander Richard

Abstract: Since facial actions such as lip movements contain significant information about speech content, it is not surprising that audio-visual speech enhancement methods are more accurate than their audio-only counterparts. Yet, state-of-the-art approaches still struggle to generate clean, realistic speech without noise artifacts and unnatural distortions in challenging acoustic environments. In this paper, we propose a novel audio-visual speech enhancement framework for high-fidelity telecommunications in AR/VR. Our approach leverages audio-visual speech cues to generate the codes of a neural speech codec, enabling efficient synthesis of clean, realistic speech from noisy signals. Given the importance of speaker-specific cues in speech, we focus on developing personalized models that work well for individual speakers. We demonstrate the efficacy of our approach on a new audio-visual speech dataset collected in an unconstrained, large vocabulary setting, as well as existing audio-visual datasets, outperforming speech enhancement baselines on both quantitative metrics and human evaluation studies. Please see the supplemental video for qualitative results at https://github.com/facebookresearch/facestar/releases/download/paper_materials/video.mp4.

Submitted 31 March, 2022; originally announced March 2022.

arXiv:2111.04961 [pdf]

Convolutional Neural Networks with Radio-Frequency Spintronic Nano-Devices

Authors: Nathan Leroux, Arnaud De Riz, Dédalo Sanz-Hernández, Danijela Marković, Alice Mizrahi, Julie Grollier

Abstract: Convolutional neural networks are state-of-the-art and ubiquitous in modern signal processing and machine vision. Nowadays, hardware solutions based on emerging nanodevices are designed to reduce the power consumption of these networks. Spintronics devices are promising for information processing because of the various neural and synaptic functionalities they offer. However, due to their low OFF/ON ratio, performing all the multiplications required for convolutions in a single step with a crossbar array of spintronic memories would cause sneak-path currents. Here we present an architecture where synaptic communications have a frequency selectivity that prevents crosstalk caused by sneak-path currents. We first demonstrate how a chain of spintronic resonators can function as synapses and make convolutions by sequentially rectifying radio-frequency signals encoding consecutive sets of inputs. We show that a parallel implementation is possible with multiple chains of spintronic resonators to avoid storing intermediate computational steps in memory. We propose two different spatial arrangements for these chains. For each of them, we explain how to tune many artificial synapses simultaneously, exploiting the synaptic weight sharing specific to convolutions. We show how information can be transmitted between convolutional layers by using spintronic oscillators as artificial microwave neurons. Finally, we simulate a network of these radio-frequency resonators and spintronic oscillators to solve the MNIST handwritten digits dataset, and obtain results comparable to software convolutional neural networks. Since it can run convolutional neural networks fully in parallel in a single step with nano devices, the architecture proposed in this paper is promising for embedded applications requiring machine vision, such as autonomous driving.

Submitted 9 November, 2021; originally announced November 2021.

arXiv:2110.06737 [pdf, other]

doi 10.1103/PhysRevB.105.014411

Easy-plane spin Hall nano-oscillators as spiking neurons for neuromorphic computing

Authors: Danijela Marković, Matthew W. Daniels, Pankaj Sethi, Andrew D. Kent, Mark D. Stiles, Julie Grollier

Abstract: We show analytically using a macrospin approximation that easy-plane spin Hall nano-oscillators excited by a spin-current polarized perpendicularly to the easy-plane have phase dynamics analogous to that of Josephson junctions. Similarly to Josephson junctions, they can reproduce the spiking behavior of biological neurons that is appropriate for neuromorphic computing. We perform micromagnetic simulations of such oscillators realized in the nano-constriction geometry and show that the easy-plane spiking dynamics is preserved in an experimentally feasible architecture. Finally we simulate two elementary neural network blocks that implement operations essential for neuromorphic computing. First, we show that output spikes energies from two neurons can be summed and injected into a following layer neuron and second, we demonstrate that outputs can be multiplied by synaptic weights implemented by locally modifying the anisotropy.

Submitted 13 October, 2021; originally announced October 2021.

Comments: 9 pages, 11 figures

arXiv:2105.05956 [pdf]

doi 10.1088/2634-4386/ac4a83

2022 Roadmap on Neuromorphic Computing and Engineering

Authors: Dennis V. Christensen, Regina Dittmann, Bernabé Linares-Barranco, Abu Sebastian, Manuel Le Gallo, Andrea Redaelli, Stefan Slesazeck, Thomas Mikolajick, Sabina Spiga, Stephan Menzel, Ilia Valov, Gianluca Milano, Carlo Ricciardi, Shi-Jun Liang, Feng Miao, Mario Lanza, Tyler J. Quill, Scott T. Keene, Alberto Salleo, Julie Grollier, Danijela Marković, Alice Mizrahi, Peng Yao, J. Joshua Yang, Giacomo Indiveri , et al. (34 additional authors not shown)

Abstract: Modern computation based on the von Neumann architecture is today a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale with 10^18 calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices. The aim of this Roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics.
The Roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges. We hope that this Roadmap will be a useful resource to readers outside this field, for those who are just entering the field, and for those who are well established in the neuromorphic community. https://doi.org/10.1088/2634-4386/ac4a83

Submitted 13 January, 2022; v1 submitted 12 May, 2021; originally announced May 2021.

Journal ref: Neuromorph. Comput. Eng. 2 022501 (2022)

arXiv:2101.08699 [pdf, other]

doi 10.1016/j.neunet.2021.08.018

An empirical evaluation of active inference in multi-armed bandits

Authors: Dimitrije Markovic, Hrvoje Stojic, Sarah Schwoebel, Stefan J. Kiebel

Abstract: A key feature of sequential decision making under uncertainty is a need to balance between exploiting--choosing the best action according to the current knowledge, and exploring--obtaining information about values of other actions. The multi-armed bandit problem, a classical task that captures this trade-off, served as a vehicle in machine learning for developing bandit algorithms that proved to be useful in numerous industrial applications. The active inference framework, an approach to sequential decision making recently developed in neuroscience for understanding human and animal behaviour, is distinguished by its sophisticated strategy for resolving the exploration-exploitation trade-off. This makes active inference an exciting alternative to already established bandit algorithms. Here we derive an efficient and scalable approximate active inference algorithm and compare it to two state-of-the-art bandit algorithms: Bayesian upper confidence bound and optimistic Thompson sampling. This comparison is done on two types of bandit problems: a stationary and a dynamic switching bandit. Our empirical evaluation shows that the active inference algorithm does not produce efficient long-term behaviour in stationary bandits. However, in the more challenging switching bandit problem active inference performs substantially better than the two state-of-the-art bandit algorithms.
The results open exciting venues for further research in theoretical and applied machine learning, as well as lend additional credibility to active inference as a general framework for studying human and animal behaviour.

Submitted 4 August, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

arXiv:2004.00930 [pdf, ps, other]

Neuronal Sequence Models for Bayesian Online Inference

Authors: Sascha Frölich, Dimitrije Marković, Stefan J. Kiebel

Abstract: Sequential neuronal activity underlies a wide range of processes in the brain. Neuroscientific evidence for neuronal sequences has been reported in domains as diverse as perception, motor control, speech, spatial navigation and memory. Consequently, different dynamical principles have been proposed as possible sequence-generating mechanisms. Combining experimental findings with computational concepts like the Bayesian brain hypothesis and predictive coding leads to the interesting possibility that predictive and inferential processes in the brain are grounded on generative processes which maintain a sequential structure. While probabilistic inference about ongoing sequences is a useful computational model for both the analysis of neuroscientific data and a wide range of problems in artificial recognition and motor control, research on the subject is relatively scarce and distributed over different fields in the neurosciences. Here we review key findings about neuronal sequences and relate these to the concept of online inference on sequences as a model of sensory-motor processing and recognition. We propose that describing sequential neuronal activity as an expression of probabilistic inference over sequences may lead to novel perspectives on brain function. Importantly, it is promising to translate the key idea of probabilistic inference on sequences to machine learning, in order to address challenges in the real-time recognition of speech and human motion.

Submitted 2 April, 2020; originally announced April 2020.

arXiv:2003.04711 [pdf]

Physics for Neuromorphic Computing

Authors: Danijela Markovic, Alice Mizrahi, Damien Querlioz, Julie Grollier

Abstract: Neuromorphic computing takes inspiration from the brain to create energy efficient hardware for information processing, capable of highly sophisticated tasks. In this article, we make the case that building this new hardware necessitates reinventing electronics. We show that research in physics and material science will be key to create artificial nano-neurons and synapses, to connect them together in huge numbers, to organize them in complex systems, and to compute with them efficiently. We describe how some researchers choose to take inspiration from artificial intelligence to move forward in this direction, whereas others prefer taking inspiration from neuroscience, and we highlight recent striking results obtained with these two approaches. Finally, we discuss the challenges and perspectives in neuromorphic physics, which include developing the algorithms and the hardware hand in hand, making significant advances with small toy systems, as well as building large scale networks.

Submitted 8 March, 2020; originally announced March 2020.

arXiv:1610.08450 [pdf, other]

Body movement to sound interface with vector autoregressive hierarchical hidden Markov models

Authors: Dimitrije Marković, Borjana Valčić, Nebojša Malešević

Abstract: Interfacing a kinetic action of a person to an action of a machine system is an important research topic in many application areas. One of the key factors for intimate human-machine interaction is the ability of the control algorithm to detect and classify different user commands with shortest possible latency, thus making a highly correlated link between cause and effect. In our research, we focused on the task of mapping user kinematic actions into sound samples. The presented methodology relies on the wireless sensor nodes equipped with inertial measurement units and the real-time algorithm dedicated for early detection and classification of a variety of movements/gestures performed by a user. The core algorithm is based on the approximate Bayesian inference of Vector Autoregressive Hierarchical Hidden Markov Models (VAR-HHMM), where models database is derived from the set of motion gestures. The performance of the algorithm was compared with an online version of the K-nearest neighbours (KNN) algorithm, where we used offline expert based classification as the benchmark. In almost all of the evaluation metrics (e.g. confusion matrix, recall and precision scores) the VAR-HHMM algorithm outperformed KNN. Furthermore, the VAR-HHMM algorithm, in some cases, achieved faster movement onset detection compared with the offline standard. The proposed concept, although envisioned for movement-to-sound application, could be implemented in other human-machine interfaces.

Submitted 26 October, 2016; originally announced October 2016.

Comments: 12 pages, 7 figures, a pre-submission draft

ACM Class: F.1.2; G.3; H.1.2; H.5.1; H.5.2; I.1.4; I.2.9

arXiv:1111.6849 [pdf, ps, other]

doi 10.1140/epjb/e2011-20581-3

Neuropsychological constraints to human data production on a global scale

Authors: Claudius Gros, Gregor Kaczor, Dimitrije Markovic

Abstract: Which are the factors underlying human information production on a global level? In order to gain an insight into this question we study a corpus of 252-633 Million publicly available data files on the Internet corresponding to an overall storage volume of 284-675 Terabytes. Analyzing the file size distribution for several distinct data types we find indications that the neuropsychological capacity of the human brain to process and record information may constitute the dominant limiting factor for the overall growth of globally stored information, with real-world economic constraints having only a negligible influence. This supposition draws support from the observation that the files size distributions follow a power law for data without a time component, like images, and a log-normal distribution for multimedia files, for which time is a defining qualia.

Submitted 27 November, 2011; originally announced November 2011.

Comments: to be published in: European Physical Journal B

Journal ref: European Physical Journal B, 85: 28 (2012)

Showing 1–24 of 24 results for author: Marković, D