-
Gradient-free variational learning with conditional mixture networks
Authors:
Conor Heins,
Hao Wu,
Dimitrije Markovic,
Alexander Tschantz,
Jeff Beck,
Christopher Buckley
Abstract:
Balancing computational efficiency with robust predictive performance is crucial in supervised learning, especially for critical applications. Standard deep learning models, while accurate and scalable, often lack probabilistic features like calibrated predictions and uncertainty quantification. Bayesian methods address these issues but can be computationally expensive as model and data complexity increase. Previous work shows that fast variational methods can reduce the compute requirements of Bayesian methods by eliminating the need for gradient computation or sampling, but are often limited to simple models. We demonstrate that conditional mixture networks (CMNs), a probabilistic variant of the mixture-of-experts (MoE) model, are suitable for fast, gradient-free inference and can solve complex classification tasks. CMNs employ linear experts and a softmax gating network. By exploiting conditional conjugacy and Pólya-Gamma augmentation, we furnish Gaussian likelihoods for the weights of both the linear experts and the gating network. This enables efficient variational updates using coordinate ascent variational inference (CAVI), avoiding traditional gradient-based optimization. We validate this approach by training two-layer CMNs on standard benchmarks from the UCI repository. Our method, CAVI-CMN, achieves competitive and often superior predictive accuracy compared to maximum likelihood estimation (MLE) with backpropagation, while maintaining competitive runtime and full posterior distributions over all model parameters. Moreover, as input size or the number of experts increases, computation time scales competitively with MLE and other gradient-based solutions like black-box variational inference (BBVI), making CAVI-CMN a promising tool for deep, fast, and gradient-free Bayesian networks.
Submitted 29 August, 2024;
originally announced August 2024.
-
From pixels to planning: scale-free active inference
Authors:
Karl Friston,
Conor Heins,
Tim Verbelen,
Lancelot Da Costa,
Tommaso Salvatori,
Dimitrije Markovic,
Alexander Tschantz,
Magnus Koudahl,
Christopher Buckley,
Thomas Parr
Abstract:
This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalising generative models (RGM) can be regarded as discrete homologues of deep convolutional neural networks or continuous state-space models in generalised coordinates of motion. By construction, these scale-invariant models can be used to learn compositionality over space and time, furnishing models of paths or orbits; i.e., events of increasing temporal depth and itinerancy. This technical note illustrates the automatic discovery, learning and deployment of RGMs using a series of applications. We start with image classification and then consider the compression and generation of movies and music. Finally, we apply the same variational principles to the learning of Atari-like games.
Submitted 27 July, 2024;
originally announced July 2024.
-
Modeling and Driving Human Body Soundfields through Acoustic Primitives
Authors:
Chao Huang,
Dejan Markovic,
Chenliang Xu,
Alexander Richard
Abstract:
While rendering and animation of photorealistic 3D human body models have matured and reached an impressive quality over the past years, modeling the spatial audio associated with such full body models has been largely ignored so far. In this work, we present a framework that allows for high-quality spatial audio generation, capable of rendering the full 3D soundfield generated by a human body, including speech, footsteps, hand-body interactions, and others. Given a basic audio-visual representation of the body in the form of 3D body pose and audio from a head-mounted microphone, we demonstrate that we can render the full acoustic scene at any point in 3D space efficiently and accurately. To enable near-field and real-time rendering of sound, we borrow the idea of volumetric primitives from graphical neural rendering and transfer them into the acoustic domain. Our acoustic primitives result in an order of magnitude smaller soundfield representations and overcome deficiencies in near-field rendering compared to previous approaches.
Submitted 20 July, 2024; v1 submitted 17 July, 2024;
originally announced July 2024.
-
Training of Physical Neural Networks
Authors:
Ali Momeni,
Babak Rahmani,
Benjamin Scellier,
Logan G. Wright,
Peter L. McMahon,
Clara C. Wanjura,
Yuhang Li,
Anas Skalli,
Natalia G. Berloff,
Tatsuhiro Onodera,
Ilker Oguz,
Francesco Morichetti,
Philipp del Hougne,
Manuel Le Gallo,
Abu Sebastian,
Azalia Mirhoseini,
Cheng Zhang,
Danijela Marković,
Daniel Brunner,
Christophe Moser,
Sylvain Gigan,
Florian Marquardt,
Aydogan Ozcan,
Julie Grollier,
Andrea J. Liu,
et al. (3 additional authors not shown)
Abstract:
Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated important opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also have them perform inference locally and privately on edge devices, such as smartphones or sensors? Research over the past few years has shown that the answer to all these questions is likely "yes, with enough research": PNNs could one day radically change what is possible and practical for AI systems. To do this will however require rethinking both how AI models work, and how they are trained - primarily by considering the problems through the constraints of the underlying hardware physics. To train PNNs at large scale, many methods including backpropagation-based and backpropagation-free approaches are now being explored. These methods have various trade-offs, and so far no method has been shown to scale to the same scale and performance as the backpropagation algorithm widely used in deep learning today. However, this is rapidly changing, and a diverse ecosystem of training techniques provides clues for how PNNs may one day be utilized to create both more efficient realizations of current-scale AI models, and to enable unprecedented-scale models.
Submitted 5 June, 2024;
originally announced June 2024.
-
Reframing the Expected Free Energy: Four Formulations and a Unification
Authors:
Théophile Champion,
Howard Bowman,
Dimitrije Marković,
Marek Grześ
Abstract:
Active inference is a leading theory of perception, learning and decision making, which can be applied to neuroscience, robotics, psychology, and machine learning. Active inference is based on the expected free energy, which is mostly justified by the intuitive plausibility of its formulations, e.g., the risk plus ambiguity and information gain / pragmatic value formulations. This paper seeks to formalize the problem of deriving these formulations from a single root expected free energy definition, i.e., the unification problem. Then, we study two settings, each one having its own root expected free energy definition. In the first setting, no justification for the expected free energy has been proposed to date, but all the formulations can be recovered from it. However, in this setting, the agent cannot have arbitrary prior preferences over observations. Indeed, only a limited class of prior preferences over observations is compatible with the likelihood mapping of the generative model. In the second setting, a justification of the root expected free energy definition is known, but this setting only accounts for two formulations, i.e., the risk over states plus ambiguity and entropy plus expected energy formulations.
Submitted 22 February, 2024;
originally announced February 2024.
-
An epistemic logic for modeling decisions in the context of incomplete knowledge
Authors:
Đorđe Marković,
Simon Vandevelde,
Linde Vanbesien,
Joost Vennekens,
Marc Denecker
Abstract:
Substantial efforts have been made in developing various Decision Modeling formalisms, both from industry and academia. A challenging problem is that of expressing decision knowledge in the context of incomplete knowledge. In such contexts, decisions depend on what is known or not known. We argue that none of the existing formalisms for modeling decisions are capable of correctly capturing the epistemic nature of such decisions, inevitably causing issues in situations of uncertainty. This paper presents a new language for modeling decisions with incomplete knowledge. It combines three principles: stratification, autoepistemic logic, and definitions. A knowledge base in this language is a hierarchy of epistemic theories, where each component theory may epistemically reason on the knowledge in lower theories, and decisions are made using definitions with epistemic conditions.
Submitted 18 December, 2023;
originally announced December 2023.
-
Supervised structure learning
Authors:
Karl J. Friston,
Lancelot Da Costa,
Alexander Tschantz,
Alex Kiefer,
Tommaso Salvatori,
Victorita Neacsu,
Magnus Koudahl,
Conor Heins,
Noor Sajid,
Dimitrije Markovic,
Thomas Parr,
Tim Verbelen,
Christopher L Buckley
Abstract:
This paper concerns structure learning or discovery of discrete generative models. It focuses on Bayesian model selection and the assimilation of training data or content, with a special emphasis on the order in which data are ingested. A key move - in the ensuing schemes - is to place priors on the selection of models, based upon expected free energy. In this setting, expected free energy reduces to a constrained mutual information, where the constraints inherit from priors over outcomes (i.e., preferred outcomes). The resulting scheme is first used to perform image classification on the MNIST dataset to illustrate the basic idea, and then tested on a more challenging problem of discovering models with dynamics, using a simple sprite-based visual disentanglement paradigm and the Tower of Hanoi (cf., blocks world) problem. In these examples, generative models are constructed autodidactically to recover (i.e., disentangle) the factorial structure of latent states - and their characteristic paths or dynamics.
Submitted 16 November, 2023;
originally announced November 2023.
-
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Authors:
Xudong Xu,
Dejan Markovic,
Jacob Sandakly,
Todd Keebler,
Steven Krenn,
Alexander Richard
Abstract:
While 3D human body modeling has received much attention in computer vision, modeling the acoustic equivalent, i.e. modeling 3D spatial audio produced by body motion and speech, has fallen short in the community. To close this gap, we present a model that can generate accurate 3D spatial audio for full human bodies. The system consumes, as input, audio signals from headset microphones and body pose, and produces, as output, a 3D sound field surrounding the transmitter's body, from which spatial audio can be rendered at any arbitrary position in the 3D space. We collect a first-of-its-kind multimodal dataset of human bodies, recorded with multiple cameras and a spherical array of 345 microphones. In an empirical evaluation, we demonstrate that our model can produce accurate body-induced sound fields when trained with a suitable loss. Dataset and code are available online.
Submitted 1 November, 2023;
originally announced November 2023.
-
Bayesian sparsification for deep neural networks with Bayesian model reduction
Authors:
Dimitrije Marković,
Karl J. Friston,
Stefan J. Kiebel
Abstract:
Deep learning's immense capabilities are often constrained by the complexity of its models, leading to an increasing demand for effective sparsification techniques. Bayesian sparsification for deep learning emerges as a crucial approach, facilitating the design of models that are both computationally efficient and competitive in terms of performance across various deep learning applications. The state-of-the-art -- in Bayesian sparsification of deep neural networks -- combines structural shrinkage priors on model weights with an approximate inference scheme based on stochastic variational inference. However, model inversion of the full generative model is exceptionally computationally demanding, especially when compared to standard deep learning of point estimates. In this context, we advocate for the use of Bayesian model reduction (BMR) as a more efficient alternative for pruning of model weights. As a generalization of the Savage-Dickey ratio, BMR allows a post-hoc elimination of redundant model weights based on the posterior estimates under a straightforward (non-hierarchical) generative model. Our comparative study highlights the advantages of the BMR method relative to established approaches based on hierarchical horseshoe priors over model weights. We illustrate the potential of BMR across various deep learning architectures, from classical networks like LeNet to modern frameworks such as Vision Transformers and MLP-Mixers.
Submitted 27 October, 2023; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Training an Ising Machine with Equilibrium Propagation
Authors:
Jérémie Laydevant,
Danijela Markovic,
Julie Grollier
Abstract:
Ising machines, which are hardware implementations of the Ising model of coupled spins, have been influential in the development of unsupervised learning algorithms at the origins of Artificial Intelligence (AI). However, their application to AI has been limited due to the complexities in matching supervised training methods with Ising machine physics, even though these methods are essential for achieving high accuracy. In this study, we demonstrate a novel approach to train Ising machines in a supervised way through the Equilibrium Propagation algorithm, achieving comparable results to software-based implementations. We employ the quantum annealing procedure of the D-Wave Ising machine to train a fully-connected neural network on the MNIST dataset. Furthermore, we demonstrate that the machine's connectivity supports convolution operations, enabling the training of a compact convolutional network with minimal spins per neuron. Our findings establish Ising machines as a promising trainable hardware platform for AI, with the potential to enhance machine learning applications.
Submitted 22 May, 2023;
originally announced May 2023.
-
Multilayer spintronic neural networks with radio-frequency connections
Authors:
Andrew Ross,
Nathan Leroux,
Arnaud de Riz,
Danijela Marković,
Dédalo Sanz-Hernández,
Juan Trastoy,
Paolo Bortolotti,
Damien Querlioz,
Leandro Martins,
Luana Benetti,
Marcel S. Claro,
Pedro Anacleto,
Alejandro Schulman,
Thierry Taris,
Jean-Baptiste Begueret,
Sylvain Saïghi,
Alex S. Jenkins,
Ricardo Ferreira,
Adrien F. Vincent,
Alice Mizrahi,
Julie Grollier
Abstract:
Spintronic nano-synapses and nano-neurons perform complex cognitive computations with high accuracy thanks to their rich, reproducible and controllable magnetization dynamics. These dynamical nanodevices could transform artificial intelligence hardware, provided that they implement state-of-the-art deep neural networks. However, there is today no scalable way to connect them in multilayers. Here we show that the flagship nano-components of spintronics, magnetic tunnel junctions, can be connected into multilayer neural networks where they implement both synapses and neurons thanks to their magnetization dynamics, and communicate by processing, transmitting and receiving radio frequency (RF) signals. We build a hardware spintronic neural network composed of nine magnetic tunnel junctions connected in two layers, and show that it natively classifies nonlinearly-separable RF inputs with an accuracy of 97.7%. Using physical simulations, we demonstrate that a large network of nanoscale junctions can achieve state-of-the-art identification of drones from their RF transmissions, without digitization, and consuming only a few milliwatts, which is a gain of more than four orders of magnitude in power consumption compared to currently used techniques. This study lays the foundation for deep, dynamical, spintronic neural networks.
Submitted 7 November, 2022;
originally announced November 2022.
-
Classification of multi-frequency RF signals by extreme learning, using magnetic tunnel junctions as neurons and synapses
Authors:
Nathan Leroux,
Danijela Marković,
Dédalo Sanz-Hernández,
Juan Trastoy,
Paolo Bortolotti,
Alejandro Schulman,
Luana Benetti,
Alex Jenkins,
Ricardo Ferreira,
Julie Grollier,
Alice Mizrahi
Abstract:
Extracting information from radiofrequency (RF) signals using artificial neural networks at low energy cost is a critical need for a wide range of applications from radars to health. These RF inputs are composed of multiple frequencies. Here we show that magnetic tunnel junctions can process analogue RF inputs with multiple frequencies in parallel and perform synaptic operations. Using a backpropagation-free method called extreme learning, we classify noisy images encoded by RF signals, using experimental data from magnetic tunnel junctions functioning as both synapses and neurons. We achieve the same accuracy as an equivalent software neural network. These results are a key step for embedded radiofrequency artificial intelligence.
Submitted 20 April, 2023; v1 submitted 2 November, 2022;
originally announced November 2022.
-
Reconstructing the Dynamic Directivity of Unconstrained Speech
Authors:
Camille Noufi,
Dejan Markovic,
Peter Dodds
Abstract:
This article presents a method for estimating and reconstructing the spatial energy distribution pattern of natural speech, which is crucial for achieving realistic vocal presence in virtual communication settings. The method comprises two stages. First, recordings of speech captured by a real, static microphone array are used to create an egocentric virtual array that tracks the movement of the speaker over time. This virtual array is used to measure and encode the high-resolution directivity pattern of the speech signal as it evolves dynamically with natural speech and movement. In the second stage, the encoded directivity representation is utilized to train a machine learning model that can estimate the full, dynamic directivity pattern given a limited set of speech signals, such as those recorded using the microphones on a head-mounted display. Our results show that neural networks can accurately estimate the full directivity pattern of natural, unconstrained speech from limited information. The proposed method for estimating and reconstructing the spatial energy distribution pattern of natural speech, along with the evaluation of various machine learning models and training paradigms, provides an important contribution to the development of realistic vocal presence in virtual communication settings.
Submitted 5 September, 2023; v1 submitted 9 September, 2022;
originally announced September 2022.
-
End-to-End Binaural Speech Synthesis
Authors:
Wen Chin Huang,
Dejan Markovic,
Alexander Richard,
Israel Dejene Gebru,
Anjali Menon
Abstract:
In this work, we present an end-to-end binaural speech synthesis system that combines a low-bitrate audio codec with a powerful binaural decoder that is capable of accurate speech binauralization while faithfully reconstructing environmental factors like ambient noise or reverb. The network is a modified vector-quantized variational autoencoder, trained with several carefully designed objectives, including an adversarial loss. We evaluate the proposed system on an internal binaural dataset with objective metrics and a perceptual study. Results show that the proposed approach matches the ground truth data more closely than previous methods. In particular, we demonstrate the capability of the adversarial loss in capturing environment effects needed to create an authentic auditory scene.
Submitted 8 July, 2022;
originally announced July 2022.
-
Implicit Neural Spatial Filtering for Multichannel Source Separation in the Waveform Domain
Authors:
Dejan Markovic,
Alexandre Defossez,
Alexander Richard
Abstract:
We present a single-stage causal waveform-to-waveform multichannel model that can separate moving sound sources based on their broad spatial locations in a dynamic acoustic scene. We divide the scene into two spatial regions containing, respectively, the target and the interfering sound sources. The model is trained end-to-end and performs spatial processing implicitly, without any components based on traditional processing or use of hand-crafted spatial features. We evaluate the proposed model on a real-world dataset and show that the model matches the performance of an oracle beamformer followed by a state-of-the-art single-channel enhancement network.
Submitted 30 June, 2022;
originally announced June 2022.
-
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Authors:
Karren Yang,
Dejan Markovic,
Steven Krenn,
Vasu Agrawal,
Alexander Richard
Abstract:
Since facial actions such as lip movements contain significant information about speech content, it is not surprising that audio-visual speech enhancement methods are more accurate than their audio-only counterparts. Yet, state-of-the-art approaches still struggle to generate clean, realistic speech without noise artifacts and unnatural distortions in challenging acoustic environments. In this paper, we propose a novel audio-visual speech enhancement framework for high-fidelity telecommunications in AR/VR. Our approach leverages audio-visual speech cues to generate the codes of a neural speech codec, enabling efficient synthesis of clean, realistic speech from noisy signals. Given the importance of speaker-specific cues in speech, we focus on developing personalized models that work well for individual speakers. We demonstrate the efficacy of our approach on a new audio-visual speech dataset collected in an unconstrained, large vocabulary setting, as well as existing audio-visual datasets, outperforming speech enhancement baselines on both quantitative metrics and human evaluation studies. Please see the supplemental video for qualitative results at https://github.com/facebookresearch/facestar/releases/download/paper_materials/video.mp4.
Submitted 31 March, 2022;
originally announced March 2022.
-
Convolutional Neural Networks with Radio-Frequency Spintronic Nano-Devices
Authors:
Nathan Leroux,
Arnaud De Riz,
Dédalo Sanz-Hernández,
Danijela Marković,
Alice Mizrahi,
Julie Grollier
Abstract:
Convolutional neural networks are state-of-the-art and ubiquitous in modern signal processing and machine vision. Nowadays, hardware solutions based on emerging nanodevices are designed to reduce the power consumption of these networks. Spintronics devices are promising for information processing because of the various neural and synaptic functionalities they offer. However, due to their low OFF/ON ratio, performing all the multiplications required for convolutions in a single step with a crossbar array of spintronic memories would cause sneak-path currents. Here we present an architecture where synaptic communications have a frequency selectivity that prevents crosstalk caused by sneak-path currents. We first demonstrate how a chain of spintronic resonators can function as synapses and make convolutions by sequentially rectifying radio-frequency signals encoding consecutive sets of inputs. We show that a parallel implementation is possible with multiple chains of spintronic resonators to avoid storing intermediate computational steps in memory. We propose two different spatial arrangements for these chains. For each of them, we explain how to tune many artificial synapses simultaneously, exploiting the synaptic weight sharing specific to convolutions. We show how information can be transmitted between convolutional layers by using spintronic oscillators as artificial microwave neurons. Finally, we simulate a network of these radio-frequency resonators and spintronic oscillators to solve the MNIST handwritten digits dataset, and obtain results comparable to software convolutional neural networks. Since it can run convolutional neural networks fully in parallel in a single step with nano devices, the architecture proposed in this paper is promising for embedded applications requiring machine vision, such as autonomous driving.
Submitted 9 November, 2021;
originally announced November 2021.
-
Easy-plane spin Hall nano-oscillators as spiking neurons for neuromorphic computing
Authors:
Danijela Marković,
Matthew W. Daniels,
Pankaj Sethi,
Andrew D. Kent,
Mark D. Stiles,
Julie Grollier
Abstract:
We show analytically using a macrospin approximation that easy-plane spin Hall nano-oscillators excited by a spin current polarized perpendicularly to the easy plane have phase dynamics analogous to that of Josephson junctions. Similarly to Josephson junctions, they can reproduce the spiking behavior of biological neurons that is appropriate for neuromorphic computing. We perform micromagnetic simulations of such oscillators realized in the nano-constriction geometry and show that the easy-plane spiking dynamics is preserved in an experimentally feasible architecture. Finally, we simulate two elementary neural network blocks that implement operations essential for neuromorphic computing. First, we show that the output spike energies from two neurons can be summed and injected into a neuron in the following layer; second, we demonstrate that outputs can be multiplied by synaptic weights implemented by locally modifying the anisotropy.
Submitted 13 October, 2021;
originally announced October 2021.
-
2022 Roadmap on Neuromorphic Computing and Engineering
Authors:
Dennis V. Christensen,
Regina Dittmann,
Bernabé Linares-Barranco,
Abu Sebastian,
Manuel Le Gallo,
Andrea Redaelli,
Stefan Slesazeck,
Thomas Mikolajick,
Sabina Spiga,
Stephan Menzel,
Ilia Valov,
Gianluca Milano,
Carlo Ricciardi,
Shi-Jun Liang,
Feng Miao,
Mario Lanza,
Tyler J. Quill,
Scott T. Keene,
Alberto Salleo,
Julie Grollier,
Danijela Marković,
Alice Mizrahi,
Peng Yao,
J. Joshua Yang,
Giacomo Indiveri
, et al. (34 additional authors not shown)
Abstract:
Modern computation based on the von Neumann architecture is today a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale with 10^18 calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices.
The aim of this Roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The Roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges. We hope that this Roadmap will be a useful resource to readers outside this field, for those who are just entering the field, and for those who are well established in the neuromorphic community.
https://doi.org/10.1088/2634-4386/ac4a83
Submitted 13 January, 2022; v1 submitted 12 May, 2021;
originally announced May 2021.
-
An empirical evaluation of active inference in multi-armed bandits
Authors:
Dimitrije Markovic,
Hrvoje Stojic,
Sarah Schwoebel,
Stefan J. Kiebel
Abstract:
A key feature of sequential decision making under uncertainty is a need to balance between exploiting--choosing the best action according to the current knowledge, and exploring--obtaining information about values of other actions. The multi-armed bandit problem, a classical task that captures this trade-off, served as a vehicle in machine learning for developing bandit algorithms that proved to be useful in numerous industrial applications. The active inference framework, an approach to sequential decision making recently developed in neuroscience for understanding human and animal behaviour, is distinguished by its sophisticated strategy for resolving the exploration-exploitation trade-off. This makes active inference an exciting alternative to already established bandit algorithms. Here we derive an efficient and scalable approximate active inference algorithm and compare it to two state-of-the-art bandit algorithms: Bayesian upper confidence bound and optimistic Thompson sampling. This comparison is done on two types of bandit problems: a stationary and a dynamic switching bandit. Our empirical evaluation shows that the active inference algorithm does not produce efficient long-term behaviour in stationary bandits. However, in the more challenging switching bandit problem active inference performs substantially better than the two state-of-the-art bandit algorithms. The results open exciting avenues for further research in theoretical and applied machine learning, as well as lend additional credibility to active inference as a general framework for studying human and animal behaviour.
Submitted 4 August, 2021; v1 submitted 21 January, 2021;
originally announced January 2021.
-
Neuronal Sequence Models for Bayesian Online Inference
Authors:
Sascha Frölich,
Dimitrije Marković,
Stefan J. Kiebel
Abstract:
Sequential neuronal activity underlies a wide range of processes in the brain. Neuroscientific evidence for neuronal sequences has been reported in domains as diverse as perception, motor control, speech, spatial navigation and memory. Consequently, different dynamical principles have been proposed as possible sequence-generating mechanisms. Combining experimental findings with computational concepts like the Bayesian brain hypothesis and predictive coding leads to the interesting possibility that predictive and inferential processes in the brain are grounded on generative processes which maintain a sequential structure. While probabilistic inference about ongoing sequences is a useful computational model for both the analysis of neuroscientific data and a wide range of problems in artificial recognition and motor control, research on the subject is relatively scarce and distributed over different fields in the neurosciences. Here we review key findings about neuronal sequences and relate these to the concept of online inference on sequences as a model of sensory-motor processing and recognition. We propose that describing sequential neuronal activity as an expression of probabilistic inference over sequences may lead to novel perspectives on brain function. Importantly, it is promising to translate the key idea of probabilistic inference on sequences to machine learning, in order to address challenges in the real-time recognition of speech and human motion.
Submitted 2 April, 2020;
originally announced April 2020.
-
Physics for Neuromorphic Computing
Authors:
Danijela Markovic,
Alice Mizrahi,
Damien Querlioz,
Julie Grollier
Abstract:
Neuromorphic computing takes inspiration from the brain to create energy efficient hardware for information processing, capable of highly sophisticated tasks. In this article, we make the case that building this new hardware necessitates reinventing electronics. We show that research in physics and material science will be key to create artificial nano-neurons and synapses, to connect them together in huge numbers, to organize them in complex systems, and to compute with them efficiently. We describe how some researchers choose to take inspiration from artificial intelligence to move forward in this direction, whereas others prefer taking inspiration from neuroscience, and we highlight recent striking results obtained with these two approaches. Finally, we discuss the challenges and perspectives in neuromorphic physics, which include developing the algorithms and the hardware hand in hand, making significant advances with small toy systems, as well as building large scale networks.
Submitted 8 March, 2020;
originally announced March 2020.
-
Body movement to sound interface with vector autoregressive hierarchical hidden Markov models
Authors:
Dimitrije Marković,
Borjana Valčić,
Nebojša Malešević
Abstract:
Interfacing a kinetic action of a person to an action of a machine system is an important research topic in many application areas. One of the key factors for intimate human-machine interaction is the ability of the control algorithm to detect and classify different user commands with the shortest possible latency, thus making a highly correlated link between cause and effect. In our research, we focused on the task of mapping user kinematic actions into sound samples. The presented methodology relies on wireless sensor nodes equipped with inertial measurement units and a real-time algorithm dedicated to early detection and classification of a variety of movements/gestures performed by a user. The core algorithm is based on approximate Bayesian inference of Vector Autoregressive Hierarchical Hidden Markov Models (VAR-HHMM), where the model database is derived from the set of motion gestures. The performance of the algorithm was compared with an online version of the K-nearest neighbours (KNN) algorithm, where we used offline expert-based classification as the benchmark. In almost all of the evaluation metrics (e.g. confusion matrix, recall and precision scores) the VAR-HHMM algorithm outperformed KNN. Furthermore, the VAR-HHMM algorithm, in some cases, achieved faster movement onset detection compared with the offline standard. The proposed concept, although envisioned for movement-to-sound application, could be implemented in other human-machine interfaces.
Submitted 26 October, 2016;
originally announced October 2016.
-
Neuropsychological constraints to human data production on a global scale
Authors:
Claudius Gros,
Gregor Kaczor,
Dimitrije Markovic
Abstract:
Which are the factors underlying human information production on a global level? In order to gain an insight into this question we study a corpus of 252-633 Million publicly available data files on the Internet corresponding to an overall storage volume of 284-675 Terabytes. Analyzing the file size distribution for several distinct data types we find indications that the neuropsychological capacity of the human brain to process and record information may constitute the dominant limiting factor for the overall growth of globally stored information, with real-world economic constraints having only a negligible influence. This supposition draws support from the observation that the file size distributions follow a power law for data without a time component, like images, and a log-normal distribution for multimedia files, for which time is a defining qualia.
Submitted 27 November, 2011;
originally announced November 2011.