subscribe to arXiv mailings

Gradient Flossing: Improving Gradient Descent through Dynamic Control of Jacobians

Abstract: Training recurrent neural networks (RNNs) remains a challenge due to the instability of gradients across long time horizons, which can lead to exploding and vanishing gradients. Recent research has linked these problems to the values of Lyapunov exponents for the forward-dynamics, which describe the growth or shrinkage of infinitesimal perturbations. Here, we propose gradient flossing, a novel app… ▽ More Training recurrent neural networks (RNNs) remains a challenge due to the instability of gradients across long time horizons, which can lead to exploding and vanishing gradients. Recent research has linked these problems to the values of Lyapunov exponents for the forward-dynamics, which describe the growth or shrinkage of infinitesimal perturbations. Here, we propose gradient flossing, a novel approach to tackling gradient instability by pushing Lyapunov exponents of the forward dynamics toward zero during learning. We achieve this by regularizing Lyapunov exponents through backpropagation using differentiable linear algebra. This enables us to "floss" the gradients, stabilizing them and thus improving network training. We demonstrate that gradient flossing controls not only the gradient norm but also the condition number of the long-term Jacobian, facilitating multidimensional error feedback propagation. We find that applying gradient flossing prior to training enhances both the success rate and convergence speed for tasks involving long time horizons. For challenging tasks, we show that gradient flossing during training can further increase the time horizon that can be bridged by backpropagation through time. Moreover, we demonstrate the effectiveness of our approach on various RNN architectures and tasks of variable temporal complexity. Additionally, we provide a simple implementation of our gradient flossing algorithm that can be used in practice. Our results indicate that gradient flossing via regularizing Lyapunov exponents can significantly enhance the effectiveness of RNN training and mitigate the exploding and vanishing gradient problem. △ Less

Submitted 28 December, 2023; originally announced December 2023.

Comments: 28 pages, 16 figures, accepted at NeurIPS 2023

arXiv:2312.17216 [pdf, other]

SparseProp: Efficient Event-Based Simulation and Training of Sparse Recurrent Spiking Neural Networks

Authors: Rainer Engelken

Abstract: Spiking Neural Networks (SNNs) are biologically-inspired models that are capable of processing information in streams of action potentials. However, simulating and training SNNs is computationally expensive due to the need to solve large systems of coupled differential equations. In this paper, we introduce SparseProp, a novel event-based algorithm for simulating and training sparse SNNs. Our algo… ▽ More Spiking Neural Networks (SNNs) are biologically-inspired models that are capable of processing information in streams of action potentials. However, simulating and training SNNs is computationally expensive due to the need to solve large systems of coupled differential equations. In this paper, we introduce SparseProp, a novel event-based algorithm for simulating and training sparse SNNs. Our algorithm reduces the computational cost of both the forward and backward pass operations from O(N) to O(log(N)) per network spike, thereby enabling numerically exact simulations of large spiking networks and their efficient training using backpropagation through time. By leveraging the sparsity of the network, SparseProp eliminates the need to iterate through all neurons at each spike, employing efficient state updates instead. We demonstrate the efficacy of SparseProp across several classical integrate-and-fire neuron models, including a simulation of a sparse SNN with one million LIF neurons. This results in a speed-up exceeding four orders of magnitude relative to previous event-based implementations. Our work provides an efficient and exact solution for training large-scale spiking neural networks and opens up new possibilities for building more sophisticated brain-inspired models. △ Less

Submitted 28 December, 2023; originally announced December 2023.

Comments: 10 pages, 4 figures, accepted at NeurIPS

arXiv:2201.09916 [pdf, ps, other]

doi 10.1371/journal.pcbi.1010590

Input correlations impede suppression of chaos and learning in balanced rate networks

Authors: Rainer Engelken, Alessandro Ingrosso, Ramin Khajeh, Sven Goedeke, L. F. Abbott

Abstract: Neural circuits exhibit complex activity patterns, both spontaneously and evoked by external stimuli. Information encoding and learning in neural circuits depend on how well time-varying stimuli can control spontaneous network activity. We show that in firing-rate networks in the balanced state, external control of recurrent dynamics, i.e., the suppression of internally-generated chaotic variabili… ▽ More Neural circuits exhibit complex activity patterns, both spontaneously and evoked by external stimuli. Information encoding and learning in neural circuits depend on how well time-varying stimuli can control spontaneous network activity. We show that in firing-rate networks in the balanced state, external control of recurrent dynamics, i.e., the suppression of internally-generated chaotic variability, strongly depends on correlations in the input. A unique feature of balanced networks is that, because common external input is dynamically canceled by recurrent feedback, it is far easier to suppress chaos with independent inputs into each neuron than through common input. To study this phenomenon we develop a non-stationary dynamic mean-field theory that determines how the activity statistics and largest Lyapunov exponent depend on frequency and amplitude of the input, recurrent coupling strength, and network size, for both common and independent input. We also show that uncorrelated inputs facilitate learning in balanced networks. △ Less

Submitted 24 January, 2022; originally announced January 2022.

arXiv:2006.02427 [pdf, other]

Lyapunov spectra of chaotic recurrent neural networks

Authors: Rainer Engelken, Fred Wolf, L. F. Abbott

Abstract: Brains process information through the collective dynamics of large neural networks. Collective chaos was suggested to underlie the complex ongoing dynamics observed in cerebral cortical circuits and determine the impact and processing of incoming information streams. In dissipative systems, chaotic dynamics takes place on a subset of phase space of reduced dimensionality and is organized by a com… ▽ More Brains process information through the collective dynamics of large neural networks. Collective chaos was suggested to underlie the complex ongoing dynamics observed in cerebral cortical circuits and determine the impact and processing of incoming information streams. In dissipative systems, chaotic dynamics takes place on a subset of phase space of reduced dimensionality and is organized by a complex tangle of stable, neutral and unstable manifolds. Key topological invariants of this phase space structure such as attractor dimension, and Kolmogorov-Sinai entropy so far remained elusive. Here we calculate the complete Lyapunov spectrum of recurrent neural networks. We show that chaos in these networks is extensive with a size-invariant Lyapunov spectrum and characterized by attractor dimensions much smaller than the number of phase space dimensions. We find that near the onset of chaos, for very intense chaos, and discrete-time dynamics, random matrix theory provides analytical approximations to the full Lyapunov spectrum. We show that a generalized time-reversal symmetry of the network dynamics induces a point-symmetry of the Lyapunov spectrum reminiscent of the symplectic structure of chaotic Hamiltonian systems. Fluctuating input reduces both the entropy rate and the attractor dimension. For trained recurrent networks, we find that Lyapunov spectrum analysis provides a quantification of error propagation and stability achieved. Our methods apply to systems of arbitrary connectivity, and we describe a comprehensive set of controls for the accuracy and convergence of Lyapunov exponents. Our results open a novel avenue for characterizing the complex dynamics of recurrent neural networks and the geometry of the corresponding chaotic attractors. They also highlight the potential of Lyapunov spectrum analysis as a diagnostic for machine learning applications of recurrent networks. △ Less

Submitted 3 June, 2020; originally announced June 2020.

Showing 1–4 of 4 results for author: Engelken, R