-
Additive spectrum preserving mappings from von Neumann algebras
Authors:
Martin Mathieu,
Francois Schulz
Abstract:
We establish Jafarian's 2009 conjecture that every additive spectrum preserving mapping from a von Neumann algebra onto a semisimple Banach algebra is a Jordan isomorphism.
Submitted 31 December, 2023;
originally announced January 2024.
-
AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning
Authors:
Michaël Mathieu,
Sherjil Ozair,
Srivatsan Srinivasan,
Caglar Gulcehre,
Shangtong Zhang,
Ray Jiang,
Tom Le Paine,
Richard Powell,
Konrad Żołna,
Julian Schrittwieser,
David Choi,
Petko Georgiev,
Daniel Toyama,
Aja Huang,
Roman Ring,
Igor Babuschkin,
Timo Ewalds,
Mahyar Bordbar,
Sarah Henderson,
Sergio Gómez Colmenarejo,
Aäron van den Oord,
Wojciech Marian Czarnecki,
Nando de Freitas,
Oriol Vinyals
Abstract:
StarCraft II is one of the most challenging simulated reinforcement learning environments; it is partially observable, stochastic, multi-agent, and mastering StarCraft II requires strategic planning over long time horizons with real-time low-level execution. It also has an active professional competitive scene. StarCraft II is uniquely suited for advancing offline RL algorithms, both because of its challenging nature and because Blizzard has released a massive dataset of millions of StarCraft II games played by human players. This paper leverages that and establishes a benchmark, called AlphaStar Unplugged, introducing unprecedented challenges for offline reinforcement learning. We define a dataset (a subset of Blizzard's release), tools standardizing an API for machine learning methods, and an evaluation protocol. We also present baseline agents, including behavior cloning, offline variants of actor-critic and MuZero. We improve the state of the art of agents using only offline data, and we achieve 90% win rate against previously published AlphaStar behavior cloning agent.
Submitted 7 August, 2023;
originally announced August 2023.
-
Invertibility preserving mappings onto finite C*-algebras
Authors:
Martin Mathieu,
Francois Schulz
Abstract:
We prove that every surjective unital linear mapping which preserves invertible elements from a Banach algebra onto a C*-algebra carrying a faithful tracial state is a Jordan homomorphism thus generalising Aupetit's 1998 result for finite von Neumann algebras.
Submitted 31 December, 2022;
originally announced January 2023.
-
Region-guided CycleGANs for Stain Transfer in Whole Slide Images
Authors:
Joseph Boyd,
Irène Villa,
Marie-Christine Mathieu,
Eric Deutsch,
Nikos Paragios,
Maria Vakalopoulou,
Stergios Christodoulidis
Abstract:
In whole slide imaging, commonly used staining techniques based on hematoxylin and eosin (H&E) and immunohistochemistry (IHC) stains accentuate different aspects of the tissue landscape. In the case of detecting metastases, IHC provides a distinct readout that is readily interpretable by pathologists. IHC, however, is a more expensive approach and not available at all medical centers. Virtually generating IHC images from H&E using deep neural networks thus becomes an attractive alternative. Deep generative models such as CycleGANs learn a semantically-consistent mapping between two image domains, while emulating the textural properties of each domain. They are therefore a suitable choice for stain transfer applications. However, they remain fully unsupervised, and possess no mechanism for enforcing biological consistency in stain transfer. In this paper, we propose an extension to CycleGANs in the form of a region of interest discriminator. This allows the CycleGAN to learn from unpaired datasets where, in addition, there is a partial annotation of objects for which one wishes to enforce consistency. We present a use case on whole slide images, where an IHC stain provides an experimentally generated signal for metastatic cells. We demonstrate the superiority of our approach over prior art in stain transfer on histopathology tiles over two datasets. Our code and model are available at https://github.com/jcboyd/miccai2022-roigan.
Submitted 26 August, 2022;
originally announced August 2022.
-
Schanuel's Lemma for Exact categories
Authors:
Martin Mathieu,
Michael Rosbotham
Abstract:
We prove an injective version of Schanuel's lemma from homological algebra in the setting of exact categories.
Submitted 9 January, 2022;
originally announced January 2022.
-
Open-Ended Learning Leads to Generally Capable Agents
Authors:
Open Ended Learning Team,
Adam Stooke,
Anuj Mahajan,
Catarina Barros,
Charlie Deck,
Jakob Bauer,
Jakub Sygnowski,
Maja Trebacz,
Max Jaderberg,
Michael Mathieu,
Nat McAleese,
Nathalie Bradley-Schmieg,
Nathaniel Wong,
Nicolas Porcel,
Roberta Raileanu,
Steph Hughes-Fitt,
Valentin Dalibard,
Wojciech Marian Czarnecki
Abstract:
In this work we create agents that can perform well beyond a single, individual task, that exhibit much wider generalisation of behaviour to a massive, rich space of challenges. We define a universe of tasks within an environment domain and demonstrate the ability to train agents that are generally capable across this vast space and beyond. The environment is natively multi-agent, spanning the continuum of competitive, cooperative, and independent games, which are situated within procedurally generated physical 3D worlds. The resulting space is exceptionally diverse in terms of the challenges posed to agents, and as such, even measuring the learning progress of an agent is an open research problem. We propose an iterative notion of improvement between successive generations of agents, rather than seeking to maximise a singular objective, allowing us to quantify progress despite tasks being incomparable in terms of achievable rewards. We show that through constructing an open-ended learning process, which dynamically changes the training task distributions and training objectives such that the agent never stops learning, we achieve consistent learning of new behaviours. The resulting agent is able to score reward in every one of our humanly solvable evaluation levels, with behaviour generalising to many held-out points in the universe of tasks. Examples of this zero-shot generalisation include good performance on Hide and Seek, Capture the Flag, and Tag. Through analysis and hand-authored probe tasks we characterise the behaviour of our agent, and find interesting emergent heuristic behaviours such as trial-and-error experimentation, simple tool use, option switching, and cooperation. Finally, we demonstrate that the general capabilities of this agent could unlock larger scale transfer of behaviour through cheap finetuning.
Submitted 31 July, 2021; v1 submitted 27 July, 2021;
originally announced July 2021.
-
Exact Structures for Operator Modules
Authors:
Martin Mathieu,
Michael Rosbotham
Abstract:
We demonstrate how exact structures can be placed on the additive category of right operator modules over an operator algebra in order to discuss global dimension for operator algebras. The properties of the Haagerup tensor product play a decisive role in this.
Submitted 11 May, 2021;
originally announced May 2021.
-
Type-zero ternary corners
Authors:
Yousef Estaremi,
Martin Mathieu
Abstract:
In this paper we discuss the relationship between a TRO $\mathcal{T}$ and a sub-TRO $\mathcal{S}$ that is the range of a TRO-conditional expectation on $\mathcal{T}$, a \textit{ternary corner}, by investigating a special class $\mathcal{D}$ of bounded linear maps on~$\mathcal{T}$. We pay particular attention to the case when the TROs contain partial isometries.
Submitted 25 June, 2022; v1 submitted 18 June, 2019;
originally announced June 2019.
-
Strictly singular multiplication operators on $\mathcal L(X)$
Authors:
Martin Mathieu,
Pedro Tradacete
Abstract:
Exploiting several $\ell_p$-factorization results for strictly singular operators, we study the strict singularity of the multiplication operator $L_A R_B\colon T\mapsto ATB$ on $\mathcal L(X)$ for various Banach spaces~$X$.
Submitted 13 May, 2019; v1 submitted 22 October, 2018;
originally announced October 2018.
-
Relative double commutants in coronas of separable C*-algebras
Authors:
Dan Kucerovsky,
Martin Mathieu
Abstract:
We prove a double commutant theorem for separable subalgebras of a wide class of corona C*-algebras, largely resolving a problem posed by Pedersen. Double commutant theorems originated with von Neumann, whose seminal result evolved into an entire field now called von Neumann algebra theory. Voiculescu later proved a C*-algebraic double commutant theorem for subalgebras of the Calkin algebra. We prove a similar result for subalgebras of a much more general class of so-called corona C*-algebras.
Submitted 26 October, 2022; v1 submitted 2 October, 2018;
originally announced October 2018.
-
Disentangling factors of variation in deep representations using adversarial training
Authors:
Michael Mathieu,
Junbo Zhao,
Pablo Sprechmann,
Aditya Ramesh,
Yann LeCun
Abstract:
We introduce a conditional generative model for learning to disentangle the hidden factors of variation within a set of labeled observations, and separate them into complementary codes. One code summarizes the specified factors of variation associated with the labels. The other summarizes the remaining unspecified variability. During training, the only available source of supervision comes from our ability to distinguish among different observations belonging to the same class. Examples of such observations include images of a set of labeled objects captured at different viewpoints, or recordings of set of speakers dictating multiple phrases. In both instances, the intra-class diversity is the source of the unspecified factors of variation: each object is observed at multiple viewpoints, and each speaker dictates multiple phrases. Learning to disentangle the specified factors from the unspecified ones becomes easier when strong supervision is possible. Suppose that during training, we have access to pairs of images, where each pair shows two different objects captured from the same viewpoint. This source of alignment allows us to solve our task using existing methods. However, labels for the unspecified factors are usually unavailable in realistic scenarios where data acquisition is not strictly controlled. We address the problem of disentanglement in this more general setting by combining deep convolutional autoencoders with a form of adversarial training. Both factors of variation are implicitly captured in the organization of the learned embedding space, and can be used for solving single-image analogies. Experimental results on synthetic and real datasets show that the proposed method is capable of generalizing to unseen classes and intra-class variabilities.
Submitted 10 November, 2016;
originally announced November 2016.
-
Energy-based Generative Adversarial Network
Authors:
Junbo Zhao,
Michael Mathieu,
Yann LeCun
Abstract:
We introduce the "Energy-based Generative Adversarial Network" model (EBGAN) which views the discriminator as an energy function that attributes low energies to the regions near the data manifold and higher energies to other regions. Similar to the probabilistic GANs, a generator is seen as being trained to produce contrastive samples with minimal energies, while the discriminator is trained to assign high energies to these generated samples. Viewing the discriminator as an energy function allows to use a wide variety of architectures and loss functionals in addition to the usual binary classifier with logistic output. Among them, we show one instantiation of EBGAN framework as using an auto-encoder architecture, with the energy being the reconstruction error, in place of the discriminator. We show that this form of EBGAN exhibits more stable behavior than regular GANs during training. We also show that a single-scale architecture can be trained to generate high-resolution images.
Submitted 6 March, 2017; v1 submitted 11 September, 2016;
originally announced September 2016.
-
Deep multi-scale video prediction beyond mean square error
Authors:
Michael Mathieu,
Camille Couprie,
Yann LeCun
Abstract:
Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics. This is why pixel-space video prediction may be viewed as a promising avenue for unsupervised feature learning. In addition, while optical flow has been a very studied problem in computer vision for a long time, future frame prediction is rarely approached. Still, many vision applications could benefit from the knowledge of the next frames of videos, that does not require the complexity of tracking every pixel trajectories. In this work, we train a convolutional network to generate future frames given an input sequence. To deal with the inherently blurry predictions obtained from the standard Mean Squared Error (MSE) loss function, we propose three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function. We compare our predictions to different published results based on recurrent neural networks on the UCF101 dataset
Submitted 26 February, 2016; v1 submitted 17 November, 2015;
originally announced November 2015.
-
Learning to Linearize Under Uncertainty
Authors:
Ross Goroshin,
Michael Mathieu,
Yann LeCun
Abstract:
Training deep feature hierarchies to solve supervised learning tasks has achieved state of the art performance on many problems in computer vision. However, a principled way in which to train such hierarchies in the unsupervised setting has remained elusive. In this work we suggest a new architecture and loss for training deep feature hierarchies that linearize the transformations observed in unlabeled natural video sequences. This is done by training a generative model to predict video frames. We also address the problem of inherent uncertainty in prediction by introducing latent variables that are non-deterministic functions of the input into the network architecture.
Submitted 10 September, 2015; v1 submitted 9 June, 2015;
originally announced June 2015.
-
Stacked What-Where Auto-encoders
Authors:
Junbo Zhao,
Michael Mathieu,
Ross Goroshin,
Yann LeCun
Abstract:
We present a novel architecture, the "stacked what-where auto-encoders" (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training. An instantiation of SWWAE uses a convolutional net (Convnet) (LeCun et al. (1998)) to encode the input, and employs a deconvolutional net (Deconvnet) (Zeiler et al. (2010)) to produce the reconstruction. The objective function includes reconstruction terms that induce the hidden states in the Deconvnet to be similar to those of the Convnet. Each pooling layer produces two sets of variables: the "what" which are fed to the next layer, and its complementary variable "where" that are fed to the corresponding layer in the generative decoder.
Submitted 14 February, 2016; v1 submitted 8 June, 2015;
originally announced June 2015.
-
Learning Longer Memory in Recurrent Neural Networks
Authors:
Tomas Mikolov,
Armand Joulin,
Sumit Chopra,
Michael Mathieu,
Marc'Aurelio Ranzato
Abstract:
Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due to the so-called vanishing gradient problem. In this paper, we show that learning longer term patterns in real data, such as in natural language, is perfectly possible using gradient descent. This is achieved by using a slight structural modification of the simple recurrent neural network architecture. We encourage some of the hidden units to change their state slowly by making part of the recurrent weight matrix close to identity, thus forming kind of a longer term memory. We evaluate our model in language modeling experiments, where we obtain similar performance to the much more complex Long Short Term Memory (LSTM) networks (Hochreiter & Schmidhuber, 1997).
Submitted 16 April, 2015; v1 submitted 24 December, 2014;
originally announced December 2014.
-
Fast Convolutional Nets With fbfft: A GPU Performance Evaluation
Authors:
Nicolas Vasilache,
Jeff Johnson,
Michael Mathieu,
Soumith Chintala,
Serkan Piantino,
Yann LeCun
Abstract:
We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units. We introduce two new Fast Fourier Transform convolution implementations: one based on NVIDIA's cuFFT library, and another based on a Facebook authored FFT implementation, fbfft, that provides significant speedups over cuFFT (over 1.5x) for whole CNNs. Both of these convolution implementations are available in open source, and are faster than NVIDIA's cuDNN implementation for many common convolutional layers (up to 23.5x for some synthetic kernel configurations). We discuss different performance regimes of convolutions, comparing areas where straightforward time domain convolutions outperform Fourier frequency domain convolutions. Details on algorithmic applications of NVIDIA GPU hardware specifics in the implementation of fbfft are also provided.
Submitted 10 April, 2015; v1 submitted 23 December, 2014;
originally announced December 2014.
-
Video (language) modeling: a baseline for generative models of natural videos
Authors:
MarcAurelio Ranzato,
Arthur Szlam,
Joan Bruna,
Michael Mathieu,
Ronan Collobert,
Sumit Chopra
Abstract:
We propose a strong baseline model for unsupervised feature learning using video data. By learning to predict missing frames or extrapolate future frames from an input video sequence, the model discovers both spatial and temporal correlations which are useful to represent complex deformations and motion patterns. The models we propose are largely borrowed from the language modeling literature, and adapted to the vision domain by quantizing the space of image patches into a large dictionary. We demonstrate the approach on both a filling and a generation task. For the first time, we show that, after training on natural videos, such a model can predict non-trivial motions over short video sequences.
Submitted 4 May, 2016; v1 submitted 20 December, 2014;
originally announced December 2014.
-
The Loss Surfaces of Multilayer Networks
Authors:
Anna Choromanska,
Mikael Henaff,
Michael Mathieu,
Gérard Ben Arous,
Yann LeCun
Abstract:
We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band diminishes exponentially with the size of the network. We empirically verify that the mathematical model exhibits similar behavior as the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band of low critical points, and that all critical points found there are local minima of high quality measured by the test error. This emphasizes a major difference between large- and small-size networks where for the latter poor quality local minima have non-zero probability of being recovered. Finally, we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant as global minimum often leads to overfitting.
Submitted 21 January, 2015; v1 submitted 30 November, 2014;
originally announced December 2014.
-
Fast Approximation of Rotations and Hessians matrices
Authors:
Michael Mathieu,
Yann LeCun
Abstract:
A new method to represent and approximate rotation matrices is introduced. The method represents approximations of a rotation matrix $Q$ with linearithmic complexity, i.e. with $\frac{1}{2}n\lg(n)$ rotations over pairs of coordinates, arranged in an FFT-like fashion. The approximation is "learned" using gradient descent. It allows to represent symmetric matrices $H$ as $QDQ^T$ where $D$ is a diagonal matrix. It can be used to approximate covariance matrix of Gaussian models in order to speed up inference, or to estimate and track the inverse Hessian of an objective function by relating changes in parameters to changes in gradient along the trajectory followed by the optimization procedure. Experiments were conducted to approximate synthetic matrices, covariance matrices of real data, and Hessian matrices of objective functions involved in machine learning problems.
Submitted 28 April, 2014;
originally announced April 2014.
-
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
Authors:
Pierre Sermanet,
David Eigen,
Xiang Zhang,
Michael Mathieu,
Rob Fergus,
Yann LeCun
Abstract:
We present an integrated framework for using Convolutional Networks for classification, localization and detection. We show how a multiscale and sliding window approach can be efficiently implemented within a ConvNet. We also introduce a novel deep learning approach to localization by learning to predict object boundaries. Bounding boxes are then accumulated rather than suppressed in order to increase detection confidence. We show that different tasks can be learned simultaneously using a single shared network. This integrated framework is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) and obtained very competitive results for the detection and classifications tasks. In post-competition work, we establish a new state of the art for the detection task. Finally, we release a feature extractor from our best model called OverFeat.
Submitted 23 February, 2014; v1 submitted 21 December, 2013;
originally announced December 2013.
-
More elementary operators that are spectrally bounded
Authors:
Nadia Boudi,
Martin Mathieu
Abstract:
We discuss some necessary and some sufficient conditions for an elementary operator $x\mapsto\sum_{i=1}^n a_ixb_i$ on a Banach algebra $A$ to be spectrally bounded. In the case of length three, we obtain a complete characterisation when $A$ acts irreducibly on a Banach space of dimension greater than three.
Submitted 20 December, 2013;
originally announced December 2013.
-
Fast Training of Convolutional Networks through FFTs
Authors:
Michael Mathieu,
Mikael Henaff,
Yann LeCun
Abstract:
Convolutional networks are one of the most widely employed architectures in computer vision and machine learning. In order to leverage their ability to learn complex functions, large amounts of data are required for training. Training a large convolutional network to produce state-of-the-art results can take weeks, even when using modern GPUs. Producing labels using a trained network can also be costly when dealing with web-scale datasets. In this work, we present a simple algorithm which accelerates training and inference by a significant factor, and can yield improvements of over an order of magnitude compared to existing state-of-the-art implementations. This is done by computing convolutions as pointwise products in the Fourier domain while reusing the same transformed feature map many times. The algorithm is implemented on a GPU architecture and addresses a number of related challenges.
Submitted 6 March, 2014; v1 submitted 20 December, 2013;
originally announced December 2013.
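The idea stated in the abstract above, computing convolutions as pointwise products in the Fourier domain while reusing one transformed feature map for many filters, can be illustrated with a minimal NumPy sketch. This is not the authors' GPU implementation: the function names `fft_conv2d_full` and `direct_conv2d_full` are hypothetical, the example handles a single-channel image on the CPU, and it computes a full linear convolution rather than the batched, multi-channel operations a convolutional network needs.

```python
import numpy as np

def fft_conv2d_full(image, kernels):
    """Full linear 2-D convolution of one image with a stack of kernels via the FFT.

    image:   array of shape (h, w)
    kernels: array of shape (n, kh, kw)
    returns: array of shape (n, h + kh - 1, w + kw - 1)
    """
    h, w = image.shape
    kh, kw = kernels.shape[1:]
    # Zero-pad to the full output size so the circular convolution
    # computed by the FFT coincides with the linear convolution.
    out_shape = (h + kh - 1, w + kw - 1)
    # The image is transformed once and reused for every kernel.
    image_f = np.fft.rfft2(image, s=out_shape)
    kernels_f = np.fft.rfft2(kernels, s=out_shape)  # acts on the last two axes
    # Pointwise product in the Fourier domain, then inverse transform.
    return np.fft.irfft2(image_f[None] * kernels_f, s=out_shape)

def direct_conv2d_full(image, kernel):
    """Reference full convolution by explicit shifted accumulation."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h + kh - 1, w + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + h, j:j + w] += kernel[i, j] * image
    return out
```

The direct routine costs on the order of `h * w * kh * kw` operations per kernel, whereas the FFT route pays for the image transform once and then one pointwise product and one inverse transform per kernel, which is where reusing the transformed feature map helps.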
-
Locally quasi-nilpotent elementary operators
Authors:
Nadia Boudi,
Martin Mathieu
Abstract:
Let $A$ be a unital dense algebra of linear mappings on a complex vector space $X$. Let $φ=\sum_{i=1}^n M_{a_i,b_i}$ be a locally quasi-nilpotent elementary operator of length $n$ on $A$. We show that, if $\{a_1,\ldots,a_n\}$ is locally linearly independent, then the local dimension of $V(φ)=\spa\{b_ia_j: 1 \leq i,j \leq n\}$ is at most $\frac{n(n-1)}{2}$. If $\lDim V(φ)=\frac{n(n-1)}{2} $, then there exists a representation of $φ$ as $φ=\sum_{i=1}^n M_{u_i,v_i}$ with $v_iu_j=0$ for $i\geq j$. Moreover, we give a complete characterization of locally quasi-nilpotent elementary operators of length 3.
Submitted 19 December, 2013; v1 submitted 27 February, 2013;
originally announced February 2013.
-
Development of a custom on-line ultrasonic vapour analyzer/flowmeter for the ATLAS inner detector, with application to gaseous tracking and Cherenkov detectors
Authors:
R. Bates,
M. Battistin,
S. Berry,
J. Berthoud,
A. Bitadze,
P. Bonneau,
J. Botelho-Direito,
N. Bousson,
G. Boyd,
G. Bozza,
E. Da Riva,
C. Degeorge,
B. DiGirolamo,
M. Doubek,
J. Godlewski,
G. Hallewell,
S. Katunin,
D. Lombard,
M. Mathieu,
S. McMahon,
K. Nagai,
E. Perez-Rodriguez,
C. Rossi,
A. Rozanov,
V. Vacek
, et al. (2 additional authors not shown)
Abstract:
Precision sound velocity measurements can simultaneously determine binary gas composition and flow. We have developed an analyzer with custom electronics, currently in use in the ATLAS inner detector, with numerous potential applications. The instrument has demonstrated ~0.3% mixture precision for C3F8/C2F6 mixtures and $<10^{-4}$ resolution for N2/C3F8 mixtures. Moderate and high flow versions of the instrument have demonstrated flow resolutions of ±2% F.S. for flows up to 250 l min$^{-1}$, and ±1.9% F.S. for linear flow velocities up to 15 m s$^{-1}$; the latter flow approaching that expected in the vapour return of the thermosiphon fluorocarbon coolant recirculator being built for the ATLAS silicon tracker.
Submitted 30 October, 2012;
originally announced October 2012.
-
A combined ultrasonic flow meter and binary vapour mixture analyzer for the ATLAS silicon tracker
Authors:
R. Bates,
M. Battistin,
S. Berry,
J. Berthoud,
A. Bitadze,
P. Bonneau,
J. Botelho-Direito,
N. Bousson,
G. Boyd,
G. Bozza,
E. Da Riva,
C. Degeorge,
B. DiGirolamo,
M. Doubek,
D. Giugni,
J. Godlewski,
G. Hallewell,
S. Katunin,
D. Lombard,
M. Mathieu,
S. McMahon,
K. Nagai,
E. Perez-Rodriguez,
C. Rossi,
A. Rozanov
, et al. (3 additional authors not shown)
Abstract:
An upgrade to the ATLAS silicon tracker cooling control system may require a change from C3F8 (octafluoro-propane) evaporative coolant to a blend containing 10-25% of C2F6 (hexafluoro-ethane). Such a change will reduce the evaporation temperature to assure thermal stability following radiation damage accumulated at full LHC luminosity. Central to this upgrade is a new ultrasonic instrument in which sound transit times are continuously measured in opposite directions in flowing gas at known temperature and pressure to deduce the C3F8/C2F6 flow rate and mixture composition. The instrument and its Supervisory, Control and Data Acquisition (SCADA) software are described in this paper. Several geometries for the instrument are in use or under evaluation. An instrument with a pinched axial geometry intended for analysis and measurement of moderate flow rates has demonstrated a mixture resolution of $3\times10^{-3}$ for C3F8/C2F6 molar mixtures with 20% C2F6, and a flow resolution of 2% of full scale for mass flows up to 30 g s$^{-1}$. In mixtures of widely-differing molecular weight (mw), higher mixture precision is possible: a sensitivity of $<5\times10^{-5}$ to leaks of C3F8 into part of the ATLAS tracker nitrogen envelope (mw difference 160) has been seen. An instrument with an angled sound path geometry has been developed for use at high fluorocarbon mass flow rates of around 1.2 kg s$^{-1}$, corresponding to full flow in a new 60 kW thermosiphon recirculator under construction for the ATLAS silicon tracker. Extensive computational fluid dynamics studies were performed to determine the preferred geometry (ultrasonic transducer spacing and placement, together with the sound crossing angle with respect to the vapour flow direction). A prototype with 45° crossing angle has demonstrated a flow resolution of 1.9% of full scale for linear flow velocities up to 15 m s$^{-1}$. The instrument has many potential applications.
Submitted 17 October, 2012;
originally announced October 2012.
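The measurement principle mentioned in the abstract above, timing sound pulses in opposite directions through flowing gas, can be sketched with the textbook transit-time relations. The sketch assumes an idealised axial geometry in which the sound path of length `path_length` is parallel to a uniform flow, so that the downstream and upstream transit times are L/(c+v) and L/(c-v). The function name and the numerical values in the usage note are hypothetical and are not taken from the paper. Converting the sound speed into a mixture composition additionally requires the gas temperature, pressure and an equation of state, which is not covered here.

```python
def sound_speed_and_flow(t_down, t_up, path_length):
    """Recover sound speed c and flow velocity v from two transit times.

    Assumes t_down = L / (c + v) and t_up = L / (c - v), i.e. a sound
    path of length L parallel to a uniform flow (idealised geometry).
    Times in seconds, length in metres, results in metres per second.
    """
    inv_down = 1.0 / t_down
    inv_up = 1.0 / t_up
    # Adding the two relations isolates c; subtracting them isolates v.
    c = 0.5 * path_length * (inv_down + inv_up)
    v = 0.5 * path_length * (inv_down - inv_up)
    return c, v
```

For example, with an assumed path length of 0.5 m, a sound speed of 120 m/s and a flow velocity of 5 m/s, the transit times are 0.5/125 s and 0.5/115 s, and calling `sound_speed_and_flow(0.5 / 125, 0.5 / 115, 0.5)` recovers both quantities up to floating-point rounding. For a sound path inclined to the flow, the same relations yield the component of the flow velocity along the path.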
-
C*-Segal algebras with order unit
Authors:
Jukka Kauppi,
Martin Mathieu
Abstract:
We introduce the notion of a (noncommutative) C*-Segal algebra as a Banach algebra which is a dense ideal in a C*-algebra. Several basic properties are investigated and, with the aid of the theory of multiplier modules, the structure of C*-Segal algebras with order unit is determined.
Submitted 22 September, 2012; v1 submitted 22 April, 2012;
originally announced April 2012.
-
A Combined On-Line Acoustic Flowmeter and Fluorocarbon Coolant Mixture Analyzer for The ATLAS Silicon Tracker
Authors:
A. Bitadze,
R. Bates,
M. Battistin,
S. Berry,
P. Bonneau,
J. Botelho-Direito,
B. DiGirolamo,
J. Godlewski,
E. Perez-Rodriguez,
L. Zwalinski,
N. Bousson,
G. Hallewell,
M. Mathieu,
A. Rozanov,
G. Boyd,
M. Doubek,
V. Vacek,
M. Vitek,
K. Egorov,
S. Katunin,
S. McMahon,
K. Nagai
Abstract:
An upgrade to the ATLAS silicon tracker cooling control system may require a change from C3F8 (octafluoro-propane) to a blend containing 10-30% of C2F6 (hexafluoro-ethane) to reduce the evaporation temperature and better protect the silicon from cumulative radiation damage with increasing LHC luminosity. Central to this upgrade is a new acoustic instrument for the real-time measurement of the C3F8/C2F6 mixture ratio and flow. The instrument and its Supervisory, Control and Data Acquisition (SCADA) software are described in this paper. The instrument has demonstrated a resolution of $3\times10^{-3}$ for C3F8/C2F6 mixtures with ~20% C2F6, and flow resolution of 2% of full scale for mass flows up to 30 g s$^{-1}$. In mixtures of widely-differing molecular weight (mw), higher mixture precision is possible: a sensitivity of $<5\times10^{-4}$ to leaks of C3F8 into the ATLAS pixel detector nitrogen envelope (mw difference 160) has been seen. The instrument has many potential applications, including the analysis of mixtures of hydrocarbons, vapours for semi-conductor manufacture and anaesthesia.
Submitted 12 January, 2012;
originally announced January 2012.
-
The second local multiplier algebra of a separable C*-algebra
Authors:
Martin Mathieu
Abstract:
Several examples of (separable) C*-algebras with the property that their second (iterated) local multiplier algebra is strictly larger than the first have been found by various groups of authors over the past few years, thus answering a question originally posed by G. K. Pedersen in 1978. This survey discusses a systematic approach by P. Ara and the author to produce such examples on the one hand; on the other hand, we present new criteria guaranteeing that the second and the first local multiplier algebra of a separable C*-algebra agree. For this class of C*-algebras, each derivation of the local multiplier algebra is inner.
Submitted 31 October, 2011;
originally announced October 2011.
-
Spectral isometries on non-simple C*-algebras
Authors:
Martin Mathieu,
Ahmed R. Sourour
Abstract:
We prove that unital surjective spectral isometries on certain non-simple unital C*-algebras are Jordan isomorphisms. Along the way, we establish several general facts in the setting of semisimple Banach algebras.
Submitted 31 October, 2011;
originally announced October 2011.
-
Physical Simulation of Inarticulate Robots
Authors:
Guillaume Claret,
Michaël Mathieu,
David Naccache,
Guillaume Seguin
Abstract:
In this note we study the structure and the behavior of inarticulate robots. We introduce a robot that moves by successive revolvings. The robot's structure is analyzed, simulated and discussed in detail.
Submitted 8 April, 2011;
originally announced April 2011.
-
Acoustically bound crystals
Authors:
P. Marmottant,
D. Rabaud,
P. Thibault,
M. Mathieu
Abstract:
In these fluid dynamics videos, we show how bubbles flowing in a thin microchannel interact under an acoustic field. Because of acoustic interactions without direct contact, bubbles self-organize into periodic patterns, and spontaneously form acoustically bound crystals. We also present the interaction with boundaries, equivalent to the interaction with image bubbles, and unravel the peculiar vibration modes of the confined bubbles.
Submitted 15 October, 2010;
originally announced October 2010.
-
When is the second local multiplier algebra of a C*-algebra equal to the first?
Authors:
Pere Ara,
Martin Mathieu
Abstract:
We discuss necessary as well as sufficient conditions for the second iterated local multiplier algebra of a separable C*-algebra to agree with the first.
Submitted 22 February, 2011; v1 submitted 28 August, 2010;
originally announced August 2010.
-
The Interacting Branching Process as a Simple Model of Innovation
Authors:
Vishal Sood,
Myléne Mathieu,
Amer Shreim,
Peter Grassberger,
Maya Paczuski
Abstract:
We describe innovation in terms of a generalized branching process. Each new invention pairs with any existing one to produce a number of offspring, which is Poisson distributed with mean p. Existing inventions die with probability p/τ at each generation. In contrast to mean field results, no phase transition occurs; the chance for survival is finite for all p > 0. For τ = ∞, surviving processes exhibit a bottleneck before exploding super-exponentially, a growth consistent with a law of accelerating returns. This behavior persists for finite τ. We analyze, in detail, the asymptotic behavior as p → 0.
Submitted 17 September, 2010; v1 submitted 30 March, 2010;
originally announced March 2010.
-
A Collection of Problems on Spectrally Bounded Operators
Authors:
Martin Mathieu
Abstract:
We discuss several open problems on spectrally bounded operators, some new, some old, adding in a few new insights.
Submitted 15 October, 2008;
originally announced October 2008.
-
The Maximal C*-Algebra of Quotients as an Operator Bimodule
Authors:
Pere Ara,
Martin Mathieu,
Eduard Ortega
Abstract:
We establish a description of the maximal C*-algebra of quotients of a unital C*-algebra $A$ as a direct limit of spaces of completely bounded bimodule homomorphisms from certain operator submodules of the Haagerup tensor product $A\otimes_h A$ labelled by the essential closed right ideals of $A$ into $A$. In addition, the invariance of the construction of the maximal C*-algebra of quotients under strong Morita equivalence is proved.
Submitted 14 October, 2008;
originally announced October 2008.
-
Maximal C*-algebras of quotients and injective envelopes of C*-algebras
Authors:
Pere Ara,
Martin Mathieu
Abstract:
A new C*-enlargement of a C*-algebra $A$ nested between the local multiplier algebra $M_{\text{loc}}(A)$ of $A$ and its injective envelope $I(A)$ is introduced. Various aspects of this maximal C*-algebra of quotients, $Q_{\text{max}}(A)$, are studied, notably in the setting of AW*-algebras. As a by-product we obtain a new example of a type I C*-algebra $A$ such that $M_{\text{loc}}(M_{\text{loc}}(A))\ne M_{\text{loc}}(A)$.
Submitted 27 April, 2007;
originally announced April 2007.
-
A not so simple local multiplier algebra
Authors:
Pere Ara,
Martin Mathieu
Abstract:
We construct an AF-algebra $A$ such that its local multiplier algebra $M_{\text{loc}}(A)$ does not agree with $M_{\text{loc}}(M_{\text{loc}}(A))$, thus answering a question raised by G.K. Pedersen in 1978.
Submitted 19 September, 2005;
originally announced September 2005.