subscribe to arXiv mailings

Fast inference with Kronecker-sparse matrices

Authors: Antoine Gonon, Léon Zheng, Pascal Carrivain, Quoc-Tung Le

Abstract: This paper benchmarks and improves existing GPU matrix multiplication algorithms specialized for Kronecker-sparse matrices, whose sparsity patterns are described by Kronecker products. These matrices have recently gained popularity as replacements for dense matrices in neural networks because they preserve accuracy while using fewer parameters. We present the first energy and time benchmarks for t… ▽ More This paper benchmarks and improves existing GPU matrix multiplication algorithms specialized for Kronecker-sparse matrices, whose sparsity patterns are described by Kronecker products. These matrices have recently gained popularity as replacements for dense matrices in neural networks because they preserve accuracy while using fewer parameters. We present the first energy and time benchmarks for the multiplication with such matrices, helping users identify scenarios where Kronecker-sparse matrices are more time- and energy-efficient than their dense counterparts. Our benchmark also reveals that specialized implementations spend up to 50% of their total runtime on memory rewriting operations. To address the challenge of reducing memory transfers, we introduce a new so-called tiling strategy adapted to the Kronecker-sparsity structure, which reduces reads and writes between levels of GPU memory. We implement this tiling strategy in a new CUDA kernel that achieves a median speed-up of x1.4, while also cutting energy consumption by 15%. We further demonstrate the broader impact of our results by applying the new kernel to accelerate transformer inference. △ Less

Submitted 8 October, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.15006 [pdf, other]

Path-metrics, pruning, and generalization

Authors: Antoine Gonon, Nicolas Brisebarre, Elisa Riccietti, Rémi Gribonval

Abstract: Analyzing the behavior of ReLU neural networks often hinges on understanding the relationships between their parameters and the functions they implement. This paper proves a new bound on function distances in terms of the so-called path-metrics of the parameters. Since this bound is intrinsically invariant with respect to the rescaling symmetries of the networks, it sharpens previously known bound… ▽ More Analyzing the behavior of ReLU neural networks often hinges on understanding the relationships between their parameters and the functions they implement. This paper proves a new bound on function distances in terms of the so-called path-metrics of the parameters. Since this bound is intrinsically invariant with respect to the rescaling symmetries of the networks, it sharpens previously known bounds. It is also, to the best of our knowledge, the first bound of its kind that is broadly applicable to modern networks such as ResNets, VGGs, U-nets, and many more. In contexts such as network pruning and quantization, the proposed path-metrics can be efficiently computed using only two forward passes. Besides its intrinsic theoretical interest, the bound yields not only novel theoretical generalization bounds, but also a promising proof of concept for rescaling-invariant pruning. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2310.01225 [pdf, other]

A path-norm toolkit for modern networks: consequences, promises and challenges

Authors: Antoine Gonon, Nicolas Brisebarre, Elisa Riccietti, Rémi Gribonval

Abstract: This work introduces the first toolkit around path-norms that fully encompasses general DAG ReLU networks with biases, skip connections and any operation based on the extraction of order statistics: max pooling, GroupSort etc. This toolkit notably allows us to establish generalization bounds for modern neural networks that are not only the most widely applicable path-norm based ones, but also reco… ▽ More This work introduces the first toolkit around path-norms that fully encompasses general DAG ReLU networks with biases, skip connections and any operation based on the extraction of order statistics: max pooling, GroupSort etc. This toolkit notably allows us to establish generalization bounds for modern neural networks that are not only the most widely applicable path-norm based ones, but also recover or beat the sharpest known bounds of this type. These extended path-norms further enjoy the usual benefits of path-norms: ease of computation, invariance under the symmetries of the network, and improved sharpness on layered fully-connected networks compared to the product of operator norms, another complexity measure most commonly used. The versatility of the toolkit and its ease of implementation allow us to challenge the concrete promises of path-norm-based generalization bounds, by numerically evaluating the sharpest known bounds for ResNets on ImageNet. △ Less

Submitted 13 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

arXiv:2304.10553

Sparsity in neural networks can improve their privacy

Authors: Antoine Gonon, Léon Zheng, Clément Lalanne, Quoc-Tung Le, Guillaume Lauga, Can Pouliquen

Abstract: This article measures how sparsity can make neural networks more robust to membership inference attacks. The obtained empirical results show that sparsity improves the privacy of the network, while preserving comparable performances on the task at hand. This empirical study completes and extends existing literature. This article measures how sparsity can make neural networks more robust to membership inference attacks. The obtained empirical results show that sparsity improves the privacy of the network, while preserving comparable performances on the task at hand. This empirical study completes and extends existing literature. △ Less

Submitted 11 June, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

Comments: arXiv admin note: duplicate of arXiv:2304.07234

arXiv:2304.07234 [pdf, other]

Can sparsity improve the privacy of neural networks?

Authors: Antoine Gonon, Léon Zheng, Clément Lalanne, Quoc-Tung Le, Guillaume Lauga, Can Pouliquen

Abstract: Sparse neural networks are mainly motivated by ressource efficiency since they use fewer parameters than their dense counterparts but still reach comparable accuracies. This article empirically investigates whether sparsity could also improve the privacy of the data used to train the networks. The experiments show positive correlations between the sparsity of the model, its privacy, and its classi… ▽ More Sparse neural networks are mainly motivated by ressource efficiency since they use fewer parameters than their dense counterparts but still reach comparable accuracies. This article empirically investigates whether sparsity could also improve the privacy of the data used to train the networks. The experiments show positive correlations between the sparsity of the model, its privacy, and its classification error. Simply comparing the privacy of two models with different sparsity levels can yield misleading conclusions on the role of sparsity, because of the additional correlation with the classification error. From this perspective, some caveats are raised about previous works that investigate sparsity and privacy. △ Less

Submitted 23 May, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

arXiv:2205.11874 [pdf, other]

Approximation speed of quantized vs. unquantized ReLU neural networks and beyond

Authors: Antoine Gonon, Nicolas Brisebarre, Rémi Gribonval, Elisa Riccietti

Abstract: We deal with two complementary questions about approximation properties of ReLU networks. First, we study how the uniform quantization of ReLU networks with real-valued weights impacts their approximation properties. We establish an upper-bound on the minimal number of bits per coordinate needed for uniformly quantized ReLU networks to keep the same polynomial asymptotic approximation speeds as un… ▽ More We deal with two complementary questions about approximation properties of ReLU networks. First, we study how the uniform quantization of ReLU networks with real-valued weights impacts their approximation properties. We establish an upper-bound on the minimal number of bits per coordinate needed for uniformly quantized ReLU networks to keep the same polynomial asymptotic approximation speeds as unquantized ones. We also characterize the error of nearest-neighbour uniform quantization of ReLU networks. This is achieved using a new lower-bound on the Lipschitz constant of the map that associates the parameters of ReLU networks to their realization, and an upper-bound generalizing classical results. Second, we investigate when ReLU networks can be expected, or not, to have better approximation properties than other classical approximation families. Indeed, several approximation families share the following common limitation: their polynomial asymptotic approximation speed of any set is bounded from above by the encoding speed of this set. We introduce a new abstract property of approximation families, called infinite-encodability, which implies this upper-bound. Many classical approximation families, defined with dictionaries or ReLU networks, are shown to be infinite-encodable. This unifies and generalizes several situations where this upper-bound is known. △ Less

Submitted 7 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

arXiv:1910.10024 [pdf, other]

Compressive Learning for Semi-Parametric Models

Authors: Michael P. Sheehan, Antoine Gonon, Mike E. Davies

Abstract: In the compressive learning theory, instead of solving a statistical learning problem from the input data, a so-called sketch is computed from the data prior to learning. The sketch has to capture enough information to solve the problem directly from it, allowing to discard the dataset from the memory. This is useful when dealing with large datasets as the size of the sketch does not scale with th… ▽ More In the compressive learning theory, instead of solving a statistical learning problem from the input data, a so-called sketch is computed from the data prior to learning. The sketch has to capture enough information to solve the problem directly from it, allowing to discard the dataset from the memory. This is useful when dealing with large datasets as the size of the sketch does not scale with the size of the database. In this paper, we reformulate the original compressive learning framework to explicitly cater for the class of semi-parametric models. The reformulation takes account of the inherent topology and structure of semi-parametric models, creating an intuitive pathway to the development of compressive learning algorithms. We apply our developed framework to both the semi-parametric models of independent component analysis and subspace clustering, demonstrating the robustness of the framework to explicitly show when a compression in complexity can be achieved. △ Less

Submitted 22 October, 2019; originally announced October 2019.

Comments: 5 pages, 5 figure, submitted to ICASSP 2020

Showing 1–7 of 7 results for author: Gonon, A