-
Hadronic cross section measurements with the DAMPE space mission using 20GeV-10TeV cosmic-ray protons and $^4$He
Authors:
F. Alemanno,
Q. An,
P. Azzarello,
F. C. T. Barbato,
P. Bernardini,
X. J. Bi,
I. Cagnoli,
M. S. Cai,
E. Casilli,
E. Catanzani,
J. Chang,
D. Y. Chen,
J. L. Chen,
Z. F. Chen,
P. Coppin,
M. Y. Cui,
T. S. Cui,
Y. X. Cui,
H. T. Dai,
A. De Benedittis,
I. De Mitri,
F. de Palma,
A. Di Giovanni,
Q. Ding,
T. K. Dong
, et al. (126 additional authors not shown)
Abstract:
Precise direct cosmic-ray (CR) measurements provide an important probe to study the energetic particle sources in our Galaxy, and the interstellar environment through which these particles propagate. Uncertainties on hadronic models, ion-nucleon cross sections in particular, are currently the limiting factor towards obtaining more accurate CR ion flux measurements with calorimetric space-based experiments. We present an energy-dependent measurement of the inelastic cross section of protons and helium-4 nuclei (alpha particles) on a Bi$_4$Ge$_3$O$_{12}$ target, using 88 months of data collected by the DAMPE space mission. The kinetic energy per nucleon of the measurement points ranges from 18 GeV to 9 TeV for protons, and from 5 GeV/n to 3 TeV/n for helium-4 nuclei. Our results lead to a significant improvement of the CR flux normalisation. In the case of helium-4, these results correspond to the first cross section measurements on a heavy target material at energies above 10 GeV/n.
Submitted 30 August, 2024;
originally announced August 2024.
-
Study of silicon photomultipliers for the readout of a lead/scintillating-fiber calorimeter
Authors:
F. Alemanno,
P. Bernardini,
A. Corvaglia,
G. De Matteis,
L. Martina,
A. Miccoli,
M. Panareo,
M. P. Panetta,
C. Pinto,
A. Surdo
Abstract:
The KLOE electromagnetic calorimeter is expected to be reused in the Near Detector complex of the DUNE experiment at Fermilab. The possible substitution of traditional Photomultiplier Tubes (PMTs) with Silicon Photomultipliers (SiPMs) in the refurbished calorimeter is the object of this investigation. A block of the KLOE lead-scintillating fiber calorimeter has been equipped with light guides and external trigger scintillators. The signals induced by cosmic rays and environmental radioactivity have been collected by SiPM arrays on one side of the calorimeter, and by conventional PMTs on the opposite side. Efficiency, stability, and timing resolution of SiPMs have been studied and compared with KLOE-PMTs performance. Conclusions about the convenience of substituting PMTs with SiPMs are drawn.
Submitted 18 June, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Inverse modeling of time-delayed interactions via the dynamic-entropy formalism
Authors:
Elena Agliari,
Francesco Alemanno,
Adriano Barra,
Michele Castellana,
Daniele Lotito,
Matthieu Piel
Abstract:
Although instantaneous interactions are unphysical, a large variety of maximum entropy statistical inference methods match the model-inferred and the empirically-measured equal-time correlation functions. Focusing on collective motion of active units, this constraint is reasonable when the interaction timescale is much faster than that of the interacting units, as in starling flocks, yet it fails in a number of counterexamples, as in leukocyte coordination (where signalling proteins diffuse between two cells). Here, we relax this assumption and develop a path-integral approach to the maximum-entropy framework, which includes delay in signalling. Our method is able to infer not only the strength of couplings and fields, but also the time required by the couplings to completely transfer information among the units. We demonstrate the validity of our approach by providing excellent results on synthetic datasets of non-Markovian trajectories generated by the Heisenberg-Kuramoto and Vicsek models equipped with delayed interactions. As a proof of concept, we also apply the method to experiments on dendritic migration, where matching equal-time correlations results in a significant information loss.
Submitted 10 July, 2024; v1 submitted 3 September, 2023;
originally announced September 2023.
-
Regularization, early-stopping and dreaming: a Hopfield-like setup to address generalization and overfitting
Authors:
Elena Agliari,
Francesco Alemanno,
Miriam Aquaro,
Alberto Fachechi
Abstract:
In this work we approach attractor neural networks from a machine learning perspective: we look for optimal network parameters by applying a gradient descent over a regularized loss function. Within this framework, the optimal neuron-interaction matrices turn out to be a class of matrices which correspond to Hebbian kernels revised by a reiterated unlearning protocol. Remarkably, the extent of such unlearning is proved to be related to the regularization hyperparameter of the loss function and to the training time. Thus, we can design strategies to avoid overfitting that are formulated in terms of regularization and early-stopping tuning. The generalization capabilities of these attractor networks are also investigated: analytical results are obtained for random synthetic datasets; next, the emerging picture is corroborated by numerical experiments that highlight the existence of several regimes (i.e., overfitting, failure and success) as the dataset parameters are varied.
Submitted 20 February, 2024; v1 submitted 1 August, 2023;
originally announced August 2023.
-
Ultrametric identities in glassy models of Natural Evolution
Authors:
Elena Agliari,
Francesco Alemanno,
Miriam Aquaro,
Adriano Barra
Abstract:
Spin-glasses constitute a well-grounded framework for evolutionary models. Of particular interest for (some of) these models is the lack of self-averaging of their order parameters (e.g. the Hamming distance between the genomes of two individuals), even in asymptotic limits, much like the behavior of the overlap between the configurations of two replicas in mean-field spin-glasses. In the latter, this lack of self-averaging is related to peculiar fluctuations of the overlap, known as Ghirlanda-Guerra identities and Aizenman-Contucci polynomials, that play a pivotal role in describing the ultrametric structure of the spin-glass landscape. As for evolutionary models, such identities may therefore be related to a taxonomic classification of individuals, yet a full investigation of their validity is missing. In this paper, we study ultrametric identities in simple cases where solely random mutations take place, while selective pressure is absent, namely in {\em flat landscape} models. In particular, we study three paradigmatic models in this setting: the {\em one parent model} (which, by construction, is ultrametric at the level of single individuals), the {\em homogeneous population model} (which is replica symmetric), and the {\em species formation model} (where a broken-replica scenario emerges at the level of species). We find analytical and numerical evidence that in the first and in the third model neither the Ghirlanda-Guerra nor the Aizenman-Contucci constraints hold; rather, a new class of ultrametric identities is satisfied. In the second model all these constraints hold trivially. Very preliminary results on a real biological human genome derived by {\em The 1000 Genome Project Consortium} and on two artificial human genomes (generated by two different types of neural networks) seem in better agreement with these new identities than with the classic ones.
Submitted 23 June, 2023;
originally announced June 2023.
-
Hopfield model with planted patterns: a teacher-student self-supervised learning model
Authors:
Francesco Alemanno,
Luca Camanzi,
Gianluca Manzan,
Daniele Tantari
Abstract:
While Hopfield networks are known as paradigmatic models for memory storage and retrieval, modern artificial intelligence systems mainly stand on the machine learning paradigm. We show that it is possible to formulate a teacher-student self-supervised learning problem with Boltzmann machines in terms of a suitable generalization of the Hopfield model with structured patterns, where the spin variables are the machine weights and patterns correspond to the training set's examples. We analyze the learning performance by studying the phase diagram in terms of the training set size, the dataset noise and the inference temperature (i.e. the weight regularization). With a small but informative dataset the machine can learn by memorization. With a noisy dataset, an extensive number of examples above a critical threshold is needed. In this regime the memory storage limits of the system become an opportunity for the occurrence of a learning regime in which the system can generalize.
Submitted 31 December, 2023; v1 submitted 26 April, 2023;
originally announced April 2023.
-
Measurement of the cosmic p+He energy spectrum from 50 GeV to 0.5 PeV with the DAMPE space mission
Authors:
DAMPE Collaboration,
F. Alemanno,
C. Altomare,
Q. An,
P. Azzarello,
F. C. T. Barbato,
P. Bernardini,
X. J. Bi,
I. Cagnoli,
M. S. Cai,
E. Casilli,
E. Catanzani,
J. Chang,
D. Y. Chen,
J. L. Chen,
Z. F. Chen,
P. Coppin,
M. Y. Cui,
T. S. Cui,
Y. X. Cui,
H. T. Dai,
A. De Benedittis,
I. De Mitri,
F. de Palma,
M. Deliyergiyev
, et al. (130 additional authors not shown)
Abstract:
Recent observations of the light component of the cosmic-ray spectrum have revealed unexpected features that motivate further and more precise measurements up to the highest energies. The Dark Matter Particle Explorer is a satellite-based cosmic-ray experiment that has been operational since December 2015, continuously collecting data on high-energy cosmic particles with very good statistics, energy resolution, and particle identification capabilities. In this work, the latest measurements of the energy spectrum of proton+helium in the energy range from 46 GeV to 464 TeV are presented. Among the most distinctive features of the spectrum, a spectral hardening at 600 GeV has been observed, along with a softening at 29 TeV measured with a 6.6σ significance. Moreover, the detector features and the analysis approach allowed for the extension of the spectral measurement up to the sub-PeV region. Although the statistical significance is small due to the low number of events, the data suggest a new spectral hardening at about 150 TeV.
Submitted 14 August, 2024; v1 submitted 31 March, 2023;
originally announced April 2023.
-
Dense Hebbian neural networks: a replica symmetric picture of supervised learning
Authors:
Elena Agliari,
Linda Albanese,
Francesco Alemanno,
Andrea Alessandrelli,
Adriano Barra,
Fosca Giannotti,
Daniele Lotito,
Dino Pedreschi
Abstract:
We consider dense, associative neural-networks trained by a teacher (i.e., with supervision) and we investigate their computational capabilities analytically, via statistical-mechanics of spin glasses, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of the control parameters such as quality and quantity of the training dataset, network storage and noise, that is valid in the limit of large network size and structureless datasets: these networks may work in an ultra-storage regime (where they can handle a huge amount of patterns, if compared with shallow neural networks) or in an ultra-detection regime (where they can perform pattern recognition at prohibitive signal-to-noise ratios, if compared with shallow neural networks). Guided by the random theory as a reference framework, we also test numerically the learning, storing and retrieval capabilities shown by these networks on structured datasets such as MNIST and Fashion-MNIST. As technical remarks, on the analytic side, we implement large deviations and stability analysis within Guerra's interpolation to tackle the non-Gaussian distributions involved in the post-synaptic potentials while, on the computational side, we insert the Plefka approximation in the Monte Carlo scheme to speed up the evaluation of the synaptic tensors, overall obtaining a novel and broad approach to investigate supervised learning in neural networks, beyond the shallow limit, in general.
Submitted 2 July, 2023; v1 submitted 25 November, 2022;
originally announced December 2022.
-
Dense Hebbian neural networks: a replica symmetric picture of unsupervised learning
Authors:
Elena Agliari,
Linda Albanese,
Francesco Alemanno,
Andrea Alessandrelli,
Adriano Barra,
Fosca Giannotti,
Daniele Lotito,
Dino Pedreschi
Abstract:
We consider dense, associative neural-networks trained with no supervision and we investigate their computational capabilities analytically, via a statistical-mechanics approach, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of the control parameters such as the quality and quantity of the training dataset and the network storage, valid in the limit of large network size and structureless datasets. Moreover, we establish a bridge between macroscopic observables standardly used in statistical mechanics and loss functions typically used in machine learning. As technical remarks, on the analytic side, we implement large deviations and stability analysis within Guerra's interpolation to tackle the non-Gaussian distributions involved in the post-synaptic potentials while, on the computational side, we insert the Plefka approximation in the Monte Carlo scheme to speed up the evaluation of the synaptic tensors, overall obtaining a novel and broad approach to investigate neural networks in general.
Submitted 2 July, 2023; v1 submitted 25 November, 2022;
originally announced November 2022.
-
Latest results from the DAMPE space mission
Authors:
Francesca Alemanno
Abstract:
The DArk Matter Particle Explorer (DAMPE) is a space-based particle detector launched on December 17th, 2015 from the Jiuquan Satellite Launch Center (China). The main goals of the DAMPE mission are the study of galactic cosmic rays (CR), the electron-positron energy spectrum, gamma-ray astronomy, and indirect dark matter search. Among its sub-detectors, the deep calorimeter enables DAMPE to measure electron and gamma-ray spectra up to 10 TeV, and CR nuclei spectra up to hundreds of TeV, with unprecedented energy resolution. This high-energy region is important in order to search for electron-positron sources, for dark matter signatures in space, and to clarify CR acceleration and propagation mechanisms inside our galaxy. A general overview of the DAMPE experiment will be presented in this work, along with its main results and ongoing activities.
Submitted 13 September, 2022;
originally announced September 2022.
-
Search for relativistic fractionally charged particles in space
Authors:
DAMPE Collaboration,
F. Alemanno,
C. Altomare,
Q. An,
P. Azzarello,
F. C. T. Barbato,
P. Bernardini,
X. J. Bi,
M. S. Cai,
E. Casilli,
E. Catanzani,
J. Chang,
D. Y. Chen,
J. L. Chen,
Z. F. Chen,
M. Y. Cui,
T. S. Cui,
Y. X. Cui,
H. T. Dai,
A. De-Benedittis,
I. De Mitri,
F. de Palma,
M. Deliyergiyev,
A. Di Giovanni,
M. Di Santo
, et al. (126 additional authors not shown)
Abstract:
More than a century after the oil drop experiment was performed, the possible existence of fractionally charged particles (FCPs) remains unsettled. The search for FCPs is crucial for some extensions of the Standard Model in particle physics. Most of the previously conducted searches for FCPs in cosmic rays were based on experiments underground or at high altitudes. However, there have been few searches for FCPs in cosmic rays carried out in orbit other than AMS-01 flown by a space shuttle and BESS by a balloon at the top of the atmosphere. In this study, we conduct an FCP search in space based on on-orbit data obtained using the DArk Matter Particle Explorer (DAMPE) satellite over a period of five years. Unlike underground experiments, which require an FCP energy of the order of hundreds of GeV, our FCP search starts at only a few GeV. An upper limit of $6.2\times 10^{-10}~\mathrm{cm^{-2}\,sr^{-1}\,s^{-1}}$ is obtained for the flux. Our results demonstrate that DAMPE exhibits higher sensitivity than experiments of similar types by three orders of magnitude, which more stringently restricts the conditions for the existence of FCPs in primary cosmic rays.
Submitted 9 September, 2022;
originally announced September 2022.
-
Recurrent neural networks that generalize from examples and optimize by dreaming
Authors:
Miriam Aquaro,
Francesco Alemanno,
Ido Kanter,
Fabrizio Durante,
Elena Agliari,
Adriano Barra
Abstract:
The gap between the huge volumes of data needed to train artificial neural networks and the relatively small amount of data needed by their biological counterparts is a central puzzle in machine learning. Here, inspired by biological information-processing, we introduce a generalized Hopfield network where pairwise couplings between neurons are built according to Hebb's prescription for on-line learning and also allow for (suitably stylized) off-line sleeping mechanisms. Moreover, in order to retain a learning framework, here the patterns are not assumed to be available; instead, we let the network experience solely a dataset made of a sample of noisy examples for each pattern. We analyze the model by statistical-mechanics tools and we obtain a quantitative picture of its capabilities as functions of its control parameters: the resulting network is an associative memory for pattern recognition that learns from examples on-line, generalizes and optimizes its storage capacity by off-line sleeping. Remarkably, the sleeping mechanisms always significantly reduce (up to $\approx 90\%$) the dataset size required to correctly generalize; further, there are memory loads that are prohibitive to Hebbian networks without sleeping (no matter the size and quality of the provided examples), but that are easily handled by the present "rested" neural networks.
Submitted 17 April, 2022;
originally announced April 2022.
-
Supervised Hebbian Learning
Authors:
Francesco Alemanno,
Miriam Aquaro,
Ido Kanter,
Adriano Barra,
Elena Agliari
Abstract:
In the neural network literature, Hebbian learning traditionally refers to the procedure by which the Hopfield model and its generalizations store archetypes (i.e., definite patterns that are experienced just once to form the synaptic matrix). However, the term "Learning" in Machine Learning refers to the ability of the machine to extract features from the supplied dataset (e.g., made of blurred examples of these archetypes), in order to make its own representation of the unavailable archetypes. Here, given a sample of examples, we define a supervised learning protocol by which the Hopfield network can infer the archetypes, and we detect the correct control parameters (including size and quality of the dataset) to depict a phase diagram for the system performance. We also prove that, for structureless datasets, the Hopfield model equipped with this supervised learning rule is equivalent to a restricted Boltzmann machine and this suggests an optimal and interpretable training routine. Finally, this approach is generalized to structured datasets: we highlight a quasi-ultrametric organization (reminiscent of replica-symmetry-breaking) in the analyzed datasets and, consequently, we introduce an additional "replica hidden layer" for its (partial) disentanglement, which is shown to improve MNIST classification from 75% to 95%, and to offer a new perspective on deep architectures.
Submitted 7 September, 2022; v1 submitted 2 March, 2022;
originally announced March 2022.
-
Search for gamma-ray spectral lines with the DArk Matter Particle Explorer
Authors:
Francesca Alemanno,
Qi An,
Philipp Azzarello,
Felicia Carla Tiziana Barbato,
Paolo Bernardini,
Xiao-Jun Bi,
Ming-Sheng Cai,
Elisabetta Casilli,
Enrico Catanzani,
Jin Chang,
Deng-Yi Chen,
Jun-Ling Chen,
Zhan-Fang Chen,
Ming-Yang Cui,
Tian-Shu Cui,
Yu-Xing Cui,
Hao-Ting Dai,
Antonio De Benedittis,
Ivan De Mitri,
Francesco de Palma,
Maksym Deliyergiyev,
Margherita Di Santo,
Qi Ding,
Tie-Kuang Dong,
Zhen-Xing Dong
, et al. (121 additional authors not shown)
Abstract:
The DArk Matter Particle Explorer (DAMPE) is well suited for searching for monochromatic and sharp $γ$-ray structures in the GeV$-$TeV range thanks to its unprecedented high energy resolution. In this work, we search for $γ$-ray line structures using five years of DAMPE data. To improve the sensitivity, we develop two types of dedicated data sets (including the BgoOnly data, used here for the first time in the data analysis of calorimeter-based gamma-ray observatories) and adopt the signal-to-noise ratio optimized regions of interest (ROIs) for different DM density profiles. No line signals or candidates are found between 10 and 300 GeV in the Galaxy. The constraints on the velocity-averaged cross section for $χχ\to γγ$ and the decay lifetime for $χ\to γν$, both at 95% confidence level, have been calculated and the systematic uncertainties have been taken into account. Compared to the previous Fermi-LAT results, though DAMPE has an acceptance smaller by a factor of $\sim 10$, similar constraints on the DM parameters are achieved and below 100 GeV the lower limits on the decay lifetime are even stronger by a factor of a few. Our results demonstrate the potential of high-energy-resolution observations for dark matter detection.
Submitted 6 December, 2022; v1 submitted 16 December, 2021;
originally announced December 2021.
-
Replica symmetry breaking in dense neural networks
Authors:
Linda Albanese,
Francesco Alemanno,
Andrea Alessandrelli,
Adriano Barra
Abstract:
Understanding the glassy nature of neural networks is pivotal both for theoretical and computational advances in Machine Learning and Theoretical Artificial Intelligence. Keeping the focus on dense associative Hebbian neural networks, the purpose of this paper is two-fold: first we develop rigorous mathematical approaches to address properly a statistical mechanical picture of the phenomenon of {\em replica symmetry breaking} (RSB) in these networks, then -- deepening the results obtained via these routes -- we aim to inspect the {\em glassiness} that they hide. In particular, regarding the methodology, we provide two techniques: the former is an adaptation of the transport PDE to the case, while the latter is an extension of Guerra's interpolation breakthrough. Beyond coherence among the results, both in the replica symmetric and in the one-step replica symmetry breaking level of description, we prove Gardner's picture and we identify the maximal storage capacity by a ground-state analysis in the Baldi-Venkatesh high-storage regime.
In the second part of the paper we investigate the glassy structure of these networks: in contrast with the replica symmetric scenario (RS), RSB actually stabilizes the spin-glass phase. We report huge differences w.r.t. the standard pairwise Hopfield limit: in particular, it is known that it is possible to express the free energy of the Hopfield neural network as a linear combination of the free energies of a hard spin glass (i.e. the Sherrington-Kirkpatrick model) and a soft spin glass (the Gaussian or "spherical" model). This is no longer true when interactions are more than pairwise (whatever the level of description, RS or RSB): for dense networks solely the free energy of the hard spin glass survives, proving a huge diversity in the underlying glassiness of associative neural networks.
Submitted 25 November, 2021;
originally announced November 2021.
-
Observations of Forbush Decreases of cosmic ray electrons and positrons with the Dark Matter Particle Explorer
Authors:
Francesca Alemanno,
Qi An,
Philipp Azzarello,
Felicia Carla Tiziana Barbato,
Paolo Bernardini,
XiaoJun Bi,
MingSheng Cai,
Elisabetta Casilli,
Enrico Catanzani,
Jin Chang,
DengYi Chen,
JunLing Chen,
ZhanFang Chen,
MingYang Cui,
TianShu Cui,
YuXing Cui,
HaoTing Dai,
Antonio De Benedittis,
Ivan De Mitri,
Francesco de Palma,
Maksym Deliyergiyev,
Margherita Di Santo,
Qi Ding,
TieKuang Dong,
ZhenXing Dong
, et al. (124 additional authors not shown)
Abstract:
The Forbush Decrease (FD) represents the rapid decrease of the intensities of charged particles accompanying coronal mass ejections (CMEs) or high-speed streams from coronal holes. It has been mainly explored with ground-based neutron monitor networks, which indirectly measure the integrated intensities of all species of cosmic rays by counting secondary neutrons produced from interactions between atmospheric atoms and cosmic rays. The space-based experiments can resolve the species of particles, but the energy ranges are limited by the relatively small acceptances except for the most abundant particles like protons and helium. Therefore, the FD of cosmic ray electrons and positrons has only been investigated by the PAMELA experiment in the low energy range ($<5$ GeV) with limited statistics. In this paper, we study the FD event that occurred in September 2017, with the electron and positron data recorded by the Dark Matter Particle Explorer. The evolution of the FDs from 2 GeV to 20 GeV with a time resolution of 6 hours is given. We observe two solar energetic particle events in the time profile of the intensity of cosmic rays; the earlier, weaker one has not been shown in the neutron monitor data. Furthermore, both the amplitude and recovery time of fluxes of electrons and positrons show clear energy dependence, which is important in probing the disturbances of the interplanetary environment by the coronal mass ejections.
Submitted 30 September, 2021;
originally announced October 2021.
-
The emergence of a concept in shallow neural networks
Authors:
Elena Agliari,
Francesco Alemanno,
Adriano Barra,
Giordano De Marzo
Abstract:
We consider restricted Boltzmann machines (RBMs) trained over an unstructured dataset made of blurred copies of definite but unavailable ``archetypes'' and we show that there exists a critical sample size beyond which the RBM can learn archetypes, namely the machine can successfully act as a generative model or as a classifier, according to the operational routine. In general, assessing a critical sample size (possibly in relation to the quality of the dataset) is still an open problem in machine learning. Here, restricting to the random theory, where shallow networks suffice and the grandmother-cell scenario is correct, we leverage the formal equivalence between RBMs and Hopfield networks to obtain a phase diagram for both neural architectures, which highlights regions in the space of the control parameters (i.e., number of archetypes, number of neurons, size and quality of the training set) where learning can be accomplished. Our investigations are led by analytical methods based on the statistical mechanics of disordered systems, and the results are further corroborated by extensive Monte Carlo simulations.
Submitted 1 September, 2021;
originally announced September 2021.
-
Pattern recognition in Deep Boltzmann machines
Authors:
Elena Agliari,
Linda Albanese,
Francesco Alemanno,
Alberto Fachechi
Abstract:
We consider a multi-layer Sherrington-Kirkpatrick spin-glass as a model for deep restricted Boltzmann machines and we solve for its quenched free energy in the thermodynamic limit, allowing for a first step of replica symmetry breaking. This result is obtained rigorously by exploiting interpolating techniques, and it recovers the expression already known for the replica-symmetric case. Further, we drop the restriction constraint by introducing intra-layer connections among spins and we show that the resulting system can be mapped into a modular Hopfield network, which is also addressed rigorously via interpolating techniques up to the first step of replica symmetry breaking.
Submitted 16 June, 2021;
originally announced June 2021.
-
Measurement of the cosmic ray helium energy spectrum from 70 GeV to 80 TeV with the DAMPE space mission
Authors:
F. Alemanno,
Q. An,
P. Azzarello,
F. C. T. Barbato,
P. Bernardini,
X. J. Bi,
M. S. Cai,
E. Catanzani,
J. Chang,
D. Y. Chen,
J. L. Chen,
Z. F. Chen,
M. Y. Cui,
T. S. Cui,
Y. X. Cui,
H. T. Dai,
A. D'Amone,
A. De Benedittis,
I. De Mitri,
F. de Palma,
M. Deliyergiyev,
M. Di Santo,
T. K. Dong,
Z. X. Dong,
G. Donvito
, et al. (120 additional authors not shown)
Abstract:
This work reports the measurement of the energy spectrum of cosmic-ray helium nuclei from 70 GeV to 80 TeV using 4.5 years of data recorded by the DArk Matter Particle Explorer (DAMPE). A hardening of the spectrum is observed at an energy of about 1.3 TeV, similar to previous observations. In addition, a spectral softening at about 34 TeV is revealed for the first time with large statistics and well-controlled systematic uncertainties, with an overall significance of $4.3σ$. The DAMPE spectral measurements of both cosmic protons and helium nuclei suggest a charge-dependent softening energy, although with current uncertainties a dependence on the number of nucleons cannot be ruled out.
Submitted 21 May, 2021; v1 submitted 19 May, 2021;
originally announced May 2021.
-
A neural network classifier for electron identification on the DAMPE experiment
Authors:
David Droz,
Andrii Tykhonov,
Xin Wu,
Francesca Alemanno,
Giovanni Ambrosi,
Enrico Catanzani,
Margherita Di Santo,
Dimitrios Kyratzis,
Stephan Zimmer
Abstract:
The Dark Matter Particle Explorer (DAMPE) is a space-borne particle detector and cosmic ray observatory in operation since 2015, designed to probe electrons and gamma rays from a few GeV to 10 TeV, as well as cosmic protons and nuclei up to 100 TeV. Among its main scientific objectives is the precise measurement of the cosmic electron+positron flux, which requires a powerful particle identification method because of the very large proton background in orbit. In the past decade, the field of machine learning has provided the needed tools. This paper presents a neural-network-based approach to cosmic electron identification and proton rejection and showcases its performance on simulated Monte Carlo data. The neural network reaches a significantly lower background than the classical, cut-based method for the same detection efficiency, especially at the highest energies. A good match between simulations and real data completes the picture.
Submitted 11 May, 2021; v1 submitted 10 February, 2021;
originally announced February 2021.
-
Comparison of proton shower developments in the BGO calorimeter of the Dark Matter Particle Explorer between GEANT4 and FLUKA simulations
Authors:
Wei Jiang,
Chuan Yue,
Ming-Yang Cui,
Xiang Li,
Qiang Yuan,
Francesca Alemanno,
Paolo Bernardini,
Giovanni Catanzani,
Zhan-Fang Chen,
Ivan De Mitri,
Tie-Kuang Dong,
Giacinto Donvito,
David Francois Droz,
Piergiorgio Fusco,
Fabio Gargano,
Dong-Ya Guo,
Dimitrios Kyratzis,
Shi-Jun Lei,
Yang Liu,
Francesco Loparco,
Peng-Xiong Ma,
Giovanni Marsella,
Mario Nicola Mazziotta,
Xu Pan,
Wen-Xi Peng
, et al. (8 additional authors not shown)
Abstract:
The DArk Matter Particle Explorer (DAMPE) is a satellite-borne detector for high-energy cosmic rays and $γ$-rays. To fully understand the detector performance and obtain reliable physical results, extensive simulations of the detector are necessary. The simulations are particularly important for the data analysis of cosmic ray nuclei, which relies closely on the hadronic and nuclear interactions of particles in the detector material. Widely adopted simulation software includes GEANT4 and FLUKA, both of which have been implemented in the DAMPE simulation tool. Here we describe the simulation tool of DAMPE and compare the proton shower properties in the calorimeter obtained with the two packages. Such a comparison gives an estimate of the most significant uncertainties of our proton spectral analysis.
Submitted 27 September, 2020;
originally announced September 2020.
-
Correction Method for the Readout Saturation of the DAMPE Calorimeter
Authors:
Chuan Yue,
Peng-Xiong Ma,
Margherita Di Santo,
Li-Bo Wu,
Francesca Alemanno,
Paolo Bernardini,
Dimitrios Kyratzis,
Guan-Wen Yuan,
Qiang Yuan,
Yun-Long Zhang
Abstract:
The DArk Matter Particle Explorer (DAMPE) is a space-borne high-energy cosmic-ray and $γ$-ray detector which has operated smoothly since its launch on December 17, 2015. The bismuth germanium oxide (BGO) calorimeter is one of the key sub-detectors of DAMPE, used for energy measurement and electron-proton identification. For events with a total energy deposit above tens of TeV, the readouts of the PMTs coupled to the BGO crystals become saturated, which leads to an underestimation of the measured energy. Based on detailed simulations, we develop a correction method for the saturation effect according to the shower development topologies and the energies measured by neighbouring BGO crystals. Verification with simulated and on-orbit events shows that this method reconstructs the energy deposit in the saturated BGO crystal well.
Submitted 20 September, 2020;
originally announced September 2020.
-
Interpolating between boolean and extremely high noisy patterns through Minimal Dense Associative Memories
Authors:
Francesco Alemanno,
Martino Centonze,
Alberto Fachechi
Abstract:
Recently, Hopfield and Krotov introduced the concept of {\em dense associative memories} [DAM] (close to spin-glasses with $P$-wise interactions, in the jargon of the statistical mechanics of disordered systems): they proved a number of remarkable features these networks share and suggested their use to (partially) explain the success of the new generation of Artificial Intelligence. Thanks to a remarkable ante-litteram analysis by Baldi \& Venkatesh, it is known that, among these properties, these networks can handle a maximal number of stored patterns $K$ scaling as $K \sim N^{P-1}$. In this paper, having introduced a {\em minimal dense associative network} as one of the most elementary cost functions falling in this class of DAM, we sacrifice this high-load regime (namely, we force the storage of {\em solely} a linear number of patterns, i.e. $K = αN$ with $α>0$) to prove that, in this regime, these networks can correctly perform pattern recognition even if the pattern signal is $O(1)$ and is embedded in a sea of noise $O(\sqrt{N})$, also in the large-$N$ limit. To prove this statement, by extremizing the quenched free energy of the model over its natural order parameters (the various magnetizations and overlaps), we derive its phase diagram at the replica-symmetric level of description and in the thermodynamic limit; as a sideline, we stress that, to achieve this task, aiming at cross-fertilization among disciplines, we follow two main routes in the statistical mechanics of spin glasses, namely the replica trick and the interpolation technique. Both approaches reach the same conclusion: there is a non-empty region in the noise-$T$ vs load-$α$ phase diagram plane where these networks can actually work in this challenging regime; in particular, we obtain a quite high critical (linear) load in the (fast) noiseless case, namely $\lim_{β\to \infty}α_c(β)=0.65$.
Submitted 2 December, 2019;
originally announced December 2019.
-
Generalized Guerra's interpolation schemes for dense associative neural networks
Authors:
Elena Agliari,
Francesco Alemanno,
Adriano Barra,
Alberto Fachechi
Abstract:
In this work we develop analytical techniques to investigate a broad class of associative neural networks set in the high-storage regime. These techniques translate the original statistical-mechanical problem into an analytical-mechanical one, which implies solving a set of partial differential equations rather than tackling the canonical probabilistic route. We test the method on the classical Hopfield model - where the cost function includes only two-body interactions (i.e., quadratic terms) - and on the "relativistic" Hopfield model - where the (expansion of the) cost function includes p-body (i.e., of degree p) contributions. Under the replica symmetric assumption, we paint the phase diagrams of these models by obtaining the explicit expression of their free energy as a function of the model parameters (i.e., noise level and memory storage). Further, since for non-pairwise models ergodicity breaking is not necessarily a critical phenomenon, we develop a fluctuation analysis and find that criticality is preserved in the relativistic model.
Submitted 16 April, 2020; v1 submitted 28 November, 2019;
originally announced November 2019.
-
Neural networks with redundant representation: detecting the undetectable
Authors:
Elena Agliari,
Francesco Alemanno,
Adriano Barra,
Martino Centonze,
Alberto Fachechi
Abstract:
We consider a three-layer Sejnowski machine and show that features learnt via contrastive divergence have a dual representation as patterns in a dense associative memory of order P=4. The latter is known to be able to Hebbian-store a number of patterns scaling as N^{P-1}, where N denotes the number of constituting binary neurons interacting P-wisely. We also prove that, by keeping the dense associative network far from the saturation regime (namely, allowing for a number of patterns scaling only linearly with N, while P>2), such a system is able to perform pattern recognition far below the standard signal-to-noise threshold. In particular, a network with P=4 is able to retrieve information whose intensity is O(1) even in the presence of a noise O(\sqrt{N}) in the large N limit. This striking skill stems from a redundant representation of patterns -- which is affordable given the (relatively) low-load information storage -- and it contributes to explaining the impressive abilities in pattern recognition exhibited by new-generation neural networks. The whole theory is developed rigorously, at the replica symmetric level of approximation, and corroborated by signal-to-noise analysis and Monte Carlo simulations.
Submitted 28 November, 2019;
originally announced November 2019.
-
Dreaming neural networks: rigorous results
Authors:
Elena Agliari,
Francesco Alemanno,
Adriano Barra,
Alberto Fachechi
Abstract:
Recently a daily routine for associative neural networks has been proposed: the network Hebbian-learns during the awake state (thus behaving as a standard Hopfield model), then, during its sleep state, optimizing information storage, it consolidates pure patterns and removes spurious ones: this forces the synaptic matrix to collapse to the projector one (ultimately approaching the Kanter-Sompolinsky model). This procedure keeps the learning Hebbian-based (a biological must) but, by taking advantage of a (properly stylized) sleep phase, still reaches the maximal critical capacity (for symmetric interactions). So far this emerging picture (as well as the bulk of papers on unlearning techniques) was supported solely by mathematically-challenging routes, mainly replica-trick analysis and numerical simulations: here we rely extensively on Guerra's interpolation techniques developed for neural networks and, in particular, we extend the generalized stochastic stability approach to this case. Confining our description within the replica symmetric approximation (where the previous ones lie), the picture painted regarding this generalization (and the previously existing variations on the theme) is here entirely confirmed. Further, still relying on Guerra's schemes, we develop a systematic fluctuation analysis to check where ergodicity is broken (an analysis entirely absent in previous investigations). We find that, as long as the network is awake, ergodicity is bounded by the Amit-Gutfreund-Sompolinsky critical line (as it should be), but, as the network sleeps, sleeping destroys spin glass states by extending both the retrieval and the ergodic regions: after an entire sleeping session the only surviving regions are the retrieval and ergodic ones, and this allows the network to achieve the perfect retrieval regime (the number of storable patterns equals the number of neurons in the network).
Submitted 21 December, 2018;
originally announced December 2018.
-
A novel derivation of the Marchenko-Pastur law through analog bipartite spin-glasses
Authors:
Elena Agliari,
Francesco Alemanno,
Adriano Barra,
Alberto Fachechi
Abstract:
In this work we consider the {\em analog bipartite spin-glass} (or {\em real-valued restricted Boltzmann machine} in neural network jargon), whose variables (those quenched as well as those dynamical) share standard Gaussian distributions. First, via Guerra's interpolation technique, we express its quenched free energy in terms of the natural order parameters of the theory (namely the self- and two-replica overlaps); then we re-obtain the same result by using the replica trick: a mandatory tribute, given the special occasion. Next, we show that the quenched free energy of this model is the functional generator of the moments of the correlation matrix among the weights connecting the two layers of the spin-glass (i.e., the Wishart matrix in random matrix theory or the Hebbian coupling in neural networks): as weights are quenched stochastic variables, this serves as a novel tool to inspect random matrices. In particular, we find that the Stieltjes transform of the spectral density of the correlation matrix is determined by the (replica-symmetric) quenched free energy of the bipartite spin-glass model. In this setup, we re-obtain the Marchenko-Pastur law in a very simple way.
Submitted 20 November, 2018;
originally announced November 2018.