doi 10.1126/sciadv.adp2426

The Ni isotopic composition of Ryugu reveals a common accretion region for carbonaceous chondrites

Authors: Fridolin Spitzer, Thorsten Kleine, Christoph Burkhardt, Timo Hopp, Tetsuya Yokoyama, Yoshinari Abe, Jérôme Aléon, Conel M. O'D. Alexander, Sachiko Amari, Yuri Amelin, Ken-ichi Bajo, Martin Bizzarro, Audrey Bouvier, Richard W. Carlson, Marc Chaussidon, Byeon-Gak Choi, Nicolas Dauphas, Andrew M. Davis, Tommaso Di Rocco, Wataru Fujiya, Ryota Fukai, Ikshu Gautam, Makiko K. Haba, Yuki Hibiya, Hiroshi Hidaka, et al. (66 additional authors not shown)

Abstract: The isotopic compositions of samples returned from Cb-type asteroid Ryugu and Ivuna-type (CI) chondrites are distinct from other carbonaceous chondrites, which has led to the suggestion that Ryugu and CI chondrites formed in a different region of the accretion disk, possibly around the orbits of Uranus and Neptune. We show that, like for Fe, Ryugu and CI chondrites also have indistinguishable Ni isotope anomalies, which differ from those of other carbonaceous chondrites. We propose that this unique Fe and Ni isotopic composition reflects different accretion efficiencies of small FeNi metal grains among the carbonaceous chondrite parent bodies. The CI chondrites incorporated these grains more efficiently, possibly because they formed at the end of the disk's lifetime, when planetesimal formation was also triggered by photoevaporation of the disk. Isotopic variations among carbonaceous chondrites may thus reflect fractionation of distinct dust components from a common reservoir, implying CI chondrites and Ryugu may have formed in the same region of the accretion disk as other carbonaceous chondrites.

Submitted 5 October, 2024; originally announced October 2024.

Comments: Published open access in Science Advances

Journal ref: Science Advances 10, 39, eadp2426 (2024)

arXiv:2307.12508 [pdf, ps, other]

Information Geometry of Wasserstein Statistics on Shapes and Affine Deformations

Authors: Shun-ichi Amari, Takeru Matsuda

Abstract: Information geometry and Wasserstein geometry are two main structures introduced in a manifold of probability distributions, and they capture its different characteristics. We study characteristics of Wasserstein geometry in the framework of Li and Zhao (2023) for the affine deformation statistical model, which is a multi-dimensional generalization of the location-scale model. We compare merits and demerits of estimators based on information geometry and Wasserstein geometry. The shape of a probability distribution and its affine deformation are separated in the Wasserstein geometry, showing its robustness against the waveform perturbation in exchange for the loss in Fisher efficiency. We show that the Wasserstein estimator is the moment estimator in the case of the elliptically symmetric affine deformation model. It coincides with the information-geometrical estimator (maximum-likelihood estimator) when the waveform is Gaussian. The role of the Wasserstein efficiency is elucidated in terms of robustness against waveform change.

Submitted 25 June, 2024; v1 submitted 23 July, 2023; originally announced July 2023.

arXiv:2305.08869 [pdf, other]

Singular Azimuthally Propagating Electromagnetic Fields

Authors: Mustafa Bakr, Smain Amari

Abstract: We study the characteristics of azimuthally propagating electromagnetic fields in a cylindrical cavity. It is found that under certain conditions, the transverse components of the electromagnetic field are singular at the center of the cavity but the corresponding electromagnetic field remains of finite energy. The solutions are arranged in branches each of which starts from a root of $J_1(x)=0$ for the TE modes and a root of $J_0(x)=0$ for the TM modes. The lowest (dominant) branch starts from a resonance that corresponds to the solution $x=0$ of $J_1(x)=0$. Its energy has a logarithmic singularity in a lossless structure. The singular solutions with finite energy can be observed experimentally by forcing them to resonate in a cavity with inserted metallic wedges. They can also be excited by transient sources. The singular electromagnetic field of these waves is strong enough to ionize the air. Whether these transient singular fields can initiate lightning, a phenomenon that is still not understood, is a very interesting question. It is also worth investigating whether the lowest resonance is excited in violently energetic cosmological phenomena such as cosmic jets.

Submitted 11 May, 2023; originally announced May 2023.

arXiv:2208.07976 [pdf]

doi 10.3847/2041-8213/ac83bd

Presolar stardust in asteroid Ryugu

Authors: Jens Barosch, Larry R. Nittler, Jianhua Wang, Conel M. O'D. Alexander, Bradley T. De Gregorio, Cécile Engrand, Yoko Kebukawa, Kazuhide Nagashima, Rhonda M. Stroud, Hikaru Yabuta, Yoshinari Abe, Jérôme Aléon, Sachiko Amari, Yuri Amelin, Ken-ichi Bajo, Laure Bejach, Martin Bizzarro, Lydie Bonal, Audrey Bouvier, Richard W. Carlson, Marc Chaussidon, Byeon-Gak Choi, George D. Cody, Emmanuel Dartois, Nicolas Dauphas, et al. (99 additional authors not shown)

Abstract: We have conducted a NanoSIMS-based search for presolar material in samples recently returned from C-type asteroid Ryugu as part of JAXA's Hayabusa2 mission. We report the detection of all major presolar grain types with O- and C-anomalous isotopic compositions typically identified in carbonaceous chondrite meteorites: 1 silicate, 1 oxide, 1 O-anomalous supernova grain of ambiguous phase, 38 SiC, and 16 carbonaceous grains. At least two of the carbonaceous grains are presolar graphites, whereas several grains with moderate C isotopic anomalies are probably organics. The presolar silicate was located in a clast with a less altered lithology than the typical extensively aqueously altered Ryugu matrix. The matrix-normalized presolar grain abundances in Ryugu are 4.8$^{+4.7}_{-2.6}$ ppm for O-anomalous grains, 25$^{+6}_{-5}$ ppm for SiC grains and 11$^{+5}_{-3}$ ppm for carbonaceous grains. Ryugu is isotopically and petrologically similar to carbonaceous Ivuna-type (CI) chondrites. To compare the in situ presolar grain abundances of Ryugu with CI chondrites, we also mapped Ivuna and Orgueil samples and found a total of SiC grains and 6 carbonaceous grains. No O-anomalous grains were detected. The matrix-normalized presolar grain abundances in the CI chondrites are similar to those in Ryugu: 23$^{+7}_{-6}$ ppm SiC and 9.0$^{+5.3}_{-4.6}$ ppm carbonaceous grains. Thus, our results provide further evidence in support of the Ryugu-CI connection. They also reveal intriguing hints of small-scale heterogeneities in the Ryugu samples, such as locally distinct degrees of alteration that allowed the preservation of delicate presolar material.

Submitted 16 August, 2022; originally announced August 2022.

Comments: 12 pages, 3 figures, 2 tables. Published in ApJL

Journal ref: 2022, The Astrophysical Journal Letters, 935, L3 (12pp)

arXiv:2202.05254 [pdf, other]

Deep Learning in Random Neural Fields: Numerical Experiments via Neural Tangent Kernel

Authors: Kaito Watanabe, Kotaro Sakamoto, Ryo Karakida, Sho Sonoda, Shun-ichi Amari

Abstract: A biological neural network in the cortex forms a neural field. Neurons in the field have their own receptive fields, and connection weights between two neurons are random but highly correlated when they are in close proximity in receptive fields. In this paper, we investigate such neural fields in a multilayer architecture to investigate the supervised learning of the fields. We empirically compare the performances of our field model with those of randomly connected deep networks. The behavior of a randomly connected network is investigated on the basis of the key idea of the neural tangent kernel regime, a recent development in the machine learning theory of over-parameterized networks; for most randomly connected neural networks, it is shown that global minima always exist in their small neighborhoods. We numerically show that this claim also holds for our neural fields. In more detail, our model has two structures: i) each neuron in a field has a continuously distributed receptive field, and ii) the initial connection weights are random but not independent, having correlations when the positions of neurons are close in each layer. We show that such a multilayer neural field is more robust than conventional models when input patterns are deformed by noise disturbances. Moreover, its generalization ability can be slightly superior to that of conventional models.

Submitted 6 January, 2023; v1 submitted 10 February, 2022; originally announced February 2022.

arXiv:2007.11401 [pdf, ps, other]

Wasserstein Statistics in One-dimensional Location-Scale Model

Authors: Shun-ichi Amari, Takeru Matsuda

Abstract: Wasserstein geometry and information geometry are two important structures to be introduced in a manifold of probability distributions. Wasserstein geometry is defined by using the transportation cost between two distributions, so it reflects the metric of the base manifold on which the distributions are defined. Information geometry is defined to be invariant under reversible transformations of the base space. Both have their own merits for applications. In particular, statistical inference is based upon information geometry, where the Fisher metric plays a fundamental role, whereas Wasserstein geometry is useful in computer vision and AI applications. In this study, we analyze statistical inference based on the Wasserstein geometry in the case that the base space is one-dimensional. By using the location-scale model, we further derive the W-estimator that explicitly minimizes the transportation cost from the empirical distribution to a statistical model and study its asymptotic behaviors. We show that the W-estimator is consistent and explicitly give its asymptotic distribution by using the functional delta method. The W-estimator is Fisher efficient in the Gaussian case.

Submitted 28 December, 2020; v1 submitted 21 July, 2020; originally announced July 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2003.05479
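
Illustrative note (not from the paper): the abstract above describes an estimator that minimizes the transportation cost from the empirical distribution to a one-dimensional location-scale model. A minimal sketch of that idea, under stated assumptions, is given below. It assumes a standard Gaussian shape and replaces the integral over quantile functions with a midpoint-rule grid, so that minimizing the squared 2-Wasserstein cost becomes a least-squares fit of the sorted samples against standard quantiles. The function name `w_estimator` and the discretization are hypothetical choices for illustration, not the authors' construction.

```python
from statistics import NormalDist, fmean


def w_estimator(samples):
    """Fit (mu, sigma) by least squares between sorted samples and
    standard-normal quantiles (a discretized squared W2 cost).

    Assumptions: Gaussian shape, at least two distinct samples,
    midpoint quantile levels (i + 0.5) / n.
    """
    xs = sorted(samples)
    n = len(xs)
    std = NormalDist()  # standard shape: mean 0, standard deviation 1
    zs = [std.inv_cdf((i + 0.5) / n) for i in range(n)]

    x_bar, z_bar = fmean(xs), fmean(zs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    szz = sum((z - z_bar) ** 2 for z in zs)

    sigma = sxz / szz            # slope of order statistics vs. quantiles
    mu = x_bar - sigma * z_bar   # intercept
    return mu, sigma


# Usage: data lying exactly on the quantile grid of N(2, 3^2).
grid = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
data = [2.0 + 3.0 * z for z in grid]
mu_hat, sigma_hat = w_estimator(data)
```

In this toy setting the location estimate reduces to (approximately) the sample mean, while the scale estimate is a weighted sum of order statistics.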

arXiv:2006.10732 [pdf, other]

When Does Preconditioning Help or Hurt Generalization?

Authors: Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu

Abstract: While second order optimizers such as natural gradient descent (NGD) often speed up optimization, their effect on generalization has been called into question. This work presents a more nuanced view on how the \textit{implicit bias} of first- and second-order methods affects the comparison of generalization properties. We provide an exact asymptotic bias-variance decomposition of the generalization error of overparameterized ridgeless regression under a general class of preconditioner $\boldsymbol{P}$, and consider the inverse population Fisher information matrix (used in NGD) as a particular example. We determine the optimal $\boldsymbol{P}$ for both the bias and variance, and find that the relative generalization performance of different optimizers depends on the label noise and the "shape" of the signal (true parameters): when the labels are noisy, the model is misspecified, or the signal is misaligned with the features, NGD can achieve lower risk; conversely, GD generalizes better than NGD under clean labels, a well-specified model, or aligned signal. Based on this analysis, we discuss several approaches to manage the bias-variance tradeoff, and the potential benefit of interpolating between GD and NGD. We then extend our analysis to regression in the reproducing kernel Hilbert space and demonstrate that preconditioned GD can decrease the population risk faster than GD. Lastly, we empirically compare the generalization error of first- and second-order optimizers in neural network experiments, and observe robust trends matching our theoretical analysis.

Submitted 8 December, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: 42 pages

arXiv:2003.05479 [pdf, ps, other]

Wasserstein statistics in 1D location-scale model

Authors: Shun-ichi Amari

Abstract: Wasserstein geometry and information geometry are two important structures introduced in a manifold of probability distributions. The former is defined by using the transportation cost between two distributions, so it reflects the metric structure of the base manifold on which distributions are defined. Information geometry is constructed based on the invariance criterion that the geometry is invariant under reversible transformations of the base space. Both have their own merits for applications. Statistical inference is constructed on information geometry, where the Fisher metric plays a fundamental role, whereas Wasserstein geometry is useful for applications to computer vision and AI. We propose statistical inference based on the Wasserstein geometry in the case that the base space is 1-dimensional. By using the location-scale model, we derive the $W$-estimator explicitly and study its asymptotic behaviors.

Submitted 5 March, 2020; originally announced March 2020.

Comments: 14 pages, 2 figures

arXiv:2002.10967 [pdf]

Evidence of presolar SiC in the Allende Curious Marie calcium aluminum rich inclusion

Authors: O. Pravdivtseva, F. L. Tissot, N. Dauphas, S. Amari

Abstract: Calcium aluminum rich inclusions (CAIs) are one of the first solids to have condensed in the solar nebula, while presolar grains formed in various evolved stellar environments. It is generally accepted that CAIs formed close to the Sun at temperatures above 1500 K, where presolar grains could not survive, and were then transported to other regions of the nebula where the accretion of planetesimals took place. In this context, a commonly held view is that presolar grains are found solely in the fine-grained rims surrounding chondrules and in the low-temperature fine-grained matrix that binds the various meteoritic components together. Here we demonstrate, based on noble gas isotopic signatures, that presolar SiC have been incorporated into fine-grained CAIs in the Allende carbonaceous chondrite at the time of their formation, and have survived parent body processing. This finding provides new clues on the conditions in the nascent solar system at the condensation of first solids.

Submitted 6 December, 2019; originally announced February 2020.

Comments: 4 figures

arXiv:2001.06931 [pdf, ps, other]

Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective

Authors: Shun-ichi Amari

Abstract: It is known that any target function is realized in a sufficiently small neighborhood of any randomly connected deep network, provided the width (the number of neurons in a layer) is sufficiently large. There are sophisticated theories and discussions concerning this striking fact, but rigorous theories are very complicated. We give an elementary geometrical proof by using a simple model for the purpose of elucidating its structure. We show that high-dimensional geometry plays a magical role: When we project a high-dimensional sphere of radius 1 to a low-dimensional subspace, the uniform distribution over the sphere reduces to a Gaussian distribution of negligibly small covariances.

Submitted 17 March, 2020; v1 submitted 19 January, 2020; originally announced January 2020.
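
Illustrative note (not from the paper): the projection statement in the abstract above can be checked numerically with a small Monte Carlo sketch. The code draws points uniformly from the unit sphere in `dim` dimensions (by normalizing standard Gaussian vectors, a standard sampling method) and estimates the second moment of a single coordinate, which for a uniform point on the unit sphere equals 1/dim. The function name and the sample sizes are hypothetical choices for illustration.

```python
import math
import random


def coordinate_second_moment(dim, samples=2000, seed=0):
    """Estimate E[x_1^2] for x uniform on the unit sphere in R^dim.

    Uniform points are obtained by normalizing i.i.d. standard
    Gaussian vectors. Because the coordinate has mean zero, this
    second moment is also its variance.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        total += (v[0] / norm) ** 2
    return total / samples


# Usage: the projected variance shrinks roughly like 1/dim.
var_low = coordinate_second_moment(10)
var_high = coordinate_second_moment(200)
```

As the dimension grows, the one-dimensional projection concentrates near zero, which is the "negligibly small covariances" effect the abstract refers to.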

arXiv:1910.05992 [pdf, other]

Pathological spectra of the Fisher information metric and its variants in deep neural networks

Authors: Ryo Karakida, Shotaro Akaho, Shun-ichi Amari

Abstract: The Fisher information matrix (FIM) plays an essential role in statistics and machine learning as a Riemannian metric tensor or a component of the Hessian matrix of loss functions. Focusing on the FIM and its variants in deep neural networks (DNNs), we reveal their characteristic scale dependence on the network width, depth and sample size when the network has random weights and is sufficiently wide. This study covers two widely-used FIMs for regression with linear output and for classification with softmax output. Both FIMs asymptotically show pathological eigenvalue spectra in the sense that a small number of eigenvalues become large outliers depending on the width or sample size while the others are much smaller. It implies that the local shape of the parameter space or loss landscape is very sharp in a few specific directions while almost flat in the other directions. In particular, the softmax output disperses the outliers and makes a tail of the eigenvalue density spread from the bulk. We also show that pathological spectra appear in other variants of FIMs: one is the neural tangent kernel; another is a metric for the input signal and feature space that arises from feedforward signal propagation. Thus, we provide a unified perspective on the FIM and its variants that will lead to more quantitative understanding of learning in large-scale DNNs.

Submitted 27 September, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

Comments: 23 pages, 7 figures; v2: minor improvements, Section 3.4 added

arXiv:1906.02926 [pdf, other]

The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Authors: Ryo Karakida, Shotaro Akaho, Shun-ichi Amari

Abstract: Normalization methods play an important role in enhancing the performance of deep learning while their theoretical understandings have been limited. To theoretically elucidate the effectiveness of normalization, we quantify the geometry of the parameter space determined by the Fisher information matrix (FIM), which also corresponds to the local shape of the loss landscape under certain conditions. We analyze deep neural networks with random initialization, which is known to suffer from a pathologically sharp shape of the landscape when the network becomes sufficiently wide. We reveal that batch normalization in the last layer contributes to drastically decreasing such pathological sharpness if the width and sample number satisfy a specific condition. In contrast, it is hard for batch normalization in the middle hidden layers to alleviate pathological sharpness in many settings. We also found that layer normalization cannot alleviate pathological sharpness either. Thus, we can conclude that batch normalization in the last layer significantly contributes to decreasing the sharpness induced by the FIM.

Submitted 28 October, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

Comments: To appear in NeurIPS 2019

arXiv:1810.09545 [pdf, ps, other]

doi 10.1103/PhysRevResearch.2.033048

Unified framework for the entropy production and the stochastic interaction based on information geometry

Authors: Sosuke Ito, Masafumi Oizumi, Shun-ichi Amari

Abstract: We show a relationship between the entropy production in stochastic thermodynamics and the stochastic interaction in the integrated information theory. To clarify this relationship, we newly introduce an information geometric interpretation of the entropy production for a total system and the partial entropy productions for subsystems. We show that the violation of the additivity of the entropy productions is related to the stochastic interaction. This framework is a thermodynamic foundation of the integrated information theory. We also show that our information geometric formalism leads to a novel expression of the entropy production related to an optimization problem minimizing the Kullback-Leibler divergence. We analytically illustrate this interpretation by using the spin model.

Submitted 6 April, 2020; v1 submitted 22 October, 2018; originally announced October 2018.

Comments: 13 pages, 4 figures

Report number: Phys. Rev. Research 2, 033048 (2020)

Journal ref: Phys. Rev. Research 2, 033048 (2020)

arXiv:1810.08278 [pdf, other]

Interpolating between Optimal Transport and MMD using Sinkhorn Divergences

Authors: Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, Gabriel Peyré

Abstract: Comparing probability distributions is a fundamental problem in data sciences. Simple norms and divergences such as the total variation and the relative entropy only compare densities in a point-wise manner and fail to capture the geometric nature of the problem. In sharp contrast, Maximum Mean Discrepancies (MMD) and Optimal Transport distances (OT) are two classes of distances between measures that take into account the geometry of the underlying space and metrize the convergence in law. This paper studies the Sinkhorn divergences, a family of geometric divergences that interpolates between MMD and OT. Relying on a new notion of geometric entropy, we provide theoretical guarantees for these divergences: positivity, convexity and metrization of the convergence in law. On the practical side, we detail a numerical scheme that enables the large scale application of these divergences for machine learning: on the GPU, gradients of the Sinkhorn loss can be computed for batches of a million samples.

Submitted 18 October, 2018; originally announced October 2018.

Comments: 15 pages, 5 figures

MSC Class: 62
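
Illustrative note (not from the paper): a toy, pure-Python sketch of a debiased Sinkhorn divergence between two weighted point clouds on the real line may help make the abstract above concrete. It assumes the squared-distance cost, strictly positive weights, a fixed regularization `eps`, and a fixed number of log-domain Sinkhorn iterations; the entropic cost is read off from the dual potentials, and the divergence subtracts the two self-terms. All function names and parameter values are hypothetical, and this is in no way the large-scale GPU scheme the authors describe.

```python
import math


def _softmin(eps, weights, values):
    # Stable evaluation of -eps * log(sum_j w_j * exp(-values_j / eps)).
    m = min(values)
    s = sum(w * math.exp(-(v - m) / eps) for w, v in zip(weights, values))
    return m - eps * math.log(s)


def ot_eps(xs, a, ys, b, eps=0.5, iters=200):
    """Entropic OT cost between sum_i a_i * delta(xs_i) and
    sum_j b_j * delta(ys_j), computed from dual potentials (f, g)."""
    cost = [[(x - y) ** 2 for y in ys] for x in xs]
    f = [0.0] * len(xs)
    g = [0.0] * len(ys)
    for _ in range(iters):
        # Alternate the two Sinkhorn updates in the log domain.
        f = [_softmin(eps, b, [cost[i][j] - g[j] for j in range(len(ys))])
             for i in range(len(xs))]
        g = [_softmin(eps, a, [cost[i][j] - f[i] for i in range(len(xs))])
             for j in range(len(ys))]
    return (sum(ai * fi for ai, fi in zip(a, f))
            + sum(bj * gj for bj, gj in zip(b, g)))


def sinkhorn_divergence(xs, a, ys, b, eps=0.5, iters=200):
    """Debiased divergence: OT(a, b) - OT(a, a) / 2 - OT(b, b) / 2."""
    return (ot_eps(xs, a, ys, b, eps, iters)
            - 0.5 * ot_eps(xs, a, xs, a, eps, iters)
            - 0.5 * ot_eps(ys, b, ys, b, eps, iters))


# Usage: two unit point masses at 0 and 1, and a self-comparison.
d_points = sinkhorn_divergence([0.0], [1.0], [1.0], [1.0])
d_self = sinkhorn_divergence([0.0, 1.0], [0.5, 0.5], [0.0, 1.0], [0.5, 0.5])
```

For two unit point masses the only coupling is the trivial one, so the divergence reduces to the squared distance between the points; comparing a measure with itself gives zero by construction.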

arXiv:1808.07172 [pdf, ps, other]

Fisher Information and Natural Gradient Learning of Random Deep Networks

Authors: Shun-ichi Amari, Ryo Karakida, Masafumi Oizumi

Abstract: A deep neural network is a hierarchical nonlinear model transforming input signals to output signals. Its input-output relation is considered to be stochastic, being described for a given input by a parameterized conditional probability distribution of outputs. The space of parameters consisting of weights and biases is a Riemannian manifold, where the metric is defined by the Fisher information matrix. The natural gradient method uses the steepest descent direction in a Riemannian manifold, so it is effective in learning, avoiding plateaus. It requires inversion of the Fisher information matrix, however, which is practically impossible when the matrix has a huge number of dimensions. Many methods for approximating the natural gradient have therefore been introduced. The present paper uses a statistical neurodynamical method to reveal the properties of the Fisher information matrix in a net of random connections under the mean field approximation. We prove that the Fisher information matrix is unit-wise block diagonal supplemented by small order terms of off-block-diagonal elements, which provides a justification for the quasi-diagonal natural gradient method by Y. Ollivier. A unit-wise block-diagonal Fisher matrix reduces to the tensor product of the Fisher information matrices of single units. We further prove that the Fisher information matrix of a single unit has a simple reduced form, a sum of a diagonal matrix and a rank 2 matrix of weight-bias correlations. We obtain the inverse of Fisher information explicitly. We then have an explicit form of the natural gradient, without relying on the numerical matrix inversion, which drastically speeds up stochastic gradient learning.

Submitted 21 August, 2018; originally announced August 2018.

Comments: 22 pages, 2 figures

arXiv:1808.07169 [pdf, ps, other]

Statistical Neurodynamics of Deep Networks: Geometry of Signal Spaces

Authors: Shun-ichi Amari, Ryo Karakida, Masafumi Oizumi

Abstract: Statistical neurodynamics studies macroscopic behaviors of randomly connected neural networks. We consider a deep layered feedforward network where input signals are processed layer by layer. The manifold of input signals is embedded in a higher dimensional manifold of the next layer as a curved submanifold, provided the number of neurons is larger than that of inputs. We show geometrical features of the embedded manifold, proving that the manifold enlarges or shrinks locally isotropically so that it is always embedded conformally. We study the curvature of the embedded manifold. The scalar curvature converges to a constant or diverges to infinity slowly. The distance between two signals also changes, converging eventually to a stable fixed value, provided both the number of neurons in a layer and the number of layers tend to infinity. This causes a problem, since when we consider a curve in the input space, it is mapped as a continuous curve of fractal nature, but our theory contradictorily suggests that the curve eventually converges to a discrete set of equally spaced points. In reality, the numbers of neurons and layers are finite and thus, it is expected that the finite size effect causes the discrepancies between our theory and reality. We need to further study the discrepancies to understand their implications on information processing.

Submitted 21 August, 2018; originally announced August 2018.

Comments: 23 pages, 8 figures

arXiv:1806.01316 [pdf, other]

Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach

Authors: Ryo Karakida, Shotaro Akaho, Shun-ichi Amari

Abstract: The Fisher information matrix (FIM) is a fundamental quantity to represent the characteristics of a stochastic model, including deep neural networks (DNNs). The present study reveals novel statistics of FIM that are universal among a wide class of DNNs. To this end, we use random weights and large width limits, which enables us to utilize mean field theories. We investigate the asymptotic statistics of the FIM's eigenvalues and reveal that most of them are close to zero while the maximum eigenvalue takes a huge value. Because the landscape of the parameter space is defined by the FIM, it is locally flat in most dimensions, but strongly distorted in others. Moreover, we demonstrate the potential usage of the derived statistics in learning strategies. First, small eigenvalues that induce flatness can be connected to a norm-based capacity measure of generalization ability. Second, the maximum eigenvalue that induces the distortion enables us to quantitatively estimate an appropriately sized learning rate for gradient methods to converge.

Submitted 8 October, 2019; v1 submitted 4 June, 2018; originally announced June 2018.

Comments: Accepted at AISTATS2019. Main text: 10 pages, 2 figures. Supplementary material: 9 pages, 2 figures, typos corrected

arXiv:1709.10219 [pdf, other]

Information Geometry Connecting Wasserstein Distance and Kullback-Leibler Divergence via the Entropy-Relaxed Transportation Problem

Authors: Shun-ichi Amari, Ryo Karakida, Masafumi Oizumi

Abstract: Two geometrical structures have been extensively studied for a manifold of probability distributions. One is based on the Fisher information metric, which is invariant under reversible transformations of random variables, while the other is based on the Wasserstein distance of optimal transportation, which reflects the structure of the distance between random variables. Here, we propose a new information-geometrical theory that is a unified framework connecting the Wasserstein distance and Kullback-Leibler (KL) divergence. We primarily considered a discrete case consisting of $n$ elements and studied the geometry of the probability simplex $S_{n-1}$, which is the set of all probability distributions over $n$ elements. The Wasserstein distance was introduced in $S_{n-1}$ by the optimal transportation of commodities from distribution ${\mathbf{p}}$ to distribution ${\mathbf{q}}$, where ${\mathbf{p}}$, ${\mathbf{q}} \in S_{n-1}$. We relaxed the optimal transportation by using entropy, which was introduced by Cuturi. The optimal solution was called the entropy-relaxed stochastic transportation plan. The entropy-relaxed optimal cost $C({\mathbf{p}}, {\mathbf{q}})$ was computationally much less demanding than the original Wasserstein distance but does not define a distance because it is not minimized at ${\mathbf{p}}={\mathbf{q}}$. To define a proper divergence while retaining the computational advantage, we first introduced a divergence function in the manifold $S_{n-1} \times S_{n-1}$ of optimal transportation plans. We fully explored the information geometry of the manifold of the optimal transportation plans and subsequently constructed a new one-parameter family of divergences in $S_{n-1}$ that are related to both the Wasserstein distance and the KL-divergence.

Submitted 28 September, 2017; originally announced September 2017.
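
Illustrative note: the entropy-relaxed transportation plan mentioned in the abstract above can be sketched with the standard Sinkhorn scaling iteration associated with Cuturi's relaxation. This is a minimal sketch, not the authors' code. The function names, the cost matrix $|i-j|$, the value of `lam`, and the iteration count are all assumptions chosen for illustration, and `transport_cost` evaluates only the transport term of the cost, without any entropy term.

```python
import math

def sinkhorn_plan(p, q, cost, lam=0.5, iters=500):
    """Entropy-relaxed transport plan via Sinkhorn scaling (illustrative sketch)."""
    n, m = len(p), len(q)
    # Gibbs kernel K_ij = exp(-cost_ij / lam); lam is the relaxation strength (assumed value).
    K = [[math.exp(-cost[i][j] / lam) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns so the marginals approach p and q.
        u = [p[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [q[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def transport_cost(plan, cost):
    """Transport part of the cost: sum_ij plan_ij * cost_ij."""
    return sum(plan[i][j] * cost[i][j]
               for i in range(len(plan)) for j in range(len(plan[0])))

# Hypothetical example on the simplex S_2 (three elements).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
cost = [[abs(i - j) for j in range(3)] for i in range(3)]
plan = sinkhorn_plan(p, q, cost)
self_cost = transport_cost(sinkhorn_plan(p, p, cost), cost)
```

In this sketch `self_cost` is strictly positive, because the relaxed plan from ${\mathbf{p}}$ to itself still spreads some mass off the diagonal. This is a separate observation from the abstract's statement that the relaxed cost is not minimized at ${\mathbf{p}}={\mathbf{q}}$, but it shows how the relaxed cost differs from a true distance.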

arXiv:1709.02050 [pdf, ps, other]

Geometry of Information Integration

Authors: Shun-ichi Amari, Naotsugu Tsuchiya, Masafumi Oizumi

Abstract: Information geometry is used to quantify the amount of information integration within multiple terminals of a causal dynamical system. Integrated information quantifies how much information is lost when a system is split into parts and information transmission between the parts is removed. Multiple measures have been proposed as a measure of integrated information. Here, we analyze four of the previously proposed measures and elucidate their relations from a viewpoint of information geometry. Two of them use dually flat manifolds and the other two use curved manifolds to define a split model. We show that there are hierarchical structures among the measures. We provide explicit expressions of these measures.

Submitted 6 September, 2017; originally announced September 2017.

arXiv:1606.08310 [pdf]

doi 10.3847/0004-637X/825/2/88

Coordinated Analysis of Two Graphite Grains from the CO3.0 LAP 031117 Meteorite: First Identification of a CO Nova Graphite and a Presolar Iron Sulfide Subgrain

Authors: Pierre Haenecour, Christine Floss, Jordi Jose, Sachiko Amari, Katharina Lodders, Manavi Jadhav, Alian Wang, Frank Gyngard

Abstract: Presolar grains constitute remnants of stars that existed before the formation of the solar system. In addition to providing direct information on the materials from which the solar system formed, these grains provide ground-truth information for models of stellar evolution and nucleosynthesis. Here we report the in-situ identification of two unique presolar graphite grains from the primitive meteorite LaPaz Icefield 031117. Based on these two graphite grains, we estimate a bulk presolar graphite abundance of $5^{+7}_{-3}$ ppm in this meteorite. One of the grains (LAP-141) is characterized by an enrichment in 12C and depletions in 33,34S, and contains a small iron sulfide subgrain, representing the first unambiguous identification of presolar iron sulfide. The other grain (LAP-149) is extremely 13C-rich and 15N-poor, with one of the lowest 12C/13C ratios observed among presolar grains. Comparison of its isotopic compositions with new stellar nucleosynthesis and dust condensation models indicates an origin in the ejecta of a low-mass CO nova. Grain LAP-149 is the first putative nova grain that quantitatively best matches nova model predictions, providing the first strong evidence for graphite condensation in nova ejecta. Our discovery confirms that CO nova graphite and presolar iron sulfide contributed to the original building blocks of the solar system.

Submitted 27 June, 2016; originally announced June 2016.

Comments: Accepted for publication in The Astrophysical Journal

arXiv:1510.04455 [pdf, ps, other]

doi 10.1073/pnas.1603583113

A unified framework for information integration based on information geometry

Authors: Masafumi Oizumi, Naotsugu Tsuchiya, Shun-ichi Amari

Abstract: We propose a unified theoretical framework for quantifying spatio-temporal interactions in a stochastic dynamical system based on information geometry. In the proposed framework, the degree of interactions is quantified by the divergence between the actual probability distribution of the system and a constrained probability distribution where the interactions of interest are disconnected. This framework provides novel geometric interpretations of various information theoretic measures of interactions, such as mutual information, transfer entropy, and stochastic interaction in terms of how interactions are disconnected. The framework therefore provides an intuitive understanding of the relationships between the various quantities. By extending the concept of transfer entropy, we propose a novel measure of integrated information which measures causal interactions between parts of a system. Integrated information quantifies the extent to which the whole is more than the sum of the parts and can be potentially used as a biological measure of the levels of consciousness.

Submitted 15 October, 2015; originally announced October 2015.

arXiv:1505.04368 [pdf, ps, other]

doi 10.1371/journal.pcbi.1004654

Measuring integrated information from the decoding perspective

Authors: Masafumi Oizumi, Shun-ichi Amari, Toru Yanagawa, Naotaka Fujii, Naotsugu Tsuchiya

Abstract: Accumulating evidence indicates that the capacity to integrate information in the brain is a prerequisite for consciousness. Integrated Information Theory (IIT) of consciousness provides a mathematical approach to quantifying the information integrated in a system, called integrated information, $Φ$. Integrated information is defined theoretically as the amount of information a system generates as a whole, above and beyond the sum of the amount of information its parts independently generate. IIT predicts that the amount of integrated information in the brain should reflect levels of consciousness. Empirical evaluation of this theory requires computing integrated information from neural data acquired from experiments, although difficulties with using the original measure $Φ$ preclude such computations. Although some practical measures have been previously proposed, we found that these measures fail to satisfy the theoretical requirements as a measure of integrated information. Measures of integrated information should satisfy the lower and upper bounds as follows: The lower bound of integrated information should be 0 when the system does not generate information (no information) or when the system comprises independent parts (no integration). The upper bound of integrated information is the amount of information generated by the whole system and is realized when the amount of information generated independently by its parts equals 0. Here we derive the novel practical measure $Φ^*$ by introducing a concept of mismatched decoding developed from information theory. We show that $Φ^*$ is properly bounded from below and above, as required, as a measure of integrated information. We derive the analytical expression $Φ^*$ under the Gaussian assumption, which makes it readily applicable to experimental data.

Submitted 17 May, 2015; originally announced May 2015.

Journal ref: PLoS Comput Biol 12(1), e1004654, 2016

arXiv:1502.01513 [pdf, ps, other]

doi 10.1103/PhysRevE.91.032921

Microscopic instability in recurrent neural networks

Authors: Yuzuru Yamanaka, Shun-ichi Amari, Shigeru Shinomoto

Abstract: In a manner similar to the molecular chaos that underlies the stable thermodynamics of gases, neuronal systems may exhibit microscopic instability in individual neuronal dynamics while a macroscopic order of the entire population possibly remains stable. In this study, we analyze the microscopic stability of a network of neurons whose macroscopic activity obeys stable dynamics, expressing either monostable, bistable, or periodic state. We reveal that the network exhibits a variety of dynamical states for microscopic instability residing in given stable macroscopic dynamics. The presence of a variety of dynamical states in such a simple random network implies more abundant microscopic fluctuations in real neural networks, which consist of more complex and hierarchically structured interactions.

Submitted 5 February, 2015; originally announced February 2015.

Comments: 9 pages, 12 figures

arXiv:1502.00127 [pdf, ps, other]

doi 10.1162/NECO_a_00711

Spontaneous Motion on Two-dimensional Continuous Attractors

Authors: C. C. Alan Fung, S. -I. Amari

Abstract: Attractor models are simplified models used to describe the dynamics of firing rate profiles of a pool of neurons. The firing rate profile, or the neuronal activity, is thought to carry information. Continuous attractor neural networks (CANNs) describe the neural processing of continuous information such as object position, object orientation and direction of object motion. Recently, it was found that, in one-dimensional CANNs, short-term synaptic depression can destabilize bump-shaped neuronal attractor activity profiles. In this paper, we study two-dimensional CANNs with short-term synaptic depression and with spike frequency adaptation. We found that the dynamics of CANNs with short-term synaptic depression and CANNs with spike frequency adaptation are qualitatively similar. We also found that in both kinds of CANNs the perturbative approach can be used to predict phase diagrams, dynamical variables and speed of spontaneous motion.

Submitted 31 January, 2015; originally announced February 2015.

Comments: 58 pages, 11 figures

Journal ref: Neural Computation 27 (3) 2015

arXiv:1412.7146 [pdf, other]

doi 10.3390/e17052988

Log-Determinant Divergences Revisited: Alpha--Beta and Gamma Log-Det Divergences

Authors: Andrzej Cichocki, Sergio Cruces, Shun-Ichi Amari

Abstract: In this paper, we review and extend a family of log-det divergences for symmetric positive definite (SPD) matrices and discuss their fundamental properties. We show how to generate from parameterized Alpha-Beta (AB) and Gamma Log-det divergences many well-known divergences, for example, Stein's loss, the S-divergence, also called the Jensen-Bregman LogDet (JBLD) divergence, the Logdet Zero (Bhattacharyya) divergence, and the Affine Invariant Riemannian Metric (AIRM), as well as some new divergences. Moreover, we establish links and correspondences among many log-det divergences and display them on the alpha-beta plane for various sets of parameters. Furthermore, this paper bridges these divergences and also shows their links to divergences of multivariate and multiway Gaussian distributions. Closed form formulas are derived for gamma divergences of two multivariate Gaussian densities including as special cases the Kullback-Leibler, Bhattacharyya, Rényi and Cauchy-Schwarz divergences. Symmetrized versions of the log-det divergences are also discussed and reviewed. A class of divergences is extended to multiway divergences for separable covariance (precision) matrices.

Submitted 23 December, 2014; v1 submitted 18 December, 2014; originally announced December 2014.

Comments: 35 pages, 4 figures
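
Illustrative note: two of the special cases named in the abstract above can be evaluated directly for small matrices. The sketch below uses the commonly cited definitions, Stein's loss $\mathrm{tr}(PQ^{-1}) - \log\det(PQ^{-1}) - n$ and the S-divergence (JBLD) $\log\det\frac{P+Q}{2} - \frac{1}{2}\log\det(PQ)$, restricted to $2 \times 2$ SPD matrices. These formulas are not quoted from the listing, and the helper names and example matrices are hypothetical.

```python
import math

def det2(a):
    """Determinant of a 2x2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inv2(a):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def stein_loss(p, q):
    """Stein's loss tr(P Q^-1) - log det(P Q^-1) - n, with n = 2 here."""
    m = matmul2(p, inv2(q))
    return (m[0][0] + m[1][1]) - math.log(det2(m)) - 2

def s_divergence(p, q):
    """S-divergence (JBLD): log det((P+Q)/2) - 0.5 * log det(P Q)."""
    mid = [[(p[i][j] + q[i][j]) / 2 for j in range(2)] for i in range(2)]
    return math.log(det2(mid)) - 0.5 * math.log(det2(p) * det2(q))

# Hypothetical SPD matrices chosen only for illustration.
P = [[2.0, 0.5], [0.5, 1.0]]
Q = [[1.0, 0.2], [0.2, 3.0]]
```

Both quantities vanish when the two arguments coincide and are positive otherwise. The S-divergence is symmetric in its arguments, whereas Stein's loss in general is not.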

arXiv:1410.2386 [pdf, other]

doi 10.1109/TNNLS.2015.2423694

Bayesian Robust Tensor Factorization for Incomplete Multiway Data

Authors: Qibin Zhao, Guoxu Zhou, Liqing Zhang, Andrzej Cichocki, Shun-ichi Amari

Abstract: We propose a generative model for robust tensor factorization in the presence of both missing data and outliers. The objective is to explicitly infer the underlying low-CP-rank tensor capturing the global information and a sparse tensor capturing the local information (also considered as outliers), thus providing the robust predictive distribution over missing entries. The low-CP-rank tensor is modeled by multilinear interactions between multiple latent factors on which the column sparsity is enforced by a hierarchical prior, while the sparse tensor is modeled by a hierarchical view of Student-$t$ distribution that associates an individual hyperparameter with each element independently. For model learning, we develop an efficient closed-form variational inference under a fully Bayesian treatment, which can effectively prevent the overfitting problem and scales linearly with data size. In contrast to existing related works, our method can perform model selection automatically and implicitly without the need to tune parameters. More specifically, it can discover the ground-truth CP rank and automatically adapt the sparsity-inducing priors to various types of outliers. In addition, the tradeoff between the low-rank approximation and the sparse representation can be optimized in the sense of maximum model evidence. Extensive experiments and comparisons with many state-of-the-art algorithms on both synthetic and real-world datasets demonstrate the superiority of our method from several perspectives.

Submitted 16 April, 2015; v1 submitted 9 October, 2014; originally announced October 2014.

Comments: in IEEE Transactions on Neural Networks and Learning Systems, 2015

arXiv:1312.1103 [pdf, ps, other]

Curvature of Hessian Manifolds

Authors: Shun-ichi Amari, John Armstrong

Abstract: We prove that, in dimensions greater than 2, the generic metric is not a Hessian metric and find a curvature condition on Hessian metrics in dimensions greater than 3. In particular we prove that the forms used to define the Pontryagin classes in terms of the curvature vanish on a Hessian manifold. By contrast all analytic Riemannian 2-metrics are Hessian metrics.

Submitted 4 December, 2013; originally announced December 2013.

Comments: 18 pages

MSC Class: 53B05; 53B20

arXiv:1311.5125 [pdf, ps, other]

On conformal divergences and their population minimizers

Authors: Richard Nock, Frank Nielsen, Shun-ichi Amari

Abstract: Total Bregman divergences are a recent tweak of ordinary Bregman divergences originally motivated by applications that required invariance by rotations. They have displayed superior results compared to ordinary Bregman divergences on several clustering, computer vision, medical imaging and machine learning tasks. These preliminary results raise two important problems: first, report a complete characterization of the left and right population minimizers for this class of total Bregman divergences; second, characterize a principled superset of total and ordinary Bregman divergences with good clustering properties, from which one could tailor the choice of a divergence to a particular application. In this paper, we provide and study one such superset with interesting geometric features, which we call conformal divergences, and focus on their left and right population minimizers. Our results are obtained in a recently coined $(u, v)$-geometric structure that is a generalization of the dually flat affine connections in information geometry. We characterize both analytically and geometrically the population minimizers. We prove that conformal divergences (resp. total Bregman divergences) are essentially exhaustive for their left (resp. right) population minimizers. We further report new results and extend previous results on the robustness to outliers of the left and right population minimizers, and discuss the role of the $(u, v)$-geometric structure in clustering. Additional results are also given.

Submitted 8 June, 2015; v1 submitted 20 November, 2013; originally announced November 2013.

arXiv:1304.7955 [pdf, ps, other]

doi 10.1088/1367-2630/15/6/063012

Achieving Precise Mechanical Control in Intrinsically Noisy Systems

Authors: Wenlian Lu, Jianfeng Feng, Shun-ichi Amari, David Waxman

Abstract: How can precise control be realised in intrinsically noisy systems? Here, we develop a general theoretical framework that provides a way to achieve precise control in signal-dependent noisy environments. When the control signal has Poisson or supra-Poisson noise, precise control is not possible. If, however, the control signal has sub-Poisson noise, then precise control is possible. For this case, the precise control solution is not a function, but a rapidly varying random process that must be averaged with respect to a governing probability density functional. Our theoretical approach is applied to the control of straight-trajectory arm movement. Sub-Poisson noise in the control signal is shown to be capable of leading to precise control. Intriguingly, the control signal for this system has a natural counterpart, namely the bursting pulses of neurons --trains of Dirac-delta functions-- in biological systems to achieve precise control performance.

Submitted 30 April, 2013; originally announced April 2013.

Comments: 26 pages, 10 figures

MSC Class: 93E20; 93E24

arXiv:1304.6591 [pdf, ps, other]

Lp-Regularized Least Squares (0<p<1) and Critical Path

Authors: Masahiro Yukawa, Shun-ichi Amari

Abstract: The least squares problem is formulated in terms of Lp quasi-norm regularization (0<p<1). Two formulations are considered: (i) an Lp-constrained optimization and (ii) an Lp-penalized (unconstrained) optimization. Due to the nonconvexity of the Lp quasi-norm, the solution paths of the regularized least squares problem are not ensured to be continuous. A critical path, which is a maximal continuous curve consisting of critical points, is therefore considered separately. The critical paths are piecewise smooth, as can be seen from the viewpoint of the variational method, and generally contain non-optimal points such as saddle points and local maxima as well as global/local minima. Along each critical path, the correspondence between the regularization parameters (which govern the 'strength' of regularization in the two formulations) is non-monotonic and, more specifically, it has multiplicity. Two paths of critical points connecting the origin and an ordinary least squares (OLS) solution are highlighted. One is a main path starting at an OLS solution, and the other is a greedy path starting at the origin. Part of the greedy path can be constructed with a generalized Minkowskian gradient. The breakpoints of the greedy path coincide with the step-by-step solutions generated by using orthogonal matching pursuit (OMP), thereby establishing a direct link between OMP and Lp-regularized least squares.

Submitted 24 April, 2013; originally announced April 2013.

arXiv:1202.6526 [pdf, ps, other]

doi 10.1103/PhysRevE.87.022814

State Concentration Exponent as a Measure of Quickness in Kauffman-type Networks

Authors: Shun-ichi Amari, Hiroyasu Ando, Taro Toyoizumi, Naoki Masuda

Abstract: We study the dynamics of randomly connected networks composed of binary Boolean elements and those composed of binary majority vote elements. We elucidate their differences in both sparsely and densely connected cases. The quickness of large network dynamics is usually quantified by the length of transient paths, an analytically intractable measure. For discrete-time dynamics of networks of binary elements, we address this dilemma with an alternative unified framework by using a concept termed state concentration, defined as the exponent of the average number of t-step ancestors in state transition graphs. The state transition graph is defined by nodes corresponding to network states and directed links corresponding to transitions. Using this exponent, we interrogate the dynamics of random Boolean and majority vote networks. We find that extremely sparse Boolean networks and majority vote networks with arbitrary density achieve quickness, owing in part to long-tailed in-degree distributions. As a corollary, only relatively dense majority vote networks can achieve both quickness and robustness.

Submitted 4 March, 2013; v1 submitted 29 February, 2012; originally announced February 2012.

Comments: 6 figures

Journal ref: Physical Review E, 87, 022814 (2013)

arXiv:1110.4763 [pdf]

doi 10.1088/0004-637X/744/1/49

Tungsten isotopic compositions in stardust SiC grains from the Murchison meteorite: Constraints on the s-process in the Hf-Ta-W-Re-Os region

Authors: J. N. Ávila, M. Lugaro, T. R. Ireland, F. Gyngard, E. Zinner, S. Cristallo, P. Holden, J. Buntain, S. Amari, A. Karakas

Abstract: We report the first tungsten isotopic measurements in stardust silicon carbide (SiC) grains recovered from the Murchison carbonaceous chondrite. The isotopes 182W, 183W, 184W, 186W and 179Hf, 180Hf were measured on both an aggregate (KJB fraction) and single stardust SiC grains (LS+LU fraction) believed to have condensed in the outflows of low-mass carbon-rich asymptotic giant branch (AGB) stars with close-to-solar metallicity. The SiC aggregate shows small deviations from terrestrial (=solar) composition in the 182W/184W and 183W/184W ratios, with deficits in 182W and 183W with respect to 184W. The 186W/184W ratio, however, shows no apparent deviation from the solar value. Tungsten isotopic measurements in single mainstream stardust SiC grains revealed lower than solar 182W/184W, 183W/184W, and 186W/184W ratios. We have compared the SiC data with theoretical predictions of the evolution of W isotopic ratios in the envelopes of AGB stars. These ratios are affected by the slow neutron-capture process and match the SiC data regarding their 182W/184W, 183W/184W, and 179Hf/180Hf isotopic compositions, although a small adjustment in the s-process production of 183W is needed in order to have a better agreement between the SiC data and model predictions. The models cannot explain the 186W/184W ratios observed in the SiC grains, even when the current 185W neutron-capture cross section is increased by a factor of two. Further study is required to better assess how model uncertainties (e.g., the formation of the 13C neutron source, the mass-loss law, the modelling of the third dredge-up, and the efficiency of the 22Ne neutron source) may affect current s-process predictions.

Submitted 21 October, 2011; originally announced October 2011.

Comments: Accepted for publication in The Astrophysical Journal; 43 pages, 2 tables, 7 figures

arXiv:1010.4965 [pdf, ps, other]

Dually flat structure with escort probability and its application to alpha-Voronoi diagrams

Authors: Atsumi Ohara, Hiroshi Matsuzoe, Shun-ichi Amari

Abstract: This paper studies the geometrical structure of the manifold of escort probability distributions and shows its new applicability to information science. In order to realize escort probabilities we use a conformal transformation that flattens the so-called alpha-geometry of the space of discrete probability distributions, which well characterizes nonadditive statistics on the space. As a result, escort probabilities are proved to be flat coordinates of the usual probabilities for the derived dually flat structure. Finally, we demonstrate that escort probabilities with the new structure admit a simple algorithm to compute Voronoi diagrams and centroids with respect to alpha-divergences.

Submitted 24 October, 2010; originally announced October 2010.

Comments: Several results in this paper can be found in the conference paper [36] without complete proofs

arXiv:1009.4516 [pdf]

doi 10.1162/NECO_a_00073

Modeling Basal Ganglia for understanding Parkinsonian Reaching Movements

Authors: K. N. Magdoom, D. Subramanian, V. S. Chakravarthy, B. Ravindran, Shun-ichi Amari, N. Meenakshisundaram

Abstract: We present a computational model that highlights the role of basal ganglia (BG) in generating simple reaching movements. The model is cast within the reinforcement learning (RL) framework with the correspondence between RL components and neuroanatomy as follows: dopamine signal of substantia nigra pars compacta as the Temporal Difference error, striatum as the substrate for the Critic, and the motor cortex as the Actor. A key feature of this neurobiological interpretation is our hypothesis that the indirect pathway is the Explorer. Chaotic activity, originating from the indirect pathway part of the model, drives the wandering, exploratory movements of the arm. Thus the direct pathway subserves exploitation while the indirect pathway subserves exploration. The motor cortex becomes more and more independent of the corrective influence of BG, as training progresses. Reaching trajectories show diminishing variability with training. Reaching movements associated with Parkinson's disease (PD) are simulated by (a) reducing dopamine and (b) degrading the complexity of indirect pathway dynamics by switching it from chaotic to periodic behavior. Under the simulated PD conditions, the arm exhibits PD motor symptoms like tremor, bradykinesia and undershoot. The model echoes the notion that PD is a dynamical disease.

Submitted 23 September, 2010; originally announced September 2010.

Comments: Neural Computation, In Press

Journal ref: Neural Computation (2011), 23(2), 477-516

arXiv:0910.0864 [pdf]

doi 10.1071/AS08046

Presolar Diamond in Meteorites

Authors: Sachiko Amari

Abstract: Presolar diamond, the carrier of the isotopically anomalous Xe component Xe-HL, was the first mineral type of presolar dust that was isolated from meteorites. The excesses in the light, p-process only isotopes 124Xe and 126Xe, and in the heavy, r-process only isotopes 134Xe and 136Xe relative to the solar ratios indicate that Xe-HL was produced in supernovae: they are the only stellar source where these two processes are believed to take place. Although these processes occur in supernovae, their physical conditions and timeframes are completely different. Yet the excesses are always correlated in diamond separates from meteorites. Furthermore, the p-process 124Xe/126Xe inferred from Xe-L and the r-process 134Xe/136Xe from Xe-H do not agree with the p-process and r-process ratios derived from the solar system abundance, and the inferred p-process ratio does not agree with those predicted from stellar models. The 'rapid separation scenario', where the separation of Xe and its radiogenic precursors Te and I takes place at the very early stage (7900 sec after the end of the r-process), has been proposed to explain Xe-H. Alternatively, mixing of 20% of material that experienced neutron burst and 80% of solar material can reproduce the pattern of Xe-H, although Xe-L is not accounted for with this scenario.

Submitted 5 October, 2009; originally announced October 2009.

Journal ref: Publ.Astron.Soc.Austral.26:266-270,2009

arXiv:0909.5532 [pdf]

doi 10.1071/AS08039

He and Ne ages of large presolar silicon carbide grains: Solving the recoil problem

Authors: U. Ott, P. R. Heck, F. Gyngard, R. Wieler, F. Wrobel, S. Amari, E. Zinner

Abstract: Knowledge about the age of presolar grains provides important insights into Galactic chemical evolution and the dynamics of grain formation and destruction processes in the Galaxy. Determination from the abundance of cosmic ray interaction products is straightforward, but in the past has suffered from uncertainties in correcting for recoil losses of spallation products. The problem is less serious in a class of large (tens of micrometer) grains. We describe the correction procedure and summarise results for He and Ne ages of presolar SiC "Jumbo" grains that range from close to zero to ~850 Myr, with the majority being less than 200 Myr. We also discuss the possibility of extending our approach to the majority of smaller SiC grains and explore possible contributions from trapping of cosmic rays.

Submitted 30 September, 2009; originally announced September 2009.

Comments: Publications of the Astronomical Society of Australia, Contribution to PASA special volume "The Origin of Elements Heavier than Iron in honor of the 70th birthday of Roberto Gallino"

Journal ref: Publ.Astron.Soc.Austral.26:297-302,2009

arXiv:0909.3377 [pdf]

doi 10.1071/AS08033

Heavy Element Abundances in Presolar Silicon Carbide Grains from Low-Metallicity AGB Stars

Authors: P. Hoppe, J. Leitner, C. Vollmer, E. Groener, P. R. Heck, R. Gallino, S. Amari

Abstract: Primitive meteorites contain small amounts of presolar minerals that formed in the winds of evolved stars or in the ejecta of stellar explosions. Silicon carbide is the best studied presolar mineral. Based on its isotopic compositions it was divided into distinct populations that have different origins: Most abundant are the mainstream grains which are believed to come from 1.5-3 Msun AGB stars of roughly solar metallicity. The rare Y and Z grains are likely to come from 1.5-3 Msun AGB stars as well, but with subsolar metallicities (0.3-0.5x solar). Here we report on C and Si isotope and trace element (Zr, Ba) studies of individual, submicrometer-sized SiC grains. The most striking results are: (1) Zr and Ba concentrations are higher in Y and Z grains than in mainstream grains, with enrichments relative to Si and solar of up to 70x (Zr) and 170x (Ba), respectively. (2) For the Y and Z grains there is a positive correlation between Ba concentrations and amount of s-process Si. This correlation is well explained by predictions for 2-3 Msun AGB stars with metallicities of 0.3-0.5x solar. This confirms low-metallicity stars as most likely stellar sources for the Y and Z grains.

Submitted 18 September, 2009; originally announced September 2009.

Journal ref: Publ.Astron.Soc.Austral.26:284-288,2009

arXiv:cond-mat/0607506 [pdf, ps, other]

doi 10.1143/JPSJ.76.023003

Efficiency of Energy Transduction in a Molecular Chemical Engine

Authors: Kazuo Sasaki, Ryo Kanada, Satoshi Amari

Abstract: A simple model of the two-state ratchet type is proposed for molecular chemical engines that convert chemical free energy into mechanical work and vice versa. The engine works by catalyzing a chemical reaction and turning a rotor. Analytical expressions are obtained for the dependences of rotation and reaction rates on the concentrations of reactant and product molecules, from which the performance of the engine is analyzed. In particular, the efficiency of energy transduction is discussed in some detail.

Submitted 28 December, 2006; v1 submitted 19 July, 2006; originally announced July 2006.

Comments: 4 pages, 4 figures; title modified, figures 2 and 3 modified, content changed (pages 1 and 4, mainly), references added

Journal ref: Journal of the Physical Society of Japan, 76, 023003 (2007)

arXiv:cond-mat/0502017 [pdf, ps, other]

doi 10.1143/JPSJ.74.2226

Diffusion Coefficient and Mobility of a Brownian Particle in a Tilted Periodic Potential

Authors: Kazuo Sasaki, Satoshi Amari

Abstract: The Brownian motion of a particle in a one-dimensional periodic potential subjected to a uniform external force F is studied. Using the formula for the diffusion coefficient D obtained by other authors and an alternative one derived from the Fokker-Planck equation in the present work, D is compared with the differential mobility μ = dv/dF, where v is the average velocity of the particle. Analytical and numerical calculations indicate that the inequality D ≥ μ k_B T, with k_B the Boltzmann constant and T the temperature, holds if the periodic potential is symmetric, while it is violated for asymmetric potentials when F is small but nonzero.

Submitted 1 February, 2005; originally announced February 2005.

Comments: 7 pages, 4 figures, submitted to J. Phys. Soc. Jpn

arXiv:astro-ph/0501430 [pdf]

doi 10.1016/j.chemer.2005.01.001

Presolar grains from meteorites: Remnants from the early times of the solar system

Authors: Katharina Lodders, Sachiko Amari

Abstract: This review provides an introduction to presolar grains - preserved stardust from the interstellar molecular cloud from which our solar system formed - found in primitive meteorites. We describe the search for the presolar components, the currently known presolar mineral populations, and the chemical and isotopic characteristics of the grains and dust-forming stars to identify the grains' most probable stellar sources. Keywords: presolar grains, interstellar dust, asymptotic giant branch (AGB) stars, novae, supernovae, nucleosynthesis, isotopic ratios, meteorites

Submitted 31 March, 2005; v1 submitted 20 January, 2005; originally announced January 2005.

Comments: 71 pages, 24 figures, 9 tables. Invited review. to appear in Chemie der Erde

arXiv:astro-ph/0405332 [pdf, ps, other]

doi 10.1086/422569

The Imprint of Nova Nucleosynthesis in Presolar Grains

Authors: Jordi Jose, Margarita Hernanz, Sachiko Amari, Katharina Lodders, Ernst Zinner

Abstract: Infrared and ultraviolet observations of nova light curves have confirmed grain formation in their expanding shells that are ejected into the interstellar medium by a thermonuclear runaway. In this paper, we present isotopic ratios of intermediate-mass elements up to silicon for the ejecta of CO and ONe novae, based on 20 hydrodynamic models of nova explosions. These theoretical estimates will help to properly identify nova grains in primitive meteorites. In addition, equilibrium condensation calculations are used to predict the types of grains that can be expected in the nova ejecta, providing some hints on the puzzling formation of C-rich dust in O>C environments. These results show that SiC grains can condense in ONe novae, in concert with an inferred (ONe) nova origin for several presolar SiC grains.

Submitted 17 May, 2004; originally announced May 2004.

Comments: 42 pages. Accepted for publication in The Astrophysical Journal

Journal ref: Astrophys.J. 612 (2004) 414-428

arXiv:astro-ph/0202167 [pdf, ps, other]

Could SiC A+B grains have originated in a post-AGB thermal pulse?

Authors: Falk Herwig, Sachiko Amari, Maria Lugaro, Ernst Zinner

Abstract: The carbon and nitrogen isotopic ratios of pre-solar SiC grains of type A+B suggest a proton-limited nucleosynthetic process as encountered, for instance, during the very late thermal pulse of post-AGB stars. We study the nuclear processes during this phase and find carbon and nitrogen isotopic ratios which can reproduce those of A+B grains. These results are still preliminary because they depend on uncertain factors such as the details of mixing during the post-AGB thermal pulse, the rates of some nuclear reactions, and the assumptions on mixing during the progenitor AGB phase.

Submitted 7 February, 2002; originally announced February 2002.

Comments: 2 pages, 2 figures, poster presented at IAU Symp 209 "Planetary Nebulae"

arXiv:astro-ph/0012465 [pdf]

doi 10.1086/320235

Presolar Grains from Novae

Authors: S. Amari, X. Gao, L. N. Nittler, E. Zinner, J. Jose, M. Hernanz, R. S. Lewis

Abstract: We report the discovery of five SiC grains and one graphite grain isolated from the Murchison carbonaceous meteorite whose major-element isotopic compositions indicate an origin in nova explosions. The grains are characterized by low 12C/13C (4-9) and 14N/15N (5-20) ratios, large excesses in 30Si (30Si/28Si ratios range to 2.1 times solar) and high 26Al/27Al ratios. These isotopic signatures are theoretically predicted for the ejecta from ONe novae and cannot be matched by any other stellar sources. Previous studies of presolar grains from primitive meteorites have shown that the vast majority formed in red giant outflows and supernova ejecta. Although a classical nova origin was suggested for a few presolar graphite grains on the basis of 22Ne enrichments, this identification is somewhat ambiguous since it is based only on one trace element. Our present study presents the first evidence for nova grains on the basis of major element isotopic compositions of single grains. We also present the results of nucleosynthetic calculations of classical nova models and compare the predicted isotopic ratios with those of the grains. The comparison points toward massive ONe novae if the ejecta are mixed with material of close-to-solar composition.

Submitted 21 December, 2000; originally announced December 2000.

Comments: 20 pages, 5 figures, 1 table. ApJ, in press

arXiv:astro-ph/9908055 [pdf, ps, other]

doi 10.1086/308078

Si Isotopic Ratios in Mainstream Presolar SiC Grains Revisited

Authors: Maria Lugaro, Ernst Zinner, Roberto Gallino, Sachiko Amari

Abstract: Although mainstream SiC grains, the major group of presolar SiC grains found in meteorites, are believed to have originated in the expanding envelope of asymptotic giant branch (AGB) stars during their late carbon-rich phases, their Si isotopic ratios show a distribution that cannot be explained by nucleosynthesis in this kind of star. Previously, this distribution has been interpreted to be the result of contributions from many AGB stars of different ages whose initial Si isotopic ratios vary due to the Galactic chemical evolution of the Si isotopes. This paper presents a new interpretation based on local heterogeneities of the Si isotopes in the interstellar medium at the time the parent stars of the mainstream grains were born. Recently, several authors have presented inhomogeneous chemical evolution models of the Galactic disk in order to account for the well known evidence that F and G dwarfs of similar age show an intrinsic scatter in their elemental abundances.

Submitted 5 August, 1999; originally announced August 1999.

Comments: Accepted for publication by ApJ. 19 pages of text + 17 figures and 4 tables

arXiv:astro-ph/9809212 [pdf, ps, other]

A New Astrophysical Interpretation of the Si and Ti Isotopic Compositions of Mainstream SiC Grains from Primitive Meteorites

Authors: Maria Lugaro, Roberto Gallino, Maurizio Busso, Sachiko Amari, Ernst Zinner

Abstract: Mainstream presolar SiC grains from primitive meteorites show a clear s-process signature in the isotopic composition of heavy trace elements. These grains most likely condensed in the winds of a variety of AGB stars. However, the non-solar and correlated Si and Ti isotopic compositions measured in these grains are inconsistent with a pure s-signature. We present a possible solution to this much-discussed problem by assuming a spread in the original composition of parent AGB stars due to small chemical inhomogeneities in the interstellar medium at the time of their birth. These inhomogeneities may naturally arise from variations in the contributions of each nuclide to the interstellar medium by the relevant stellar nucleosynthetic sites, SNII and different subtypes of SNIa.

Submitted 16 September, 1998; originally announced September 1998.

Comments: 4 pages, 2 figures, submitted for the Proceedings of the Fifth International Symposium on Nuclei in the Cosmos (Volos, Greece, 5-12 July 1999), ed. N. Prantzos (Editions Frontieres, France), 1999

arXiv:cond-mat/9806078 [pdf, ps, other]

Mutual Information of Three-State Low Activity Diluted Neural Networks with Self-Control

Authors: D. Bollé, D. R. C. Dominguez, S. Amari

Abstract: The influence of a macroscopic time-dependent threshold on the retrieval process of three-state extremely diluted neural networks is examined. If the threshold is chosen appropriately as a function of the noise and the pattern activity of the network, adapting itself in the course of the time evolution, it guarantees an autonomous functioning of the network. It is found that this self-control mechanism considerably improves the retrieval quality, especially in the limit of low activity, including the storage capacity, the basins of attraction and the information content. The mutual information is shown to be the relevant parameter to study the retrieval quality of such low activity models. Numerical results confirm these observations.

Submitted 21 August, 2000; v1 submitted 5 June, 1998; originally announced June 1998.

Comments: Change of title and small corrections (16 pages and 6 figures)

Report number: KUL-TF-98/26

Journal ref: Neural Networks 13, 455-462 (2000)
