-
Generalized compression and compressive search of large datasets
Authors:
Morgan E. Prior,
Thomas Howard III,
Emily Light,
Najib Ishaq,
Noah M. Daniels
Abstract:
The Big Data explosion has necessitated the development of search algorithms that scale sub-linearly in time and memory.
While compression algorithms and search algorithms do exist independently, few algorithms offer both, and those which do are domain-specific.
We present panCAKES, a novel approach to compressive search, i.e., a way to perform $k$-NN and $ρ$-NN search on compressed data while decompressing only a small, relevant portion of the data.
panCAKES assumes the manifold hypothesis and leverages the low-dimensional structure of the data to compress and search it efficiently.
panCAKES is generic over any distance function for which the distance between two points is proportional to the memory cost of storing an encoding of one in terms of the other.
This property holds for many widely-used distance functions, e.g. string edit distances (Levenshtein, Needleman-Wunsch, etc.) and set dissimilarity measures (Jaccard, Dice, etc.).
We benchmark panCAKES on a variety of datasets, including genomic, proteomic, and set data.
We compare compression ratios to gzip, and search performance between the compressed and uncompressed versions of the same dataset.
panCAKES achieves compression ratios close to those of gzip, while offering sub-linear time performance for $k$-NN and $ρ$-NN search.
We conclude that panCAKES is an efficient, general-purpose algorithm for exact compressive search on large datasets that obey the manifold hypothesis.
We provide an open-source implementation of panCAKES in the Rust programming language.
Submitted 18 September, 2024;
originally announced September 2024.
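The property the panCAKES abstract above relies on, a distance that is proportional to the cost of encoding one point in terms of another, can be illustrated with the two distance families it names. The following is a minimal Python sketch under stated assumptions, not the authors' Rust implementation; the function names `levenshtein` and `jaccard_distance` are hypothetical and chosen only for this illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance between a and b (two-row dynamic programme)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free match
            ))
        prev = curr
    return prev[-1]


def jaccard_distance(x: set, y: set) -> float:
    """Jaccard dissimilarity 1 - |x & y| / |x | y|; two empty sets give 0."""
    if not x and not y:
        return 0.0
    return 1.0 - len(x & y) / len(x | y)
```

The intuition, as far as the abstract states it, is that a string at edit distance d from a reference can be stored as roughly d edits against that reference, so nearby points are cheap to encode in terms of one another.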
-
Learning Atoms from Crystal Structure
Authors:
Andrij Vasylenko,
Dmytro Antypov,
Sven Schewe,
Luke M. Daniels,
John B. Claridge,
Matthew S. Dyer,
Matthew J. Rosseinsky
Abstract:
Computational modelling of materials using machine learning, ML, and historical data has become integral to materials research. The efficiency of computational modelling is strongly affected by the choice of the numerical representation for describing the composition, structure and chemical elements. Structure controls the properties, but often only the composition of a candidate material is available. Existing elemental descriptors lack direct access to structural insights such as the coordination geometry of an element. In this study, we introduce Local Environment-induced Atomic Features, LEAFs, which incorporate information about the statistically preferred local coordination geometry for atoms in crystal structure into descriptors for chemical elements, enabling the modelling of materials solely as compositions without requiring knowledge of their crystal structure. In the crystal structure, each atomic site can be described by similarity to common local structural motifs; by aggregating these features of similarity from the experimentally verified crystal structures of inorganic materials, LEAFs formulate a set of descriptors for chemical elements and compositions. The direct connection of LEAFs to the local coordination geometry enables the analysis of ML model property predictions, linking compositions to the underlying structure-property relationships. We demonstrate the versatility of LEAFs in structure-informed property predictions for compositions, mapping of chemical space in structural terms, and prioritising elemental substitutions. Based on the latter for predicting crystal structures of binary ionic compounds, LEAFs achieve the state-of-the-art accuracy of 86 per cent. These results suggest that the structurally informed description of chemical elements and compositions developed in this work can effectively guide synthetic efforts in discovering new materials.
Submitted 5 August, 2024;
originally announced August 2024.
-
Layer Ensemble Averaging for Improving Memristor-Based Artificial Neural Network Performance
Authors:
Osama Yousuf,
Brian Hoskins,
Karthick Ramu,
Mitchell Fream,
William A. Borders,
Advait Madhavan,
Matthew W. Daniels,
Andrew Dienstfrey,
Jabez J. McClelland,
Martin Lueker-Boden,
Gina C. Adam
Abstract:
Artificial neural networks have advanced due to scaling dimensions, but conventional computing faces inefficiency due to the von Neumann bottleneck. In-memory computation architectures, like memristors, offer promise but face challenges due to hardware non-idealities. This work proposes and experimentally demonstrates layer ensemble averaging, a technique to map pre-trained neural network solutions from software to defective hardware crossbars of emerging memory devices and reliably attain near-software performance on inference. The approach is investigated using a custom 20,000-device hardware prototyping platform on a continual learning problem where a network must learn new tasks without catastrophically forgetting previously learned information. Results demonstrate that by trading off the number of devices required for layer mapping, layer ensemble averaging can reliably boost defective memristive network performance up to the software baseline. For the investigated problem, the average multi-task classification accuracy improves from 61 % to 72 % (within 1 % of the software baseline) using the proposed approach.
Submitted 23 April, 2024;
originally announced April 2024.
-
Measurement-driven Langevin modeling of superparamagnetic tunnel junctions
Authors:
Liam A. Pocher,
Temitayo N. Adeyeye,
Sidra Gibeault,
Philippe Talatchian,
Ursula Ebels,
Daniel P. Lathrop,
Jabez J. McClelland,
Mark D. Stiles,
Advait Madhavan,
Matthew W. Daniels
Abstract:
Superparamagnetic tunnel junctions are important devices for a range of emerging technologies, but most existing compact models capture only their mean switching rates. Capturing qualitatively accurate analog dynamics of these devices will be important as the technology scales up. Here we present results using a one-dimensional overdamped Langevin equation that captures statistical properties of measured time traces, including voltage histograms, drift and diffusion characteristics as measured with Kramers-Moyal coefficients, and dwell-time distributions. While common macrospin models are more physically motivated magnetic models than the Langevin model, we show that, for the device measured here, they capture even fewer of the measured experimental behaviors.
Submitted 2 July, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
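A one-dimensional overdamped Langevin equation of the kind named in the abstract above can be integrated with the Euler-Maruyama scheme. The sketch below is illustrative only: the double-well potential $U(x) = x^4/4 - x^2/2$, the constant diffusion coefficient, and the names `drift` and `simulate` are assumptions made here, whereas the paper fits its drift and diffusion terms to measured time traces.

```python
import math
import random


def drift(x: float) -> float:
    """-U'(x) for the assumed double-well potential U(x) = x**4/4 - x**2/2."""
    return x - x**3


def simulate(x0: float, diffusion: float, dt: float, steps: int, seed: int = 0) -> list[float]:
    """Euler-Maruyama integration of dx = drift(x) dt + sqrt(2 * diffusion) dW."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    noise_scale = math.sqrt(2.0 * diffusion * dt)
    x = x0
    trace = [x]
    for _ in range(steps):
        x += drift(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        trace.append(x)
    return trace
```

With `diffusion=0.0` the trajectory simply relaxes into the nearest well; with a positive value it hops between the two wells at random times, which is the qualitative telegraph-like behavior such models aim to describe.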
-
Programmable electrical coupling between stochastic magnetic tunnel junctions
Authors:
Sidra Gibeault,
Temitayo N. Adeyeye,
Liam A. Pocher,
Daniel P. Lathrop,
Matthew W. Daniels,
Mark D. Stiles,
Jabez J. McClelland,
William A. Borders,
Jason T. Ryan,
Philippe Talatchian,
Ursula Ebels,
Advait Madhavan
Abstract:
Superparamagnetic tunnel junctions (SMTJs) are promising sources of randomness for compact and energy-efficient implementations of probabilistic computing techniques. Augmenting an SMTJ with electronic circuits, to convert the random telegraph fluctuations of its resistance state to stochastic digital signals, gives a basic building block known as a probabilistic bit or $p$-bit. Though scalable probabilistic computing methods connecting $p$-bits have been proposed, practical implementations are limited by either minimal tunability or energy-inefficient microprocessors-in-the-loop. In this work, we experimentally demonstrate the functionality of a scalable analog unit cell, namely a pair of $p$-bits with programmable electrical coupling. This tunable coupling is implemented with operational amplifier circuits that have a time constant of approximately 1 μs, which is faster than the mean dwell times of the SMTJs over most of the operating range. Programmability enables flexibility, allowing both positive and negative couplings, as well as coupling devices with widely varying device properties. These tunable coupling circuits can achieve the whole range of correlations from $-1$ to $1$, both for devices with similar time scales and for devices whose time scales vary by an order of magnitude. This range of correlation allows such circuits to be used for scalable implementations of simulated annealing with probabilistic computing.
Submitted 20 December, 2023;
originally announced December 2023.
-
Measurement-driven neural-network training for integrated magnetic tunnel junction arrays
Authors:
William A. Borders,
Advait Madhavan,
Matthew W. Daniels,
Vasileia Georgiou,
Martin Lueker-Boden,
Tiffany S. Santos,
Patrick M. Braganca,
Mark D. Stiles,
Jabez J. McClelland,
Brian D. Hoskins
Abstract:
The increasing scale of neural networks needed to support more complex applications has led to an increasing requirement for area- and energy-efficient hardware. One route to meeting the budget for these applications is to circumvent the von Neumann bottleneck by performing computation in or near memory. An inevitability of transferring neural networks onto hardware is that non-idealities such as device-to-device variations or poor device yield impact performance. Methods such as hardware-aware training, where substrate non-idealities are incorporated during network training, are one way to recover performance at the cost of solution generality. In this work, we demonstrate inference on hardware neural networks consisting of 20,000 magnetic tunnel junction arrays integrated on complementary metal-oxide-semiconductor chips that closely resemble market-ready spin-transfer-torque magnetoresistive random access memory technology. Using 36 dies, each containing a crossbar array with its own non-idealities, we show that even a small number of defects in physically mapped networks significantly degrades the performance of networks trained without defects and show that, at the cost of generality, hardware-aware training accounting for specific defects on each die can recover performance comparable to that of ideal networks. We then demonstrate a robust training method that extends hardware-aware training to statistics-aware training, producing network weights that perform well on most defective dies regardless of their specific defect locations. When evaluated on the 36 physical dies, statistics-aware trained solutions can achieve a mean misclassification error on the MNIST dataset that differs from the software baseline by only 2 %. This statistics-aware training method could be generalized to networks with many layers that are mapped to hardware suited for industry-ready applications.
Submitted 14 May, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
-
Quasi-free-standing AA-stacked bilayer graphene induced by calcium intercalation of the graphene-silicon carbide interface
Authors:
Antonija Grubišić-Čabo,
Jimmy C. Kotsakidis,
Yuefeng Yin,
Anton Tadich,
Matthew Haldon,
Sean Solari,
John Riley,
Eric Huwald,
Kevin M. Daniels,
Rachael L. Myers-Ward,
Mark T. Edmonds,
Nikhil Medhekar,
D. Kurt Gaskill,
Michael S. Fuhrer
Abstract:
We study quasi-freestanding bilayer graphene on silicon carbide intercalated by calcium. The intercalation, and subsequent changes to the system, were investigated by low-energy electron diffraction, angle-resolved photoemission spectroscopy (ARPES) and density-functional theory (DFT). Calcium is found to intercalate only at the graphene-SiC interface, completely displacing the hydrogen terminating SiC. As a consequence, the system becomes highly n-doped. Comparison to DFT calculations shows that the band dispersion, as determined by ARPES, deviates from the band structure expected for Bernal-stacked bilayer graphene. Instead, the electronic structure closely matches AA-stacked bilayer graphene on Ca-terminated SiC, indicating a spontaneous transition from AB- to AA-stacked bilayer graphene following calcium intercalation of the underlying graphene-SiC interface.
Submitted 4 November, 2023;
originally announced November 2023.
-
Long-term memory effects of an incremental blood pressure intervention in a mortal cohort
Authors:
Maria Josefsson,
Nina Karalija,
Michael Daniels
Abstract:
In the present study we investigate overall population effects on episodic memory of an intervention over 15 years that reduces systolic blood pressure in individuals with hypertension. A limitation with previous research on the potential risk reduction of such interventions is that they do not properly account for the reduction of mortality rates. Hence, one can only speculate whether the effect is due to changes in memory or changes in mortality. Therefore, we extend previous research by providing both an etiological and a prognostic effect estimate. To do this, we propose a Bayesian semi-parametric estimation approach for an incremental threshold intervention, using the extended G-formula. Additionally, we introduce a novel sparsity-inducing Dirichlet hyperprior for longitudinal data, that exploits the longitudinal structure of the data. We demonstrate the usefulness of our approach in simulations, and compare its performance to other Bayesian decision tree ensemble approaches. In our analysis of the data from the Betula cohort, we found no significant prognostic or etiological effects across all ages. This suggests that systolic blood pressure interventions likely do not strongly affect memory, whether at the overall population level or in the population that would survive under both the natural course and the intervention (the always survivor stratum).
Submitted 20 March, 2024; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Let them have CAKES: A Cutting-Edge Algorithm for Scalable, Efficient, and Exact Search on Big Data
Authors:
Morgan E. Prior,
Thomas J. Howard III,
Oliver McLaughlin,
Terrence Ferguson,
Najib Ishaq,
Noah M. Daniels
Abstract:
The ongoing Big Data explosion has created a demand for efficient and scalable algorithms for similarity search.
Most recent work has focused on approximate $k$-NN search, and while this may be sufficient for some applications, exact $k$-NN search would be ideal for many applications.
We present CAKES, a set of three novel, exact algorithms for $k$-NN search.
CAKES's algorithms are generic over any distance function, and they do not scale with the cardinality or embedding dimension of the dataset, but rather with its metric entropy and fractal dimension.
We test these claims on datasets from the ANN-Benchmarks suite under commonly-used distance functions, as well as on a genomic dataset with Levenshtein distance and a radio-frequency dataset with Dynamic Time Warping distance.
We demonstrate that CAKES exhibits near-constant scaling with cardinality on data conforming to the manifold hypothesis, and has perfect recall on data in metric spaces.
We also demonstrate that CAKES exhibits significantly higher recall than state-of-the-art $k$-NN search algorithms when the distance function is not a metric.
Additionally, we show that indexing and tuning time for CAKES is an order of magnitude, or more, faster than state-of-the-art approaches.
We conclude that CAKES is a highly efficient and scalable algorithm for exact $k$-NN search on Big Data.
We provide a Rust implementation of CAKES.
Submitted 9 January, 2024; v1 submitted 11 September, 2023;
originally announced September 2023.
-
Increasing the Rate of Magnesium Intercalation Underneath Epitaxial Graphene on 6H-SiC(0001)
Authors:
Jimmy C. Kotsakidis,
Marc Currie,
Antonija Grubišić-Čabo,
Anton Tadich,
Rachael L. Myers-Ward,
Matthew DeJarld,
Kevin M. Daniels,
Chang Liu,
Mark T. Edmonds,
Amadeo L. Vázquez de Parga,
Michael S. Fuhrer,
D. Kurt Gaskill
Abstract:
Magnesium intercalated 'quasi-freestanding' bilayer graphene on 6H-SiC(0001) (Mg-QFSBLG) has many favorable properties (e.g., highly n-type doped, relatively stable in ambient conditions). However, intercalation of Mg underneath monolayer graphene is challenging, requiring multiple intercalation steps. Here, we overcome these challenges and subsequently increase the rate of Mg intercalation by laser patterning (ablating) the graphene to form micron-sized discontinuities. We then use low energy electron diffraction to verify Mg-intercalation and conversion to Mg-QFSBLG, and X-ray photoelectron spectroscopy to determine the Mg intercalation rate for patterned and non-patterned samples. By modeling Mg intercalation with the Verhulst equation, we find that the intercalation rate increase for the patterned sample is 4.5$\pm$1.7. Since the edge length of the patterned sample is $\approx$5.2 times that of the non-patterned sample, the model implies that the increased intercalation rate is proportional to the increase in edge length. Moreover, Mg intercalation likely begins at graphene discontinuities in pristine samples (not step edges or flat terraces), where the 2D-like crystal growth of Mg-silicide proceeds. Our laser patterning technique may enable the rapid intercalation of other atomic or molecular species, thereby expanding upon the library of intercalants used to modify the characteristics of graphene, or other 2D materials and heterostructures.
Submitted 27 July, 2023;
originally announced July 2023.
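For reference, the Verhulst (logistic) equation named in the abstract above has the standard form shown below. The symbols are assumptions chosen for illustration: $\theta(t)$ stands for the intercalated fraction, $\theta_0$ for its initial value, $r$ for a rate constant, and $K$ for the saturation value. The abstract does not state the authors' own parametrization.

```latex
\frac{d\theta}{dt} = r\,\theta\left(1 - \frac{\theta}{K}\right),
\qquad
\theta(t) = \frac{K}{1 + \dfrac{K - \theta_0}{\theta_0}\, e^{-r t}}
```

If the reported rate increase of 4.5$\pm$1.7 is read as a ratio of rate constants $r$ between patterned and non-patterned samples (an interpretation, not something the abstract spells out), the edge-length ratio of $\approx$5.2 lies inside the interval from 2.8 to 6.2, which is consistent with the stated proportionality.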
-
Strong transient magnetic fields induced by THz-driven plasmons in graphene disks
Authors:
Jeong Woo Han,
Pavlo Sai,
Dmytro But,
Ece Uykur,
Stephan Winnerl,
Gagan Kumar,
Matthew L. Chin,
Rachael L. Myers-Ward,
Matthew T. Dejarld,
Kevin M. Daniels,
Thomas E. Murphy,
Wojciech Knap,
Martin Mittendorff
Abstract:
Strong circularly polarized excitation opens up the possibility to generate and control effective magnetic fields in solid state systems, e.g., via the optical inverse Faraday effect or the phonon inverse Faraday effect. While these effects rely on material properties that can be tailored only to a limited degree, plasmonic resonances can be fully controlled by choosing proper dimensions and carrier concentrations. Plasmon resonances provide new degrees of freedom that can be used to tune or enhance the light-induced magnetic field in engineered metamaterials. Here we employ graphene disks to demonstrate light-induced transient magnetic fields from a plasmonic circular current with extremely high efficiency. The effective magnetic field at the plasmon resonance frequency of the graphene disks (3.5 THz) is evidenced by a strong (~1°) ultrafast Faraday rotation (~20 ps). In accordance with reference measurements and simulations, we estimated the strength of the induced magnetic field to be on the order of 0.7 T under a moderate pump fluence of about 440 nJ cm$^{-2}$.
Submitted 10 July, 2023;
originally announced July 2023.
-
Dirichlet process mixture models for the Analysis of Repeated Attempt Designs
Authors:
Michael J. Daniels,
Minji Lee,
Wei Feng
Abstract:
In longitudinal studies, it is not uncommon to make multiple attempts to collect a measurement after baseline. Recording whether these attempts are successful provides useful information for the purposes of assessing missing data assumptions. This is because measurements from subjects who provide the data after numerous failed attempts may differ from those who provide the measurement after fewer attempts. Previous models for these designs were parametric and/or did not allow sensitivity analysis. For the former, there are always concerns about model misspecification and for the latter, sensitivity analysis is essential when conducting inference in the presence of missing data. Here, we propose a new approach which minimizes issues with model misspecification by using Bayesian nonparametrics for the observed data distribution. We also introduce a novel approach for identification and sensitivity analysis. We re-analyze the repeated attempts data from a clinical trial involving patients with severe mental illness and conduct simulations to better understand the properties of our approach.
Submitted 8 May, 2023;
originally announced May 2023.
-
A Bayesian Non-parametric Approach for Causal Mediation with a Post-treatment Confounder
Authors:
Woojung Bae,
Michael J. Daniels,
Michael G. Perri
Abstract:
We propose a new Bayesian non-parametric (BNP) method for estimating the causal effects of mediation in the presence of a post-treatment confounder. We specify an enriched Dirichlet process mixture (EDPM) to model the joint distribution of the observed data (outcome, mediator, post-treatment confounders, treatment, and baseline confounders). The proposed BNP model allows more confounder-based clusters than clusters for the outcome and mediator. For identifiability, we use the extended version of the standard sequential ignorability as introduced in \citet{hong2022posttreatment}. The observed data model and causal identification assumptions enable us to estimate and identify the causal effects of mediation, i.e., the natural direct effects (NDE) and indirect effects (NIE). We conduct simulation studies to assess the performance of our proposed method. Furthermore, we apply this approach to evaluate the causal mediation effect in the Rural LITE trial, demonstrating its practical utility in real-world scenarios. Keywords: Causal inference; Enriched Dirichlet process mixture model.
Submitted 8 May, 2023;
originally announced May 2023.
-
Truncation Approximation for Enriched Dirichlet Process Mixture Models
Authors:
Natalie Burns,
Michael J. Daniels
Abstract:
Enriched Dirichlet process mixture (EDPM) models are Bayesian nonparametric models which can be used for nonparametric regression and conditional density estimation and which overcome a key disadvantage of jointly modeling the response and predictors as a Dirichlet process mixture (DPM) model: when there is a large number of predictors, the clusters induced by the DPM will be overwhelmingly determined by the predictors rather than the response. A truncation approximation to a DPM allows a blocked Gibbs sampling algorithm to be used rather than a Polya urn sampling algorithm. The blocked Gibbs sampler offers potential improvement in mixing. The truncation approximation also allows for implementation in standard software (rjags and rstan). In this paper we introduce an analogous truncation approximation for an EDPM. We show that with sufficiently large truncation values in the approximation of the EDP prior, a precise approximation to the EDP is available. We verify that the truncation approximation and blocked Gibbs sampler with minimum truncation values that obtain adequate error bounds achieve similar accuracy to the truncation approximation and blocked Gibbs sampler with large truncation values using a simulated example. Further, we use the simulated example to show that the blocked Gibbs sampler improves upon the mixing in the Polya urn sampler, especially as the number of covariates increases.
Submitted 2 May, 2023;
originally announced May 2023.
-
Strain-programmable van der Waals magnetic tunnel junctions
Authors:
John Cenker,
Dmitry Ovchinnikov,
Harvey Yang,
Daniel G. Chica,
Catherine Zhu,
Jiaqi Cai,
Geoffrey Diederich,
Zhaoyu Liu,
Xiaoyang Zhu,
Xavier Roy,
Ting Cao,
Matthew W. Daniels,
Jiun-Haw Chu,
Di Xiao,
Xiaodong Xu
Abstract:
The magnetic tunnel junction (MTJ) is a backbone device for spintronics. Realizing next-generation energy-efficient MTJs will require operating mechanisms beyond the standard means of applying magnetic fields or large electrical currents. Here, we demonstrate a new concept for programmable MTJ operation via strain control of the magnetic states of CrSBr, a layered antiferromagnetic semiconductor used as the tunnel barrier. Switching the CrSBr from antiferromagnetic to ferromagnetic order generates a giant tunneling magnetoresistance ratio without an external magnetic field at temperatures up to ~140 K. When the static strain is set near the phase transition, applying small strain pulses leads to active flipping of layer magnetization with controlled layer number and thus magnetoresistance states. Further, finely adjusting the static strain to a critical value turns on stochastic switching between metastable states, with a strain-tunable sigmoidal response curve akin to the stochastic binary neuron. Our results highlight the potential of strain-programmable van der Waals MTJs towards spintronic applications, such as magnetic memory, random number generation, and probabilistic and neuromorphic computing.
Submitted 9 January, 2023;
originally announced January 2023.
-
Device Modeling Bias in ReRAM-based Neural Network Simulations
Authors:
Osama Yousuf,
Imtiaz Hossen,
Matthew W. Daniels,
Martin Lueker-Boden,
Andrew Dienstfrey,
Gina C. Adam
Abstract:
Data-driven modeling approaches such as jump tables are promising techniques to model populations of resistive random-access memory (ReRAM) or other emerging memory devices for hardware neural network simulations. As these tables rely on data interpolation, this work explores the open questions about their fidelity in relation to the stochastic device behavior they model. We study how various jump table device models impact the attained network performance estimates, a concept we define as modeling bias. Two methods of jump table device modeling, binning and Optuna-optimized binning, are explored using synthetic data with known distributions for benchmarking purposes, as well as experimental data obtained from TiOx ReRAM devices. Results on a multi-layer perceptron trained on MNIST show that device models based on binning can behave unpredictably, particularly when the device dataset contains few points, sometimes over-promising and sometimes under-promising target network accuracy. This paper also proposes device-level metrics that indicate similar trends with the modeling bias metric at the network level. The proposed approach opens the possibility for future investigations into statistical device models with better performance, as well as experimentally verified modeling bias in different in-memory computing and neural network architectures.
Submitted 28 November, 2022;
originally announced November 2022.
-
A Bayesian nonparametric approach for causal inference with multiple mediators
Authors:
Samrat Roy,
Michael J. Daniels,
Brendan J. Kelly,
Jason Roy
Abstract:
Mediation analysis with contemporaneously observed multiple mediators is an important area of causal inference. Recent approaches for multiple mediators are often based on parametric models and thus may suffer from model misspecification. Also, much of the existing literature either allows estimation of only the joint mediation effect or estimates the joint mediation effect as the sum of individual mediator effects, which is often not a reasonable assumption. In this paper, we propose a methodology which overcomes the two aforementioned drawbacks. Our method is based on a novel Bayesian nonparametric (BNP) approach, wherein the joint distribution of the observed data (outcome, mediators, treatment, and confounders) is modeled flexibly using an enriched Dirichlet process mixture with three levels: the first level characterizing the conditional distribution of the outcome given the mediators, treatment, and the confounders, the second level corresponding to the conditional distribution of each of the mediators given the treatment and the confounders, and the third level corresponding to the distribution of the treatment and the confounders. We use standardization (g-computation) to compute causal mediation effects under three uncheckable assumptions that allow identification of the individual and joint mediation effects. The efficacy of our proposed method is demonstrated with simulations. We apply our proposed method to analyze data from a study of Ventilator-associated Pneumonia (VAP) co-infected patients, where the effect of the abundance of Pseudomonas on VAP infection is suspected to be mediated through antibiotics.
Submitted 29 August, 2022;
originally announced August 2022.
-
Flexible evaluation of surrogacy in Bayesian adaptive platform studies
Authors:
Michael C Sachs,
Erin E Gabriel,
Alessio Crippa,
Michael J Daniels
Abstract:
Trial-level surrogates are useful tools for improving the speed and cost effectiveness of trials, but surrogates that have not been properly evaluated can cause misleading results. The evaluation procedure is often contextual and depends on the type of trial setting. There have been many proposed methods for trial-level surrogate evaluation, but none, to our knowledge, for the specific setting of Bayesian adaptive platform studies. As adaptive studies are becoming more popular, methods for surrogate evaluation using them are needed. These studies also offer a rich data resource for surrogate evaluation that would not normally be possible. However, they also pose a set of statistical issues including heterogeneity of the study population, treatments, implementation, and even potentially the quality of the surrogate. We propose the use of a hierarchical Bayesian semiparametric model for the evaluation of potential surrogates using nonparametric priors for the distribution of true effects based on Dirichlet process mixtures. The motivation for this approach is to flexibly model relationships between the treatment effect on the surrogate and the treatment effect on the outcome and also to identify potential clusters with differential surrogate value in a data-driven manner. In simulations, we find that our proposed method is superior to a simple, but fairly standard, hierarchical Bayesian method. We demonstrate how our method can be used in a simulated illustrative example (based on the ProBio trial), in which we are able to identify clusters where the surrogate is, and is not, useful. We plan to apply our method to the ProBio trial once it is completed.
Submitted 21 August, 2022;
originally announced August 2022.
-
Multi-layer State Evolution Under Random Convolutional Design
Authors:
Max Daniels,
Cédric Gerbelot,
Florent Krzakala,
Lenka Zdeborová
Abstract:
Signal recovery under generative neural network priors has emerged as a promising direction in statistical inference and computational imaging. Theoretical analysis of reconstruction algorithms under generative priors is, however, challenging. For generative priors with fully connected layers and Gaussian i.i.d. weights, this was achieved by the multi-layer approximate message passing (ML-AMP) algorithm via a rigorous state evolution. However, practical generative priors are typically convolutional, allowing for computational benefits and inductive biases, and so the Gaussian i.i.d. weight assumption is very limiting. In this paper, we overcome this limitation and establish the state evolution of ML-AMP for random convolutional layers. We prove in particular that random convolutional layers belong to the same universality class as Gaussian matrices. Our proof technique is of independent interest as it establishes a mapping between convolutional matrices and spatially coupled sensing matrices used in coding theory.
Submitted 12 October, 2022; v1 submitted 26 May, 2022;
originally announced May 2022.
-
MEDFORD: A human and machine readable metadata markup language
Authors:
Polina Shpilker,
John Freeman,
Hailey McKelvie,
Jill Ashey,
Jay-Miguel Fonticella,
Hollie Putnam,
Jane Greenberg,
Lenore J. Cowen,
Alva Couch,
Noah M. Daniels
Abstract:
Reproducibility of research is essential for science. However, in the way modern computational biology research is done, it is easy to lose track of small, but extremely critical, details. Key details, such as the specific version of software used or the iteration of a genome, can easily be lost in the shuffle, or perhaps not noted at all. Much work is being done on the database and storage side of things, ensuring that there exists a space to store experiment-specific details, but current mechanisms for recording details are cumbersome for scientists to use. We propose a new metadata description language, named MEDFORD, in which scientists can record all details relevant to their research. Human-readable, easily-editable, and templatable, MEDFORD serves as a collection point for all notes that a researcher could find relevant to their research, be it for internal use or for future replication. MEDFORD has been applied to coral research, documenting research from RNA-seq analyses to photo collections.
Submitted 16 June, 2022; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Information Borrowing in Regression Models
Authors:
Amy Zhang,
Le Bao,
Michael J. Daniels
Abstract:
Model development often takes data structure, subject matter considerations, model assumptions, and goodness of fit into consideration. To diagnose issues with any of these factors, it can be helpful to understand regression model estimates at a more granular level. We propose a new method for decomposing point estimates from a regression model via weights placed on data clusters. The weights are informed only by the model specification and data availability and thus can be used to explicitly link the effects of data imbalance and model assumptions to actual model estimates. The weight matrix has been understood in linear models as the hat matrix in the existing literature. We extend it to Bayesian hierarchical regression models that incorporate prior information and complicated dependence structures through the covariance among random effects. We show that the model weights, which we call borrowing factors, generalize shrinkage and information borrowing to all regression models. In contrast, the focus of the hat matrix has been mainly on the diagonal elements indicating the amount of leverage. We also provide metrics that summarize the borrowing factors and are practically useful. We present the theoretical properties of the borrowing factors and associated metrics and demonstrate their usage in two examples. By explicitly quantifying borrowing and shrinkage, researchers can better incorporate domain knowledge and evaluate model performance and the impacts of data properties such as data imbalance or influential points.
Submitted 9 January, 2022;
originally announced January 2022.
-
Variable Selection Using Bayesian Additive Regression Trees
Authors:
Chuji Luo,
Michael J. Daniels
Abstract:
Variable selection is an important statistical problem. This problem becomes more challenging when the candidate predictors are of mixed type (e.g. continuous and binary) and impact the response variable in nonlinear and/or non-additive ways. In this paper, we review existing variable selection approaches for the Bayesian additive regression trees (BART) model, a nonparametric regression model, which is flexible enough to capture the interactions between predictors and nonlinear relationships with the response. An emphasis of this review is on the capability of identifying relevant predictors. We also propose two variable importance measures which can be used in a permutation-based variable selection approach, and a backward variable selection procedure for BART. We present simulations demonstrating that our approaches exhibit improved performance in terms of the ability to recover all the relevant predictors in a variety of data settings, compared to existing BART-based variable selection methods.
Submitted 28 December, 2021;
originally announced December 2021.
-
Implementation of a Binary Neural Network on a Passive Array of Magnetic Tunnel Junctions
Authors:
Jonathan M. Goodwill,
Nitin Prasad,
Brian D. Hoskins,
Matthew W. Daniels,
Advait Madhavan,
Lei Wan,
Tiffany S. Santos,
Michael Tran,
Jordan A. Katine,
Patrick M. Braganca,
Mark D. Stiles,
Jabez J. McClelland
Abstract:
The increasing scale of neural networks and their growing application space have produced demand for more energy- and memory-efficient artificial-intelligence-specific hardware. Avenues to mitigate the main issue, the von Neumann bottleneck, include in-memory and near-memory architectures, as well as algorithmic approaches. Here we leverage the low-power and the inherently binary operation of magnetic tunnel junctions (MTJs) to demonstrate neural network hardware inference based on passive arrays of MTJs. In general, transferring a trained network model to hardware for inference is confronted by degradation in performance due to device-to-device variations, write errors, parasitic resistance, and nonidealities in the substrate. To quantify the effect of these hardware realities, we benchmark 300 unique weight matrix solutions of a 2-layer perceptron to classify the Wine dataset for both classification accuracy and write fidelity. Despite device imperfections, we achieve software-equivalent accuracy of up to 95.3 % with proper tuning of network parameters in 15 x 15 MTJ arrays having a range of device sizes. The success of this tuning process shows that new metrics are needed to characterize the performance and quality of networks reproduced in mixed signal hardware.
Submitted 6 May, 2022; v1 submitted 16 December, 2021;
originally announced December 2021.
-
Easy-plane spin Hall nano-oscillators as spiking neurons for neuromorphic computing
Authors:
Danijela Marković,
Matthew W. Daniels,
Pankaj Sethi,
Andrew D. Kent,
Mark D. Stiles,
Julie Grollier
Abstract:
We show analytically using a macrospin approximation that easy-plane spin Hall nano-oscillators excited by a spin-current polarized perpendicularly to the easy-plane have phase dynamics analogous to that of Josephson junctions. Similarly to Josephson junctions, they can reproduce the spiking behavior of biological neurons that is appropriate for neuromorphic computing. We perform micromagnetic simulations of such oscillators realized in the nano-constriction geometry and show that the easy-plane spiking dynamics is preserved in an experimentally feasible architecture. Finally, we simulate two elementary neural network blocks that implement operations essential for neuromorphic computing. First, we show that output spike energies from two neurons can be summed and injected into a following layer neuron. Second, we demonstrate that outputs can be multiplied by synaptic weights implemented by locally modifying the anisotropy.
Submitted 13 October, 2021;
originally announced October 2021.
-
Score-based Generative Neural Networks for Large-Scale Optimal Transport
Authors:
Max Daniels,
Tyler Maunu,
Paul Hand
Abstract:
We consider the fundamental problem of sampling the optimal transport coupling between given source and target distributions. In certain cases, the optimal transport plan takes the form of a one-to-one mapping from the source support to the target support, but learning or even approximating such a map is computationally challenging for large and high-dimensional datasets due to the high cost of linear programming routines and an intrinsic curse of dimensionality. We study instead the Sinkhorn problem, a regularized form of optimal transport whose solutions are couplings between the source and the target distribution. We introduce a novel framework for learning the Sinkhorn coupling between two distributions in the form of a score-based generative model. Conditioned on source data, our procedure iterates Langevin Dynamics to sample target data according to the regularized optimal coupling. Key to this approach is a neural network parametrization of the Sinkhorn problem, and we prove convergence of gradient descent with respect to network parameters in this formulation. We demonstrate its empirical success on a variety of large scale optimal transport tasks.
Submitted 25 January, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
BNPqte: A Bayesian Nonparametric Approach to Causal Inference on Quantiles in R
Authors:
Chuji Luo,
Michael J. Daniels
Abstract:
In this article, we introduce the BNPqte R package which implements the Bayesian nonparametric approach of Xu, Daniels and Winterstein (2018) for estimating quantile treatment effects in observational studies. This approach provides flexible modeling of the distributions of potential outcomes, so it is capable of capturing a variety of underlying relationships among the outcomes, treatments and confounders and estimating multiple quantile treatment effects simultaneously. Specifically, this approach uses a Bayesian additive regression trees (BART) model to estimate the propensity score and a Dirichlet process mixture (DPM) of multivariate normals model to estimate the conditional distribution of the potential outcome given the estimated propensity score. The BNPqte R package provides a fast implementation for this approach by designing efficient R functions for the DPM of multivariate normals model in joint and conditional density estimation. These R functions largely improve the efficiency of the DPM model in density estimation, compared to the popular DPpackage. BART-related R functions in the BNPqte R package are inherited from the BART R package with two modifications on variable importance and split probability. To maximize computational efficiency, the actual sampling and computation for each model are carried out in C++ code. The Armadillo C++ library is also used for fast linear algebra calculations.
Submitted 28 June, 2021;
originally announced June 2021.
-
Mutual control of stochastic switching for two electrically coupled superparamagnetic tunnel junctions
Authors:
Philippe Talatchian,
Matthew W. Daniels,
Advait Madhavan,
Matthew R. Pufall,
Emilie Jué,
William H. Rippard,
Jabez J. McClelland,
Mark D. Stiles
Abstract:
Superparamagnetic tunnel junctions (SMTJs) are promising sources for the randomness required by some compact and energy-efficient computing schemes. Coupling SMTJs gives rise to collective behavior that could be useful for cognitive computing. We use a simple linear electrical circuit to mutually couple two SMTJs through their stochastic electrical transitions. When one SMTJ makes a thermally induced transition, the voltage across both SMTJs changes, modifying the transition rates of both. This coupling leads to significant correlation between the states of the two devices. Using fits to a generalized Néel-Brown model for the individual thermally bistable magnetic devices, we can accurately reproduce the behavior of the coupled devices with a Markov model.
Submitted 19 August, 2021; v1 submitted 7 June, 2021;
originally announced June 2021.
-
Clustered Hierarchical Anomaly and Outlier Detection Algorithms
Authors:
Najib Ishaq,
Thomas J. Howard III,
Noah M. Daniels
Abstract:
Anomaly and outlier detection is a long-standing problem in machine learning. In some cases, anomaly detection is easy, such as when data are drawn from well-characterized distributions such as the Gaussian. However, when data occupy high-dimensional spaces, anomaly detection becomes more difficult. We present CLAM (Clustered Learning of Approximate Manifolds), a manifold mapping technique in any metric space. CLAM begins with a fast hierarchical clustering technique and then induces a graph from the cluster tree, based on overlapping clusters as selected using several geometric and topological features. Using these graphs, we implement CHAODA (Clustered Hierarchical Anomaly and Outlier Detection Algorithms), exploring various properties of the graphs and their constituent clusters to find outliers. CHAODA employs a form of transfer learning based on a training set of datasets, and applies this knowledge to a separate test set of datasets of different cardinalities, dimensionalities, and domains. On 24 publicly available datasets, we compare CHAODA (by measure of ROC AUC) to a variety of state-of-the-art unsupervised anomaly-detection algorithms. Six of the datasets are used for training. CHAODA outperforms other approaches on 16 of the remaining 18 datasets. CLAM and CHAODA scale to large, high-dimensional "big data" anomaly-detection problems, and generalize across datasets and distance functions. Source code for CLAM and CHAODA is freely available on GitHub at https://github.com/URI-ABD/clam.
Submitted 21 November, 2021; v1 submitted 9 February, 2021;
originally announced March 2021.
-
Generator Surgery for Compressed Sensing
Authors:
Niklas Smedemark-Margulies,
Jung Yeon Park,
Max Daniels,
Rose Yu,
Jan-Willem van de Meent,
Paul Hand
Abstract:
Image recovery from compressive measurements requires a signal prior for the images being reconstructed. Recent work has explored the use of deep generative models with low latent dimension as signal priors for such problems. However, their recovery performance is limited by high representation error. We introduce a method for achieving low representation error using generators as signal priors. Using a pre-trained generator, we remove one or more initial blocks at test time and optimize over the new, higher-dimensional latent space to recover a target image. Experiments demonstrate significantly improved reconstruction quality for a variety of network architectures. This approach also works well for out-of-training-distribution images and is competitive with other state-of-the-art methods. Our experiments show that test-time architectural modifications can greatly improve the recovery quality of generator signal priors for compressed sensing.
Submitted 28 February, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Inference for BART with Multinomial Outcomes
Authors:
Yizhen Xu,
Joseph W. Hogan,
Michael J. Daniels,
Rami Kantor,
Ann Mwangi
Abstract:
The multinomial probit Bayesian additive regression trees (MPBART) framework was proposed by Kindo et al. (KD), approximating the latent utilities in the multinomial probit (MNP) model with BART (Chipman et al. 2010). Compared to multinomial logistic models, MNP does not assume independent alternatives and the correlation structure among alternatives can be specified through multivariate Gaussian distributed latent utilities. We introduce two new algorithms for fitting the MPBART and show that the theoretical mixing rates of our proposals are equal or superior to the existing algorithm in KD. Through simulations, we explore the robustness of the methods to the choice of reference level, imbalance in outcome frequencies, and the specifications of prior hyperparameters for the utility error term. The work is motivated by the application of generating posterior predictive distributions for mortality and engagement in care among HIV-positive patients based on electronic health records (EHRs) from the Academic Model Providing Access to Healthcare (AMPATH) in Kenya. In both the application and simulations, we observe better performance using our proposals as compared to KD in terms of MCMC convergence rate and posterior predictive accuracy.
Submitted 12 August, 2022; v1 submitted 17 January, 2021;
originally announced January 2021.
-
Approximate Cross-validated Mean Estimates for Bayesian Hierarchical Regression Models
Authors:
Amy X. Zhang,
Le Bao,
Changcheng Li,
Michael J. Daniels
Abstract:
We introduce a novel procedure for obtaining cross-validated predictive estimates for Bayesian hierarchical regression models (BHRMs). Bayesian hierarchical models are popular for their ability to model complex dependence structures and provide probabilistic uncertainty estimates, but can be computationally expensive to run. Cross-validation (CV) is therefore not a common practice to evaluate the predictive performance of BHRMs. Our method circumvents the need to re-run computationally costly estimation methods for each cross-validation fold and makes CV more feasible for large BHRMs. By conditioning on the variance-covariance parameters, we shift the CV problem from probability-based sampling to a simple and familiar optimization problem. In many cases, this produces estimates which are equivalent to full CV. We provide theoretical results and demonstrate its efficacy on publicly available data and in simulations.
Submitted 27 September, 2024; v1 submitted 28 November, 2020;
originally announced November 2020.
-
A Bayesian semi-parametric approach for inference on the population partly conditional mean from longitudinal data with dropout
Authors:
Maria Josefsson,
Michael J. Daniels,
Sara Pudas
Abstract:
Studies of memory trajectories using longitudinal data often result in highly non-representative samples due to selective study enrollment and attrition. An additional bias comes from practice effects that result in improved or maintained performance due to familiarity with test content or context. These challenges may bias study findings and severely distort the ability to generalize to the target population. In this study we propose an approach for estimating the finite population mean of a longitudinal outcome conditioning on being alive at a specific time point. We develop a flexible Bayesian semi-parametric predictive estimator for population inference when longitudinal auxiliary information is known for the target population. We evaluate sensitivity of the results to untestable assumptions and further compare our approach to other methods used for population inference in a simulation study. The proposed approach is motivated by 15-year longitudinal data from the Betula longitudinal cohort study. We apply our approach to estimate lifespan trajectories in episodic memory, with the aim to generalize findings to a target population.
Submitted 22 March, 2021; v1 submitted 24 November, 2020;
originally announced November 2020.
-
Informed Pooled Testing with Quantitative Assays
Authors:
Tao Liu,
Joseph W Hogan,
Wanning Su,
Yizhen Xu,
Michael J Daniels,
Kantor Rami
Abstract:
Pooled testing is widely used for screening for viral or bacterial infections with low prevalence when individual testing is not cost-efficient. Pooled testing with qualitative assays that give binary results has been well-studied. However, characteristics of pooling with quantitative assays were mostly demonstrated using simulations or empirical studies. We investigate properties of three pooling strategies with quantitative assays: traditional two-stage mini-pooling (MP) (Dorfman, 1943), mini-pooling with deconvolution algorithm (MPA) (May et al., 2010), and marker-assisted MPA (mMPA) (Liu et al., 2017). MPA and mMPA test individuals in a sequence after a positive pool and implement a deconvolution algorithm to determine when testing can cease to ascertain all individual statuses. mMPA uses information from other available markers to determine an optimal order for individual testings. We derive and compare the general statistical properties of the three pooling methods. We show that with a proper pool size, MP, MPA, and mMPA can be more cost-efficient than individual testing, and mMPA is superior to MPA and MP. For diagnostic accuracy, mMPA and MPA have higher specificity and positive predictive value but lower sensitivity and negative predictive value than MP and individual testing. Included in this paper are applications to various simulations and an application for HIV treatment monitoring.
Submitted 31 October, 2020;
originally announced November 2020.
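As a rough illustration of why pool size matters for the two-stage mini-pooling (MP) strategy named in the abstract above, the following sketch computes the expected number of tests per person under the classic Dorfman scheme. It is a simplification and not the paper's method: it assumes a binary, perfectly accurate assay and independent infection statuses, whereas the paper studies quantitative assays. The function names `dorfman_tests_per_person` and `best_pool_size` are hypothetical.

```python
def dorfman_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per individual under two-stage (Dorfman) pooling.

    Assumptions (not from the paper): independent infection statuses and
    a perfectly accurate binary assay.
    """
    if pool_size < 2:
        return 1.0  # a pool of one is just individual testing
    p_pool_negative = (1.0 - prevalence) ** pool_size
    # One pooled test is shared by pool_size people; each person is
    # retested individually only when the pool comes back positive.
    return 1.0 / pool_size + (1.0 - p_pool_negative)


def best_pool_size(prevalence: float, max_size: int = 50) -> int:
    """Pool size in 1..max_size with the fewest expected tests per person."""
    return min(
        range(1, max_size + 1),
        key=lambda k: dorfman_tests_per_person(prevalence, k),
    )


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.5):
        k = best_pool_size(p)
        print(p, k, round(dorfman_tests_per_person(p, k), 4))
```

Under these assumptions, a prevalence of 1% favors a pool size of 11 at roughly 0.196 tests per person, while at 50% prevalence pooling no longer beats individual testing.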
-
Temporal State Machines: Using temporal memory to stitch time-based graph computations
Authors:
Advait Madhavan,
Matthew Daniels,
Mark Stiles
Abstract:
Race logic, an arrival-time-coded logic family, has demonstrated energy and performance improvements for applications ranging from dynamic programming to machine learning. However, the ad hoc mappings of algorithms into hardware result in custom architectures making them difficult to generalize. We systematize the development of race logic by associating it with the mathematical field called tropical algebra. This association between the mathematical primitives of tropical algebra and generalized race logic computations guides the design of temporally coded tropical circuits. It also serves as a framework for expressing high-level timing-based algorithms. This abstraction, when combined with temporal memory, allows for the systematic generalization of race logic by making it possible to partition feed-forward computations into stages and organizing them into a state machine. We leverage analog memristor-based temporal memories to design such a state machine that operates purely on time-coded wavefronts. We implement a version of Dijkstra's algorithm to evaluate this temporal state machine. This demonstration shows the promise of expanding the expressibility of temporal computing to enable it to deliver significant energy and throughput advantages.
Submitted 29 September, 2020;
originally announced September 2020.
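The link between tropical algebra and shortest-path computation mentioned in the abstract above can be sketched in software. In the min-plus (tropical) semiring, "addition" is `min` and "multiplication" is ordinary `+`, so repeatedly applying a min-plus matrix-vector product relaxes path lengths. The sketch below is a Bellman-Ford-style relaxation, not Dijkstra's algorithm and not the authors' circuit; the names `tropical_matvec` and `shortest_paths` are hypothetical, and edge weights are assumed non-negative with zeros on the diagonal.

```python
import math

INF = math.inf  # tropical "zero": the identity element for min


def tropical_matvec(weights, dist):
    """Min-plus product: out[j] = min over i of (dist[i] + weights[i][j])."""
    n = len(dist)
    return [min(dist[i] + weights[i][j] for i in range(n)) for j in range(n)]


def shortest_paths(weights, source):
    """Single-source shortest paths by repeated min-plus products.

    Assumes weights[i][i] == 0 so that known distances are carried forward,
    and INF where there is no edge.
    """
    n = len(weights)
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):  # n - 1 relaxation rounds suffice
        dist = tropical_matvec(weights, dist)
    return dist


if __name__ == "__main__":
    graph = [
        [0, 4, 1],
        [INF, 0, INF],
        [INF, 2, 0],
    ]
    print(shortest_paths(graph, 0))  # [0, 3, 1]
```

Each round corresponds loosely to one feed-forward stage: the `min` picks the earliest arrival and the `+` models a delay along an edge.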
-
Temporal Memory with Magnetic Racetracks
Authors:
Hamed Vakili,
Mohammad Nazmus Sakib,
Samiran Ganguly,
Mircea Stan,
Matthew W. Daniels,
Advait Madhavan,
Mark D. Stiles,
Avik W. Ghosh
Abstract:
Race logic is a relative timing code that represents information in a wavefront of digital edges on a set of wires in order to accelerate dynamic programming and machine learning algorithms. Skyrmions, bubbles, and domain walls are mobile magnetic configurations (solitons) with applications for Boolean data storage. We propose to use current-induced displacement of these solitons on magnetic racetracks as a native temporal memory for race logic computing. Locally synchronized racetracks can spatially store relative timings of digital edges and provide non-destructive read-out. The linear kinematics of skyrmion motion, the tunability and low-voltage asynchronous operation of the proposed device, and the elimination of any need for constant skyrmion nucleation make these magnetic racetracks a natural memory for low-power, high-throughput race logic applications.
Submitted 21 May, 2020;
originally announced May 2020.
-
Magnesium-intercalated graphene on SiC: highly n-doped air-stable bilayer graphene at extreme displacement fields
Authors:
Antonija Grubišić-Čabo,
Jimmy C. Kotsakidis,
Yuefeng Yin,
Anton Tadich,
Matthew Haldon,
Sean Solari,
Iolanda di Bernardo,
Kevin M. Daniels,
John Riley,
Eric Huwald,
Mark T. Edmonds,
Rachael Myers-Ward,
Nikhil V. Medhekar,
D. Kurt Gaskill,
Michael S. Fuhrer
Abstract:
We use angle-resolved photoemission spectroscopy to investigate the electronic structure of bilayer graphene at high n-doping and extreme displacement fields, created by intercalating epitaxial monolayer graphene on silicon carbide with magnesium to form quasi-freestanding bilayer graphene on magnesium-terminated silicon carbide. Angle-resolved photoemission spectroscopy reveals that upon magnesium intercalation, the single massless Dirac band of epitaxial monolayer graphene is transformed into the characteristic massive double-band Dirac spectrum of quasi-freestanding bilayer graphene. Analysis of the spectrum using a simple tight binding model indicates that magnesium intercalation results in an n-type doping of 2.1 $\times$ 10$^{14}$ cm$^{-2}$, creates an extremely high displacement field of 2.6 V/nm, opening a considerable gap of 0.36 eV at the Dirac point. This is further confirmed by density-functional theory calculations for quasi-freestanding bilayer graphene on magnesium-terminated silicon carbide, which show a similar doping level, displacement field and bandgap. Finally, magnesium-intercalated samples are surprisingly robust to ambient conditions; no significant changes in the electronic structure are observed after 30 minutes exposure in air.
Submitted 27 August, 2020; v1 submitted 6 May, 2020;
originally announced May 2020.
-
Memory-efficient training with streaming dimensionality reduction
Authors:
Siyuan Huang,
Brian D. Hoskins,
Matthew W. Daniels,
Mark D. Stiles,
Gina C. Adam
Abstract:
The movement of large quantities of data during the training of a Deep Neural Network presents immense challenges for machine learning workloads. To minimize this overhead, especially on the movement and calculation of gradient information, we introduce streaming batch principal component analysis as an update algorithm. Streaming batch principal component analysis uses stochastic power iterations to generate a stochastic k-rank approximation of the network gradient. We demonstrate that the low rank updates produced by streaming batch principal component analysis can effectively train convolutional neural networks on a variety of common datasets, with performance comparable to standard mini batch gradient descent. These results can lead to both improvements in the design of application specific integrated circuits for deep learning and in the speed of synchronization of machine learning models trained with data parallelism.
Submitted 24 April, 2020;
originally announced April 2020.
-
Freestanding n-Doped Graphene via Intercalation of Calcium and Magnesium into the Buffer Layer - SiC(0001) Interface
Authors:
Jimmy C. Kotsakidis,
Antonija Grubišić-Čabo,
Yuefeng Yin,
Anton Tadich,
Rachael L. Myers-Ward,
Matthew Dejarld,
Shojan P. Pavunny,
Marc Currie,
Kevin M. Daniels,
Chang Liu,
Mark T. Edmonds,
Nikhil V. Medhekar,
D. Kurt Gaskill,
Amadeo L. Vazquez de Parga,
Michael S. Fuhrer
Abstract:
The intercalation of epitaxial graphene on SiC(0001) with Ca has been studied extensively, yet precisely where the Ca resides remains elusive. Furthermore, the intercalation of Mg underneath epitaxial graphene on SiC(0001) has not been reported. Here, we use low energy electron diffraction, x-ray photoelectron spectroscopy, secondary electron cut-off photoemission and scanning tunneling microscopy to elucidate the physical and electronic structure of both Ca- and Mg-intercalated epitaxial graphene on 6H-SiC(0001). We find that Ca intercalates underneath the buffer layer and bonds to the Si-terminated SiC surface, breaking the C-Si bonds of the buffer layer i.e. 'freestanding' the buffer layer to form Ca-intercalated quasi-freestanding bilayer graphene (Ca-QFSBLG). The situation is similar for the Mg-intercalation of epitaxial graphene on SiC(0001), where an ordered Mg-terminated reconstruction at the SiC surface and Mg bonds to the Si-terminated SiC surface are formed, resulting in Mg-intercalated quasi-freestanding bilayer graphene (Mg-QFSBLG). Ca-intercalation underneath the buffer layer has not been considered in previous studies of Ca-intercalated epitaxial graphene. Furthermore, we find no evidence that either Ca or Mg intercalates between graphene layers. However, we do find that both Ca-QFSBLG and Mg-QFSBLG exhibit very low workfunctions of 3.68 and 3.78 eV, respectively, indicating high n-type doping. Upon exposure to ambient conditions, we find Ca-QFSBLG degrades rapidly, whereas Mg-QFSBLG remains remarkably stable.
Submitted 13 July, 2020; v1 submitted 3 April, 2020;
originally announced April 2020.
-
Reducing the Representation Error of GAN Image Priors Using the Deep Decoder
Authors:
Max Daniels,
Paul Hand,
Reinhard Heckel
Abstract:
Generative models, such as GANs, learn an explicit low-dimensional representation of a particular class of images, and so they may be used as natural image priors for solving inverse problems such as image restoration and compressive sensing. GAN priors have demonstrated impressive performance on these tasks, but they can exhibit substantial representation error for both in-distribution and out-of-distribution images, because of the mismatch between the learned, approximate image distribution and the data generating distribution. In this paper, we demonstrate a method for reducing the representation error of GAN priors by modeling images as the linear combination of a GAN prior with a Deep Decoder. The deep decoder is an underparameterized and most importantly unlearned natural signal model similar to the Deep Image Prior. No knowledge of the specific inverse problem is needed in the training of the GAN underlying our method. For compressive sensing and image superresolution, our hybrid model exhibits consistently higher PSNRs than both the GAN priors and Deep Decoder separately, both on in-distribution and out-of-distribution images. This model provides a method for extensibly and cheaply leveraging both the benefits of learned and unlearned image recovery priors in inverse problems.
Submitted 23 January, 2020;
originally announced January 2020.
-
Energy-efficient stochastic computing with superparamagnetic tunnel junctions
Authors:
Matthew W. Daniels,
Advait Madhavan,
Philippe Talatchian,
Alice Mizrahi,
Mark D. Stiles
Abstract:
Superparamagnetic tunnel junctions (SMTJs) have emerged as a competitive, realistic nanotechnology to support novel forms of stochastic computation in CMOS-compatible platforms. One of their applications is to generate random bitstreams suitable for use in stochastic computing implementations. We describe a method for digitally programmable bitstream generation based on pre-charge sense amplifiers. This generator is significantly more energy efficient than SMTJ-based bitstream generators that tune probabilities with spin currents and a factor of two more efficient than related CMOS-based implementations. The true randomness of these bitstream generators allows us to use them as the fundamental units of a novel neural network architecture. To take advantage of the potential savings, we codesign the algorithm with the circuit, rather than directly transcribing a classical neural network into hardware. The flexibility of the neural network mathematics allows us to adapt the network to the explicitly energy efficient choices we make at the device level. The result is a convolutional neural network design operating at $\approx$ 150 nJ per inference with 97 % performance on MNIST -- a factor of 1.4 to 7.7 improvement in energy efficiency over comparable proposals in the recent literature.
Submitted 6 March, 2020; v1 submitted 25 November, 2019;
originally announced November 2019.
-
Clustered Hierarchical Entropy-Scaling Search of Astronomical and Biological Data
Authors:
Najib Ishaq,
George Student,
Noah M. Daniels
Abstract:
Both astronomy and biology are experiencing explosive growth of data, resulting in a "big data" problem that stands in the way of a "big data" opportunity for discovery. One common question asked of such data is that of approximate search ($ρ-$nearest neighbors search). We present a hierarchical search algorithm for such data sets that takes advantage of particular geometric properties apparent in both astronomical and biological data sets, namely the metric entropy and fractal dimensionality of the data. We present CHESS (Clustered Hierarchical Entropy-Scaling Search), a search tool with virtually no loss in specificity or sensitivity, demonstrating a $13.6\times$ speedup over linear search on the Sloan Digital Sky Survey's APOGEE data set and a $68\times$ speedup on the GreenGenes 16S metagenomic data set, as well as asymptotically fewer distance comparisons on APOGEE when compared to the FALCONN locality-sensitive hashing library. CHESS demonstrates an asymptotic complexity not directly dependent on data set size, and is in practice at least an order of magnitude faster than linear search by performing fewer distance comparisons. Unlike locality-sensitive hashing approaches, CHESS can work with any user-defined distance function. CHESS also allows for implicit data compression, which we demonstrate on the APOGEE data set. We also discuss an extension allowing for efficient k-nearest neighbors search.
Submitted 10 November, 2019; v1 submitted 22 August, 2019;
originally announced August 2019.
-
Invertible generative models for inverse problems: mitigating representation error and dataset bias
Authors:
Muhammad Asim,
Max Daniels,
Oscar Leong,
Ali Ahmed,
Paul Hand
Abstract:
Trained generative models have shown remarkable performance as priors for inverse problems in imaging -- for example, Generative Adversarial Network priors permit recovery of test images from 5-10x fewer measurements than sparsity priors. Unfortunately, these models may be unable to represent any particular image because of architectural choices, mode collapse, and bias in the training dataset. In this paper, we demonstrate that invertible neural networks, which have zero representation error by design, can be effective natural signal priors at inverse problems such as denoising, compressive sensing, and inpainting. Given a trained generative model, we study the empirical risk formulation of the desired inverse problem under a regularization that promotes high likelihood images, either directly by penalization or algorithmically by initialization. For compressive sensing, invertible priors can yield higher accuracy than sparsity priors across almost all undersampling ratios, and due to their lack of representation error, invertible priors can yield better reconstructions than GAN priors for images that have rare features of variation within the biased training set, including out-of-distribution natural images. We additionally compare performance for compressive sensing to unlearned methods, such as the deep decoder, and we establish theoretical bounds on expected recovery error in the case of a linear invertible model.
Submitted 12 July, 2020; v1 submitted 28 May, 2019;
originally announced May 2019.
-
A Bayesian Nonparametric Approach for Evaluating the Causal Effect of Treatment in Randomized Trials with Semi-Competing Risks
Authors:
Yanxun Xu,
Daniel Scharfstein,
Peter Müller,
Michael Daniels
Abstract:
We develop a Bayesian nonparametric (BNP) approach to evaluate the causal effect of treatment in a randomized trial where a nonterminal event may be censored by a terminal event, but not vice versa (i.e., semi-competing risks). Based on the idea of principal stratification, we define a novel estimand for the causal effect of treatment on the nonterminal event. We introduce identification assumptions, indexed by a sensitivity parameter, and show how to draw inference using our BNP approach. We conduct simulation studies and illustrate our methodology using data from a brain cancer trial.
Submitted 21 July, 2019; v1 submitted 20 March, 2019;
originally announced March 2019.
-
Streaming Batch Eigenupdates for Hardware Neuromorphic Networks
Authors:
Brian D. Hoskins,
Matthew W. Daniels,
Siyuan Huang,
Advait Madhavan,
Gina C. Adam,
Nikolai Zhitenev,
Jabez J. McClelland,
Mark D. Stiles
Abstract:
Neuromorphic networks based on nanodevices, such as metal oxide memristors, phase change memories, and flash memory cells, have generated considerable interest for their increased energy efficiency and density in comparison to graphics processing units (GPUs) and central processing units (CPUs). Though immense acceleration of the training process can be achieved by leveraging the fact that the time complexity of training does not scale with the network size, it is limited by the space complexity of stochastic gradient descent, which grows quadratically. The main objective of this work is to reduce this space complexity by using low-rank approximations of stochastic gradient descent. This low spatial complexity combined with streaming methods allows for significant reductions in memory and compute overhead, opening the doors for improvements in area, time and energy efficiency of training. We refer to this algorithm, and the architecture that implements it, as the streaming batch eigenupdate (SBE) approach.
Submitted 4 March, 2019;
originally announced March 2019.
-
Bayesian semi-parametric G-computation for causal inference in a cohort study with MNAR dropout and death
Authors:
Maria Josefsson,
Michael J. Daniels
Abstract:
Causal inference with observational longitudinal data and time-varying exposures is often complicated by time-dependent confounding and attrition. The G-computation formula is one approach for estimating a causal effect in this setting. The parametric modeling approach typically used in practice relies on strong modeling assumptions for valid inference, and moreover depends on an assumption of missing at random, which is not appropriate when the missingness is missing not at random (MNAR) or due to death. In this work we develop a flexible Bayesian semi-parametric G-computation approach for assessing the causal effect on the subpopulation that would survive irrespective of exposure, in a setting with MNAR dropout. The approach is to specify models for the observed data using Bayesian additive regression trees, and then use assumptions with embedded sensitivity parameters to identify and estimate the causal effect. The proposed approach is motivated by a longitudinal cohort study on cognition, health, and aging, and we apply our approach to study the effect of becoming a widow on memory. We also compare our approach to several standard methods.
Submitted 12 October, 2020; v1 submitted 27 February, 2019;
originally announced February 2019.
-
Topological spin Hall effects and tunable skyrmion Hall effects in uniaxial antiferromagnetic insulators
Authors:
Matthew W. Daniels,
Weichao Yu,
Ran Cheng,
Jiang Xiao,
Di Xiao
Abstract:
Recent advances in the physics of current-driven antiferromagnetic skyrmions have observed the absence of a Magnus force. We outline the symmetry reasons for this phenomenon, and show that this cancellation will fail in the case of spin polarized currents. Pairing micromagnetic simulations with semiclassical spin wave transport theory, we demonstrate that skyrmions produce a spin-polarized transverse magnon current, and that spin-polarized magnon currents can in turn produce transverse motion of antiferromagnetic skyrmions. We examine qualitative differences in the frequency dependence of the skyrmion Hall angle between ferromagnetic and antiferromagnetic cases, and close by proposing a simple skyrmion-based magnonic device for demultiplexing of spin channels.
Submitted 30 May, 2019; v1 submitted 25 February, 2019;
originally announced February 2019.
-
Bayesian Methods for Multiple Mediators: Relating Principal Stratification and Causal Mediation in the Analysis of Power Plant Emission Controls
Authors:
Chanmin Kim,
Michael Daniels,
Joseph Hogan,
Christine Choirat,
Corwin Zigler
Abstract:
Emission control technologies installed on power plants are a key feature of many air pollution regulations in the US. While such regulations are predicated on the presumed relationships between emissions, ambient air pollution, and human health, many of these relationships have never been empirically verified. The goal of this paper is to develop new statistical methods to quantify these relationships. We frame this problem as one of mediation analysis to evaluate the extent to which the effect of a particular control technology on ambient pollution is mediated through causal effects on power plant emissions. Since power plants emit various compounds that contribute to ambient pollution, we develop new methods for multiple intermediate variables that are measured contemporaneously, may interact with one another, and may exhibit joint mediating effects. Specifically, we propose new methods leveraging two related frameworks for causal inference in the presence of mediating variables: principal stratification and causal mediation analysis. We define principal effects based on multiple mediators, and also introduce a new decomposition of the total effect of an intervention on ambient pollution into the natural direct effect and natural indirect effects for all combinations of mediators. Both approaches are anchored to the same observed-data models, which we specify with Bayesian nonparametric techniques. We provide assumptions for estimating principal causal effects, then augment these with an additional assumption required for causal mediation analysis. The two analyses, interpreted in tandem, provide the first empirical investigation of the presumed causal pathways that motivate important air quality regulatory policies.
Submitted 16 February, 2019;
originally announced February 2019.
-
Bayesian Longitudinal Causal Inference in the Analysis of the Public Health Impact of Pollutant Emissions
Authors:
Chanmin Kim,
Corwin M Zigler,
Michael J Daniels,
Christine Choirat,
Jason A Roy
Abstract:
Pollutant emissions from coal-burning power plants have been deemed to adversely impact ambient air quality and public health conditions. Despite the noticeable reduction in emissions and the improvement of air quality since the Clean Air Act (CAA) became the law, the public-health benefits from changes in emissions have not been widely evaluated yet. In terms of the chain of accountability (HEI Accountability Working Group, 2003), the link between pollutant emissions from the power plants (SO2) and public health conditions (respiratory diseases) accounting for changes in ambient air quality (PM2.5) is unknown. We provide the first assessment of the longitudinal effect of specific pollutant emission (SO2) on public health outcomes that is mediated through changes in the ambient air quality. It is of particular interest to examine the extent to which the effect that is mediated through changes in local ambient air quality differs from year to year. In this paper, we propose a Bayesian approach to estimate novel causal estimands: time-varying mediation effects in the presence of mediators and responses measured every year. We replace the commonly invoked sequential ignorability assumption with a new set of assumptions which are sufficient to identify the distributions of the natural indirect and direct effects in this setting.
Submitted 3 January, 2019;
originally announced January 2019.
-
Spin chirality fluctuation in two-dimensional ferromagnets with perpendicular anisotropy
Authors:
Wenbo Wang,
Matthew W. Daniels,
Zhaoliang Liao,
Yifan Zhao,
Jun Wang,
Gertjan Koster,
Guus Rijnders,
Cui-zu Chang,
Di Xiao,
Weida Wu
Abstract:
Non-coplanar spin textures with scalar spin chirality can generate an effective magnetic field that deflects the motion of charge carriers, resulting in the topological Hall effect (THE), a powerful probe of the ground state and low-energy excitations of correlated systems. However, spin chirality fluctuation in two-dimensional ferromagnets with perpendicular anisotropy has not been considered in prior studies. Herein, we report direct evidence of universal spin chirality fluctuation by probing the THE above the transition temperatures in two different ferromagnetic ultra-thin films, SrRuO$_3$ and V-doped Sb$_2$Te$_3$. The temperature, magnetic field, thickness, and carrier type dependences of the THE signal, along with our Monte-Carlo simulations, unambiguously demonstrate that the spin chirality fluctuation is a universal phenomenon in two-dimensional Ising ferromagnets. Our discovery opens a new paradigm of exploring the spin chirality with topological Hall transport in two-dimensional magnets and beyond.
Submitted 18 July, 2019; v1 submitted 17 December, 2018;
originally announced December 2018.
-
Classification using Ensemble Learning under Weighted Misclassification Loss
Authors:
Yizhen Xu,
Tao Liu,
Michael J. Daniels,
Rami Kantor,
Ann Mwangi,
Joseph W. Hogan
Abstract:
Binary classification rules based on covariates typically depend on simple loss functions such as zero-one misclassification. Some cases may require more complex loss functions. For example, individual-level monitoring of HIV-infected individuals on antiretroviral therapy (ART) requires periodic assessment of treatment failure, defined as having a viral load (VL) value above a certain threshold. In some resource-limited settings, VL tests may be limited by cost or technology, and diagnoses are based on other clinical markers. Depending on the scenario, a higher premium may be placed on avoiding false positives, which bring greater cost and reduced treatment options. Here, the optimal rule is determined by minimizing a weighted misclassification loss/risk.
We propose a method for finding and cross-validating optimal binary classification rules under weighted misclassification loss. We focus on rules comprising a prediction score and an associated threshold, where the score is derived using an ensemble learner. Simulations and examples show that our method, which derives the score and threshold jointly, more accurately estimates overall risk and has better operating characteristics compared with methods that derive the score first and the cutoff conditionally on the score, especially for finite samples.
Submitted 10 May, 2019; v1 submitted 16 December, 2018;
originally announced December 2018.
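To make the notion of weighted misclassification loss in the abstract above concrete, here is a minimal sketch that picks a cutoff for a fixed score by minimizing the empirical weighted loss. This is the simpler "score first, cutoff conditionally" baseline that the abstract contrasts with, not the authors' joint method; the function names, the cost parameters `fp_cost` and `fn_cost`, and the toy data are all assumptions made for illustration.

```python
def weighted_loss(scores, labels, threshold, fp_cost, fn_cost):
    """Average weighted loss of the rule 'predict positive if score >= threshold'."""
    total = 0.0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 0:
            total += fp_cost  # false positive
        elif not predicted_positive and label == 1:
            total += fn_cost  # false negative
    return total / len(scores)


def best_threshold(scores, labels, fp_cost, fn_cost):
    """Cutoff among the observed scores (or 'never positive') with the lowest loss."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: weighted_loss(scores, labels, t, fp_cost, fn_cost),
    )


if __name__ == "__main__":
    scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical prediction scores
    labels = [0, 0, 1, 1]           # hypothetical true classes
    print(best_threshold(scores, labels, fp_cost=1, fn_cost=1))
    print(best_threshold(scores, labels, fp_cost=5, fn_cost=1))
```

Raising `fp_cost` relative to `fn_cost` pushes the chosen cutoff upward, which mirrors the scenario in the abstract where false positives carry the higher premium.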