-
Magnetic susceptibility and entanglement of three interacting qubits under magnetic field and anisotropy
Authors:
Bastian Castorene,
Francisco J. Peña,
Ariel Norambuena,
Sergio E. Ulloa,
Cristobal Araya,
Patricio Vargas
Abstract:
This work investigates a system of three entangled qubits within the XXX model, subjected to an external magnetic field in the $z$-direction and incorporating an anisotropy term along the $y$-axis. We explore the thermodynamics of the system by calculating its magnetic susceptibility and analyzing how this quantity encodes information about entanglement. By deriving rigorous bounds for susceptibility, we demonstrate that their violation serves as an entanglement witness. Our results show that anisotropy enhances entanglement, extending the temperature range over which it persists. Additionally, by tracing over the degrees of freedom of two qubits, we examine the reduced density matrix of the remaining qubit and find that its entropy under the influence of the magnetic field can be mapped to an effective thermal bath at $(B,K) > 0$ K.
Submitted 21 October, 2024;
originally announced October 2024.
-
Performance Evaluation of Deep Learning and Transformer Models Using Multimodal Data for Breast Cancer Classification
Authors:
Sadam Hussain,
Mansoor Ali,
Usman Naseem,
Beatriz Alejandra Bosques Palomo,
Mario Alexis Monsivais Molina,
Jorge Alberto Garza Abdala,
Daly Betzabeth Avendano Avalos,
Servando Cardona-Huerta,
T. Aaron Gulliver,
Jose Gerardo Tamez Pena
Abstract:
Rising breast cancer (BC) occurrence and mortality are major global concerns for women. Deep learning (DL) has demonstrated superior diagnostic performance in BC classification compared to human expert readers. However, the predominant use of unimodal (digital mammography) features may limit the current performance of diagnostic models. To address this, we collected a novel multimodal dataset comprising both imaging and textual data. This study proposes a multimodal DL architecture for BC classification, utilising images (mammograms; four views) and textual data (radiological reports) from our new in-house dataset. Various augmentation techniques were applied to enhance the training data size for both imaging and textual data. We explored the performance of eleven SOTA DL architectures (VGG16, VGG19, ResNet34, ResNet50, MobileNet-v3, EffNet-b0, EffNet-b1, EffNet-b2, EffNet-b3, EffNet-b7, and Vision Transformer (ViT)) as imaging feature extractors. For textual feature extraction, we utilised either artificial neural networks (ANNs) or long short-term memory (LSTM) networks. The combined imaging and textual features were then fed into an ANN classifier for BC classification, using the late fusion technique. We evaluated different feature extractor and classifier arrangements. The VGG19 and ANN combination achieved the highest accuracy of 0.951. For precision, the VGG19 and ANN combination again surpassed the other CNN architectures paired with LSTM or ANN, achieving a score of 0.95. The best sensitivity score of 0.903 was achieved by the VGG16+LSTM. The highest F1 score of 0.931 was achieved by VGG19+LSTM. Only the VGG16+LSTM achieved the best area under the curve (AUC) of 0.937, with VGG16+LSTM closely following with a 0.929 AUC score.
Submitted 14 October, 2024;
originally announced October 2024.
-
Sharp Bounds of the Causal Effect Under MNAR Confounding
Authors:
Jose M. Peña
Abstract:
We report bounds for any contrast between the probabilities of the counterfactual outcome under exposure and non-exposure when the confounders are missing not at random. We assume that the missingness mechanism is outcome-independent, and prove that our bounds are arbitrarily sharp, i.e., practically attainable or logically possible.
Submitted 9 October, 2024;
originally announced October 2024.
-
Comparative study of regression vs pairwise models for surrogate-based heuristic optimisation
Authors:
Pablo S. Naharro,
Pablo Toharia,
Antonio LaTorre,
José-María Peña
Abstract:
Heuristic optimisation algorithms explore the search space by sampling solutions, evaluating their fitness, and biasing the search in the direction of promising solutions. However, in many cases, this fitness function involves executing expensive computational calculations, drastically reducing the reasonable number of evaluations. In this context, surrogate models have emerged as an excellent alternative to alleviate these computational problems. This paper addresses the formulation of surrogate problems as both regression models that approximate fitness (surface surrogate models) and a novel way to connect classification models (pairwise surrogate models). The pairwise approach can be directly exploited by some algorithms, such as Differential Evolution, in which the fitness value is not actually needed to drive the search, and it is sufficient to know whether a solution is better than another one or not. Based on these modelling approaches, we have conducted a multidimensional analysis of surrogate models under different configurations: different machine learning algorithms (regularised regression, neural networks, decision trees, boosting methods, and random forests), different surrogate strategies (encouraging diversity or relaxing prediction thresholds), and compare them for both surface and pairwise surrogate models. The experimental part of the article includes the benchmark problems already proposed for the SOCO2011 competition in continuous optimisation and a simulation problem included in the recent GECCO2021 Industrial Challenge. This paper shows that the performance of the overall search, when using online machine learning-based surrogate models, depends not only on the accuracy of the predictive model but also on both the kind of bias towards positive or negative cases and how the optimisation uses those predictions to decide whether to execute the actual fitness function.
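The remark that Differential Evolution only needs to know whether one solution is better than another can be illustrated with a minimal sketch. It assumes a textbook DE/rand/1/bin scheme; the names `de_pairwise`, `better`, `sphere`, and `true_better` are hypothetical and not taken from the paper. Here the comparator calls the real objective, whereas a pairwise surrogate would replace `true_better` with a trained classifier.

```python
import random


def de_pairwise(better, population, generations=100, f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin driven only by a pairwise comparator better(u, v) -> bool."""
    rng = random.Random(seed)
    pop = [list(ind) for ind in population]
    n, dim = len(pop), len(pop[0])  # requires n >= 4
    for _ in range(generations):
        for i in range(n):
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(n) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = [
                pop[a][d] + f * (pop[b][d] - pop[c][d])
                if (rng.random() < cr or d == forced) else pop[i][d]
                for d in range(dim)
            ]
            # Selection needs no fitness value, only the comparison outcome.
            if better(trial, pop[i]):
                pop[i] = trial
    return pop


def sphere(x):
    """Toy objective standing in for an expensive fitness function."""
    return sum(v * v for v in x)


def true_better(u, v):
    """Exact comparator; a pairwise surrogate would predict this instead."""
    return sphere(u) < sphere(v)


init_rng = random.Random(1)
initial = [[init_rng.uniform(-5, 5) for _ in range(3)] for _ in range(12)]
final = de_pairwise(true_better, initial)
```

Because a trial vector replaces its target only when the comparator prefers it, each slot of the population can never get worse under the exact comparator; with a learned comparator that guarantee no longer holds, which is one way prediction bias can affect the search.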
Submitted 4 October, 2024;
originally announced October 2024.
-
3D Segmentation of Neuronal Nuclei and Cell-Type Identification using Multi-channel Information
Authors:
Antonio LaTorre,
Lidia Alonso-Nanclares,
José María Peña,
Javier De Felipe
Abstract:
Background: Analyzing images to accurately estimate the number of different cell types in the brain using automatic methods is a major objective in neuroscience. The automatic and selective detection and segmentation of neurons would be an important step in neuroanatomical studies. New method: We present a method to improve the 3D reconstruction of neuronal nuclei that allows their segmentation, excluding the nuclei of non-neuronal cell types. Results: We have tested the algorithm on stacks of images from rat neocortex, in a complex scenario (large stacks of images, uneven staining, and three different channels to visualize different cellular markers). It was able to provide a good identification ratio of neuronal nuclei and a 3D segmentation. Comparison with Existing Methods: Many automatic tools are in fact currently available, but different methods yield different cell count estimations, even in the same brain regions, due to differences in the labeling and imaging techniques, as well as in the algorithms used to detect cells. Moreover, some of the available automated software methods have provided estimations of cell numbers that have been reported to be inaccurate or inconsistent after evaluation by neuroanatomists. Conclusions: It is critical to have a tool for automatic segmentation that allows discrimination between neurons, glial cells and perivascular cells. It would greatly speed up a task that is currently performed manually and would allow the cell counting to be systematic, avoiding human bias. Furthermore, the resulting 3D reconstructions of different cell types can be used to generate models of the spatial distribution of cells.
Submitted 4 October, 2024;
originally announced October 2024.
-
Syndeo: Portable Ray Clusters with Secure Containerization
Authors:
William Li,
Rodney S. Lafuente Mercado,
Jaime D. Pena,
Ross E. Allen
Abstract:
We present Syndeo: a software framework for container orchestration of Ray on Slurm. In general, the idea behind Syndeo is to write code once and deploy anywhere. Specifically, Syndeo is designed to address the issues of portability, scalability, and security for parallel computing. The design is portable because the containerized Ray code can be re-deployed on Amazon Web Services, Microsoft Azure, Google Cloud, or Alibaba Cloud. The process is scalable because we optimize for multi-node, high-throughput computing. The process is secure because users are forced to operate with unprivileged profiles, meaning administrators control the access permissions. We demonstrate Syndeo's portable, scalable, and secure design by deploying containerized parallel workflows on Slurm, which Ray does not officially support.
Submitted 25 September, 2024;
originally announced September 2024.
-
Model calibration using a parallel differential evolution algorithm in computational neuroscience: simulation of stretch induced nerve deficit
Authors:
Antonio LaTorre,
Man Ting Kwong,
Julián A. García-Grajales,
Riyi Shi,
Antoine Jérusalem,
José-María Peña
Abstract:
Neuronal damage, in the form of both brain and spinal cord injuries, is one of the major causes of disability and death in young adults worldwide. One way to assess the direct damage occurring after a mechanical insult is the simulation of the neuronal cells' functional deficits following the mechanical event. In this study, we use a coupled mechanical-electrophysiological model with several free parameters that are required to be calibrated against experimental results. The calibration is carried out by means of an evolutionary algorithm (differential evolution, DE) that needs to evaluate each configuration of parameters on six different damage cases, each of them taking several minutes to compute. To minimise the simulation time of the parameter tuning for the DE, the stretch of one unique fixed-diameter axon with a simplified triggering process is used to speed up the calculations. The model is then leveraged for the parameter optimization of the more realistic bundle of independent axons, an impractical configuration to run on a single-processor computer. To this end, we have developed a parallel implementation based on OpenMP that runs on a multi-processor taking advantage of all the available computational power. The parallel DE algorithm obtains good results, outperforming the best effort achieved by published manual calibration, in a fraction of the time. While not being able to fully capture the experimental results, the resulting nerve model provides a complex averaging framework for nerve damage simulation able to simulate gradual axonal functional alteration in a bundle.
Submitted 19 September, 2024;
originally announced September 2024.
-
STL: Still Tricky Logic (for System Validation, Even When Showing Your Work)
Authors:
Isabelle Hurley,
Rohan Paleja,
Ashley Suh,
Jaime D. Peña,
Ho Chit Siu
Abstract:
As learned control policies become increasingly common in autonomous systems, there is increasing need to ensure that they are interpretable and can be checked by human stakeholders. Formal specifications have been proposed as ways to produce human-interpretable policies for autonomous systems that can still be learned from examples. Previous work showed that despite claims of interpretability, humans are unable to use formal specifications presented in a variety of ways to validate even simple robot behaviors. This work uses active learning, a standard pedagogical method, to attempt to improve humans' ability to validate policies in signal temporal logic (STL). Results show that overall validation accuracy is not high, at $65\% \pm 15\%$ (mean $\pm$ standard deviation), and that the three conditions of no active learning, active learning, and active learning with feedback do not significantly differ from each other. Our results suggest that the utility of formal specifications for human interpretability is still unsupported but point to other avenues of development which may enable improvements in system validation.
Submitted 2 July, 2024;
originally announced July 2024.
-
Fast Convergence of Frank-Wolfe algorithms on polytopes
Authors:
Elias Wirth,
Javier Pena,
Sebastian Pokutta
Abstract:
We provide a template to derive convergence rates for the following popular versions of the Frank-Wolfe algorithm on polytopes: vanilla Frank-Wolfe, Frank-Wolfe with away steps, Frank-Wolfe with blended pairwise steps, and Frank-Wolfe with in-face directions. Our template shows how the convergence rates follow from two affine-invariant properties of the problem, namely, error bound and extended curvature. These properties depend solely on the polytope and objective function but not on any affine-dependent object like norms. For each one of the above algorithms, we derive rates of convergence ranging from sublinear to linear depending on the degree of the error bound.
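As a point of reference for the first variant named above, the vanilla Frank-Wolfe iteration on a polytope can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's template: the polytope is taken to be the probability simplex, the step size is the textbook open-loop rule $2/(k+2)$, and the names `frank_wolfe_simplex`, `grad`, `objective`, and `target` are hypothetical.

```python
def frank_wolfe_simplex(grad, n, steps=2000):
    """Vanilla Frank-Wolfe over the probability simplex in R^n."""
    x = [1.0 / n] * n  # start at the barycentre, a feasible point
    for k in range(steps):
        g = grad(x)
        # Linear minimisation oracle: on the simplex, the minimiser of <g, v>
        # is the vertex e_i with the smallest gradient coordinate.
        i = min(range(n), key=g.__getitem__)
        gamma = 2.0 / (k + 2)  # open-loop step size (assumed here)
        # Convex combination x <- (1 - gamma) * x + gamma * e_i stays feasible.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Toy problem: minimise 0.5 * ||x - target||^2, with target inside the simplex.
target = [0.2, 0.5, 0.3]


def grad(x):
    return [xj - tj for xj, tj in zip(x, target)]


def objective(x):
    return 0.5 * sum((xj - tj) ** 2 for xj, tj in zip(x, target))


x_fw = frank_wolfe_simplex(grad, n=3)
```

Every iterate is a convex combination of simplex vertices, so no projection step is needed. The away-step, blended pairwise, and in-face variants change how the direction is chosen and are not shown here.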
Submitted 26 June, 2024;
originally announced June 2024.
-
Simple yet Sharp Sensitivity Analysis for Any Contrast Under Unmeasured Confounding
Authors:
Jose M. Peña
Abstract:
We extend our previous work on sensitivity analysis for the risk ratio and difference contrasts under unmeasured confounding to any contrast. We prove that the bounds produced are still arbitrarily sharp, i.e. practically attainable. We illustrate the usability of the bounds with real data.
Submitted 12 June, 2024;
originally announced June 2024.
-
Why Would You Suggest That? Human Trust in Language Model Responses
Authors:
Manasi Sharma,
Ho Chit Siu,
Rohan Paleja,
Jaime D. Peña
Abstract:
The emergence of Large Language Models (LLMs) has revealed a growing need for human-AI collaboration, especially in creative decision-making scenarios where trust and reliance are paramount. Through human studies and model evaluations on the open-ended News Headline Generation task from the LaMP benchmark, we analyze how the framing and presence of explanations affect user trust and model performance. Overall, we provide evidence that adding an explanation in the model response to justify its reasoning significantly increases self-reported user trust in the model when the user has the opportunity to compare various responses. Position and faithfulness of these explanations are also important factors. However, these gains disappear when users are shown responses independently, suggesting that humans trust all model responses, including deceptive ones, equitably when they are shown in isolation. Our findings urge future research to delve deeper into the nuanced evaluation of trust in human-machine teaming systems.
Submitted 4 October, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
Quasimetric spaces with few lines
Authors:
Guillermo Gamboa Quintero,
Martín Matamala,
Juan Pablo Peña
Abstract:
Chen and Chvátal conjectured in 2008 that in any finite metric space either there is a line containing all the points - a universal line -, or the number of lines is at least the number of points. This is a generalization of a classical result due to Erdős that says that a set of $n$ non-collinear points in the Euclidean plane defines at least $n$ different lines.
A line of a metric space with metric $ρ$ is defined in terms of a notion called the betweenness of the space which is the set of all triples $(x,z,y)$ such that $ρ(x,y)=ρ(x,z)+ρ(z,y)$.
In this work we prove that for each $n\geq 4$ there are $p_3(n)$ non isomorphic betweennesses arising from \emph{quasimetric} spaces with $n$ points, without universal lines and with exactly 3 lines, where $p_3(n)$ is the number of partitions of an integer $n$ into three parts. We also prove that for $n\geq 5$, there are $2p_3(n-1)$ non isomorphic betweennesses arising from quasimetric spaces on $n$ points, without universal lines and with exactly 4 lines. Here two betweennesses are isomorphic if they are isomorphic as relational structures.
None of the betweennesses mentioned above is metric which implies that Chen and Chvátal's conjecture is valid for metric spaces with at most five points.
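The betweenness condition $ρ(x,y)=ρ(x,z)+ρ(z,y)$ can be checked directly on a small symmetric example. The sketch below assumes the usual Chen-Chvátal convention that the line through $x$ and $y$ consists of $x$, $y$, and every point $z$ such that one of the three points lies between the other two; the abstract does not spell this out, and the quasimetric (asymmetric) case treated in the paper is not covered. The names `between`, `line`, `lines`, `uniform`, and `path` are hypothetical.

```python
from itertools import combinations


def between(rho, x, z, y):
    """True if z lies between x and y: rho(x, y) == rho(x, z) + rho(z, y)."""
    return len({x, y, z}) == 3 and rho[x][y] == rho[x][z] + rho[z][y]


def line(rho, x, y):
    """Line through x and y (assumed Chen-Chvatal convention, metric case)."""
    return frozenset(
        z for z in range(len(rho))
        if z in (x, y)
        or between(rho, x, z, y)   # z between x and y
        or between(rho, z, x, y)   # x between z and y
        or between(rho, x, y, z)   # y between x and z
    )


def lines(rho):
    """Set of distinct lines defined by all pairs of points."""
    return {line(rho, x, y) for x, y in combinations(range(len(rho)), 2)}


n = 4
# Uniform metric: every pair of distinct points is at distance 1.
uniform = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
# Path metric: points 0..3 on a path, distance |i - j|.
path = [[abs(i - j) for j in range(n)] for i in range(n)]
```

In the uniform metric no point lies between two others, so each pair defines its own two-point line. In the path metric every pair generates the same line through all four points, which is a universal line.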
Submitted 29 May, 2024;
originally announced May 2024.
-
Magnetocaloric effect for a $Q$-clock type system
Authors:
Michel Aguilera,
Sergio Pino-Alarcón,
Francisco J. Peña,
Eugenio E. Vogel,
Patricio Vargas
Abstract:
In this work, we study the magnetocaloric effect applied to a magnetic working substance corresponding to a square lattice of spins with $Q$ possible orientations, known as the ``$Q$-state clock model'', where for $Q\geq 5$ the system presents the famous Berezinskii-Kosterlitz-Thouless (BKT) phase. Thermodynamic quantities are obtained in exact form for a small lattice size of $L \times L$ with $L=3$, and by the mean-field approximation and Monte Carlo simulations for even values of $Q$ between 2 and 8 with $L = 3, 8, 16, 32$ with free boundary conditions, and magnetic fields varying between $B = 0$ and $1$ in natural units of the system. By obtaining the entropy, it is possible to quantify the caloric effect through an isothermal process in which the external magnetic field on the spin system is varied. In particular, we find the values of $Q$ that maximize the effect depending on the lattice size and the magnetic phase transitions related to maximizing the caloric phenomena. These indicate that in a small lattice (up to $\sim 7\times 7$), when $Q\geq 5$, the transition that maximizes the effect is the ferromagnetic-to-BKT transition. In contrast, the BKT-to-paramagnetic transition increases the system's caloric response when we work with a larger lattice size.
Submitted 22 May, 2024;
originally announced May 2024.
-
Effects of Magnetic Anisotropy on 3-Qubit Antiferromagnetic Thermal Machines
Authors:
Bastian Castorene,
Francisco J. Peña,
Ariel Norambuena,
Sergio E. Ulloa,
Cristobal Araya,
Patricio Vargas
Abstract:
This study investigates the anisotropic effects on a system of three qubits with chain and ring topology, described by the antiferromagnetic Heisenberg XXX model subjected to a homogeneous magnetic field. We explore the Stirling and Otto cycles and find that easy-axis anisotropy significantly enhances engine efficiency across all cases. At low temperatures, the ring configuration outperforms the chain on both work and efficiency during the Stirling cycle. Additionally, in both topologies, the Stirling cycle achieves Carnot efficiency with finite work at quantum critical points. In contrast, the quasistatic Otto engine also reaches Carnot efficiency at these points but yields no useful work. Notably, the Stirling cycle exhibits all thermal operational regimes (engine, refrigerator, heater, and accelerator), unlike the quasistatic Otto cycle, which functions only as an engine or refrigerator.
Submitted 26 September, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
Magnonic Thermal Machines
Authors:
N. Vidal-Silva,
Francisco J. Peña,
Roberto E. Troncoso,
Patricio Vargas
Abstract:
We propose a magnon-based thermal machine in two-dimensional (2D) magnetic insulators. The thermodynamical cycles are engineered by exposing a magnon spin system to thermal baths at different temperatures and tuning the Dzyaloshinskii-Moriya (DM) interaction. We find for the Otto cycle that a thermal gas of magnons converts a fraction of heat into energy in the form of work, where the efficiency is maximized for specific values of DM, reaching the corresponding Carnot efficiency. We witness a positive-to-negative net work transition during the cycle that marks the onset of a refrigerator-like behavior. The work produced by the magnonic heat engine enhances the magnon chemical potential. The latter enables a spin accumulation that might result in the pumping of spin currents at the interfaces of metal-magnet heterostructures. Our work opens new possibilities for the efficient leverage of conventional two-dimensional magnets.
Submitted 11 April, 2024;
originally announced April 2024.
-
Deep Learning Based Event Reconstruction for Cyclotron Radiation Emission Spectroscopy
Authors:
A. Ashtari Esfahani,
S. Böser,
N. Buzinsky,
M. C. Carmona-Benitez,
R. Cervantes,
C. Claessens,
L. de Viveiros,
M. Fertl,
J. A. Formaggio,
J. K. Gaison,
L. Gladstone,
M. Grando,
M. Guigue,
J. Hartse,
K. M. Heeger,
X. Huyan,
A. M. Jones,
K. Kazkaz,
M. Li,
A. Lindman,
A. Marsteller,
C. Matthé,
R. Mohiuddin,
B. Monreal,
E. C. Morrison
, et al. (26 additional authors not shown)
Abstract:
The objective of the Cyclotron Radiation Emission Spectroscopy (CRES) technology is to build precise particle energy spectra. This is achieved by identifying the start frequencies of charged particle trajectories which, when exposed to an external magnetic field, leave semi-linear profiles (called tracks) in the time-frequency plane. Due to the need for excellent instrumental energy resolution in application, highly efficient and accurate track reconstruction methods are desired. Deep learning convolutional neural networks (CNNs) - particularly suited to deal with information-sparse data and which offer precise foreground localization - may be utilized to extract track properties from measured CRES signals (called events) with relative computational ease. In this work, we develop a novel machine learning based model which operates a CNN and a support vector machine in tandem to perform this reconstruction. A primary application of our method is shown on simulated CRES signals which mimic those of the Project 8 experiment - a novel effort to extract the unknown absolute neutrino mass value from a precise measurement of tritium $β^-$-decay energy spectrum. When compared to a point-clustering based technique used as a baseline, we show a relative gain of 24.1% in event reconstruction efficiency and comparable performance in accuracy of track parameter reconstruction.
Submitted 5 January, 2024;
originally announced February 2024.
-
Combined matrices of almost strictly sign regular matrices
Authors:
Pedro Alonso,
Juan Manuel Peña,
María Luisa Serrano
Abstract:
The combined matrix is a very useful concept for many applications. Almost strictly sign regular (ASSR) matrices form an important structured class of matrices with two possible zero patterns, which are either type-I staircase or type-II staircase. We prove that, under an irreducibility condition, the pattern of zero and nonzero entries of an ASSR matrix is preserved by the corresponding combined matrix. Without the irreducibility condition, it is proved that type-I and type-II staircases are still preserved. Illustrative numerical examples are included.
Submitted 2 February, 2024;
originally announced February 2024.
-
Almost strictly sign regular rectangular matrices
Authors:
P. Alonso,
J. M. Peña,
M. L. Serrano
Abstract:
Almost strictly sign regular matrices are sign regular matrices with a special zero pattern and whose nontrivial minors are nonzero. In this paper we provide several properties of almost strictly sign regular rectangular matrices and analyze their QR factorization.
Submitted 2 February, 2024;
originally announced February 2024.
-
Modeling of learning curves with applications to pos tagging
Authors:
Manuel Vilares Ferro,
Victor M. Darriba Bilbao,
Francisco J. Ribadas Pena
Abstract:
An algorithm to estimate the evolution of learning curves on the whole of a training data base, based on the results obtained from a portion and using a functional strategy, is introduced. We approximate iteratively the sought value at the desired time, independently of the learning technique used and once a point in the process, called prediction level, has been passed. The proposal proves to be formally correct with respect to our working hypotheses and includes a reliable proximity condition. This allows the user to fix a convergence threshold with respect to the accuracy finally achievable, which extends the concept of stopping criterion and seems to be effective even in the presence of distorting observations.
Our aim is to evaluate the training effort, supporting decision making in order to reduce the need for both human and computational resources during the learning process. The proposal is of interest in at least three operational procedures. The first is the anticipation of accuracy gain, with the purpose of measuring how much work is needed to achieve a certain degree of performance. The second relates to the comparison of efficiency between systems at training time, with the objective of completing this task only for the one that best suits our requirements. The third is the prediction of accuracy, a valuable item of information for customizing systems, since we can estimate in advance the impact of settings on both the performance and the development costs. Using the generation of part-of-speech taggers as an example application, the experimental results are consistent with our expectations.
Submitted 4 February, 2024;
originally announced February 2024.
-
Early stopping by correlating online indicators in neural networks
Authors:
Manuel Vilares Ferro,
Yerai Doval Mosquera,
Francisco J. Ribadas Pena,
Victor M. Darriba Bilbao
Abstract:
In order to minimize the generalization error in neural networks, a novel technique to identify overfitting phenomena when training the learner is formally introduced. This enables support of a reliable and trustworthy early stopping condition, thus improving the predictive power of that type of modeling. Our proposal exploits the correlation over time in a collection of online indicators, namely characteristic functions for indicating if a set of hypotheses are met, associated with a range of independent stopping conditions built from a canary judgment to evaluate the presence of overfitting. That way, we provide a formal basis for decision making in terms of interrupting the learning process.
As opposed to previous approaches focused on a single criterion, we take advantage of subsidiarities between independent assessments, thus seeking both a wider operating range and greater diagnostic reliability. With a view to illustrating the effectiveness of the halting condition described, we choose to work in the sphere of natural language processing, an operational continuum increasingly based on machine learning. As a case study, we focus on parser generation, one of the most demanding and complex tasks in the domain. The selection of cross-validation as a canary function enables an actual comparison with the most representative early stopping conditions based on overfitting identification, pointing to a promising start toward an optimal bias and variance control.
Submitted 4 February, 2024;
originally announced February 2024.
-
Deep Learning With DAGs
Authors:
Sourabh Balgi,
Adel Daoud,
Jose M. Peña,
Geoffrey T. Wodtke,
Jesse Zhou
Abstract:
Social science theories often postulate causal relationships among a set of variables or events. Although directed acyclic graphs (DAGs) are increasingly used to represent these theories, their full potential has not yet been realized in practice. As non-parametric causal models, DAGs require no assumptions about the functional form of the hypothesized relationships. Nevertheless, to simplify the task of empirical evaluation, researchers tend to invoke such assumptions anyway, even though they are typically arbitrary and do not reflect any theoretical content or prior knowledge. Moreover, functional form assumptions can engender bias, whenever they fail to accurately capture the complexity of the causal system under investigation. In this article, we introduce causal-graphical normalizing flows (cGNFs), a novel approach to causal inference that leverages deep neural networks to empirically evaluate theories represented as DAGs. Unlike conventional approaches, cGNFs model the full joint distribution of the data according to a DAG supplied by the analyst, without relying on stringent assumptions about functional form. In this way, the method allows for flexible, semi-parametric estimation of any causal estimand that can be identified from the DAG, including total effects, conditional effects, direct and indirect effects, and path-specific effects. We illustrate the method with a reanalysis of Blau and Duncan's (1967) model of status attainment and Zhou's (2019) model of conditional versus controlled mobility. To facilitate adoption, we provide open-source software together with a series of online tutorials for implementing cGNFs. The article concludes with a discussion of current limitations and directions for future development.
Submitted 12 January, 2024;
originally announced January 2024.
-
Duality of Hoffman constants
Authors:
Javier F. Pena,
Juan C. Vera,
Luis F. Zuluaga
Abstract:
Suppose $A\in \mathbb{R}^{m\times n}$ and consider the following canonical systems of inequalities defined by $A$: $$ \begin{array}{l} Ax=b\\ x \ge 0 \end{array} \qquad \text{ and }\qquad A^T y - c \le 0. $$ We establish some novel duality relationships between the Hoffman constants for the above constraint systems of linear inequalities provided some suitable Slater condition holds. The crux of our approach is a Hoffman duality inequality for polyhedral systems of constraints. The latter in turn yields an interesting duality identity between the Hoffman constants of the following box-constrained systems of inequalities: $$ \begin{array}{l} Ax=b\\ \ell \le x \le u \end{array}\qquad \text{ and }\qquad \ell \le A^T y - c \le u $$ for $\ell, u\in \mathbb{R}^n$ with $\ell < u.$
Submitted 15 December, 2023;
originally announced December 2023.
-
Assessing the Unobserved: Enhancing Causal Inference in Sociology with Sensitivity Analysis
Authors:
Cheng Lin,
Jose M. Pena,
Adel Daoud
Abstract:
Explaining social events is a primary objective of applied data-driven sociology. To achieve that objective, many sociologists use statistical causal inference to identify causality in observational studies, a research context where the analyst does not control the data-generating process. However, it is often challenging in observational studies to satisfy the assumption of no unmeasured confounding, namely, that there is no lurking third variable affecting the causal relationship of interest. In this article, we develop a framework enabling sociologists to employ a different strategy to enhance the quality of observational studies. Our framework builds on a surprisingly simple statistical approach, sensitivity analysis: a thought-experimental framework where the analyst imagines a lever that they can pull to probe a variety of theoretically driven statistical magnitudes of posited unmeasured confounding, which in turn distorts the causal effect of interest. By pulling that lever, the analyst can identify how strong an unmeasured confounder must be to wash away the estimated causal effect. Although each sensitivity analysis method requires its own assumptions, this sort of post-hoc analysis provides underutilized tools to bound causal quantities. Extending Lundberg et al., we develop a five-step approach to how applied sociological research can incorporate sensitivity analysis, empowering scholars to rejuvenate causal inference in observational studies.
Submitted 23 June, 2024; v1 submitted 22 November, 2023;
originally announced November 2023.
-
Accelerated Affine-Invariant Convergence Rates of the Frank-Wolfe Algorithm with Open-Loop Step-Sizes
Authors:
Elias Wirth,
Javier Pena,
Sebastian Pokutta
Abstract:
Recent papers have shown that the Frank-Wolfe algorithm (FW) with open-loop step-sizes exhibits rates of convergence faster than the iconic $\mathcal{O}(t^{-1})$ rate. In particular, when the minimizer of a strongly convex function over a polytope lies in the relative interior of a feasible region face, the FW with open-loop step-sizes $η_t = \frac{\ell}{t+\ell}$ for $\ell \in \mathbb{N}_{\geq 2}$ has accelerated convergence $\mathcal{O}(t^{-2})$ in contrast to the rate $Ω(t^{-1-ε})$ attainable with more complex line-search or short-step step-sizes. Given the relevance of this scenario in data science problems, research has grown to explore the settings enabling acceleration in open-loop FW. However, despite FW's well-known affine invariance, existing acceleration results for open-loop FW are affine-dependent. This paper remedies this gap in the literature by merging two recent research trajectories: affine invariance (Wirth et al., 2023b) and open-loop step-sizes (Pena, 2021). In particular, we extend all known non-affine-invariant convergence rates for FW with open-loop step-sizes to affine-invariant results.
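The open-loop rule $η_t = \frac{\ell}{t+\ell}$ mentioned in the abstract can be illustrated with a minimal Python sketch. The quadratic objective, the probability simplex as feasible region, and names such as `frank_wolfe_simplex` are assumptions chosen for illustration; they are not the authors' experimental setup, and the sketch makes no claim about the accelerated rates themselves.

```python
def frank_wolfe_simplex(grad, x0, steps, ell=2):
    """Frank-Wolfe over the probability simplex with open-loop
    step-sizes eta_t = ell / (t + ell) (no line search)."""
    x = list(x0)
    for t in range(steps):
        g = grad(x)
        # Linear minimization oracle on the simplex: the vertex e_i
        # whose gradient coordinate is smallest.
        i = min(range(len(x)), key=lambda k: g[k])
        eta = ell / (t + ell)
        # Convex combination x <- (1 - eta) * x + eta * e_i.
        x = [(1 - eta) * xk for xk in x]
        x[i] += eta
    return x


# Hypothetical test problem: a strongly convex quadratic whose
# minimizer lies in the relative interior of the simplex.
target = [0.2, 0.3, 0.5]


def objective(x):
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, target))


def gradient(x):
    return [a - b for a, b in zip(x, target)]


x_final = frank_wolfe_simplex(gradient, [1.0, 0.0, 0.0], steps=2000, ell=2)
print(x_final, objective(x_final))
```

The step-size depends only on the iteration counter, which is what makes the rule "open-loop": no function values or smoothness constants are needed to run it.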
Submitted 11 October, 2023; v1 submitted 6 October, 2023;
originally announced October 2023.
-
Real-time Signal Detection for Cyclotron Radiation Emission Spectroscopy Measurements using Antenna Arrays
Authors:
A. Ashtari Esfahani,
S. Böser,
N. Buzinsky,
M. C. Carmona-Benitez,
C. Claessens,
L. de Viveiros,
M. Fertl,
J. A. Formaggio,
B. T. Foust,
J. K. Gaison,
M. Grando,
J. Hartse,
K. M. Heeger,
X. Huyan,
A. M. Jones,
B. J. P. Jones,
K. Kazkaz,
B. H. LaRoque,
M. Li,
A. Lindman,
A. Marsteller,
C. Matthé,
R. Mohiuddin,
B. Monreal,
B. Mucogllava,
et al. (26 additional authors not shown)
Abstract:
Cyclotron Radiation Emission Spectroscopy (CRES) is a technique for precision measurement of the energies of charged particles, which is being developed by the Project 8 Collaboration to measure the neutrino mass using tritium beta-decay spectroscopy. Project 8 seeks to use the CRES technique to measure the neutrino mass with a sensitivity of 40~meV, requiring a large supply of tritium atoms stored in a multi-cubic meter detector volume. Antenna arrays are one potential technology compatible with an experiment of this scale, but the capability of an antenna-based CRES experiment to measure the neutrino mass depends on the efficiency of the signal detection algorithms. In this paper, we develop efficiency models for three signal detection algorithms and compare them using simulations from a prototype antenna-based CRES experiment as a case-study. The algorithms include a power threshold, a matched filter template bank, and a neural network based machine learning approach, which are analyzed in terms of their average detection efficiency and relative computational cost. It is found that significant improvements in detection efficiency and, therefore, neutrino mass sensitivity are achievable, with only a moderate increase in computation cost, by utilizing either the matched filter or machine learning approach in place of a power threshold, which is the baseline signal detection algorithm used in previous CRES experiments by Project 8.
Submitted 3 October, 2023;
originally announced October 2023.
-
On the Probability of Immunity
Authors:
Jose M. Peña
Abstract:
This work is devoted to the study of the probability of immunity, i.e. the effect occurs whether exposed or not. We derive necessary and sufficient conditions for non-immunity and $ε$-bounded immunity, i.e. the probability of immunity is zero and $ε$-bounded, respectively. The former allows us to estimate the probability of benefit (i.e., the effect occurs if and only if exposed) from a randomized controlled trial, and the latter allows us to produce bounds of the probability of benefit that are tighter than the existing ones. We also introduce the concept of indirect immunity (i.e., through a mediator) and repeat our previous analysis for it. Finally, we propose a method for sensitivity analysis of the probability of immunity under unmeasured confounding.
Submitted 11 October, 2023; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Boundedness for proper conflict-free and odd colorings
Authors:
Andrea Jiménez,
Kolja Knauer,
Carla Negri Lintzmayer,
Martín Matamala,
Juan Pablo Peña,
Daniel A. Quiroz,
Maycon Sambinelli,
Yoshiko Wakabayashi,
Weiqiang Yu,
José Zamora
Abstract:
The proper conflict-free chromatic number, $χ_{pcf}(G)$, of a graph $G$ is the least $k$ such that $G$ has a proper $k$-coloring in which for each non-isolated vertex there is a color appearing exactly once among its neighbors. The proper odd chromatic number, $χ_{o}(G)$, of $G$ is the least $k$ such that $G$ has a proper coloring in which for every non-isolated vertex there is a color appearing an odd number of times among its neighbors. We say that a graph class $\mathcal{G}$ is $χ_{pcf}$-bounded ($χ_{o}$-bounded) if there is a function $f$ such that $χ_{pcf}(G) \leq f(χ(G))$ ($χ_{o}(G) \leq f(χ(G))$) for every $G \in \mathcal{G}$. Caro et al. (2022) asked for classes that are linearly $χ_{pcf}$-bounded ($χ_{pcf}$-bounded), and as a starting point, they showed that every claw-free graph $G$ satisfies $χ_{pcf}(G) \le 2Δ(G)+1$, which implies $χ_{pcf}(G) \le 4χ(G)+1$.
In this paper, we improve the bound for claw-free graphs to a nearly tight bound by showing that such a graph $G$ satisfies $χ_{pcf}(G) \le Δ(G)+6$, and even $χ_{pcf}(G) \le Δ(G)+4$ if it is a quasi-line graph. These results also give evidence for a conjecture by Caro et al. Moreover, we show that convex-round graphs and permutation graphs are linearly $χ_{pcf}$-bounded. For these last two results, we prove a lemma that reduces the problem of deciding if a hereditary class is linearly $χ_{pcf}$-bounded to deciding if the bipartite graphs in the class are $χ_{pcf}$-bounded by an absolute constant. This lemma complements a theorem of Liu (2022) and motivates us to study boundedness in bipartite graphs. In particular, we show that biconvex bipartite graphs are $χ_{pcf}$-bounded while convex bipartite graphs are not even $χ_o$-bounded, and exhibit a class of bipartite circle graphs that is linearly $χ_o$-bounded but not $χ_{pcf}$-bounded.
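The two coloring notions defined in the abstract can be checked mechanically. The following is a minimal Python sketch of such a checker, assuming a graph given as an adjacency dictionary; the function names and the star example are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter


def is_proper(adj, color):
    """No edge joins two vertices of the same color."""
    return all(color[v] != color[u] for v in adj for u in adj[v])


def is_proper_conflict_free(adj, color):
    """Proper, and every non-isolated vertex sees some color
    exactly once among its neighbors."""
    if not is_proper(adj, color):
        return False
    for v, nbrs in adj.items():
        if nbrs:
            counts = Counter(color[u] for u in nbrs)
            if 1 not in counts.values():
                return False
    return True


def is_proper_odd(adj, color):
    """Proper, and every non-isolated vertex sees some color
    an odd number of times among its neighbors."""
    if not is_proper(adj, color):
        return False
    for v, nbrs in adj.items():
        if nbrs:
            counts = Counter(color[u] for u in nbrs)
            if not any(c % 2 == 1 for c in counts.values()):
                return False
    return True


# Hypothetical example: the star K_{1,3} with center "c".
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
same_leaves = {"c": 1, "x": 0, "y": 0, "z": 0}
distinct_leaves = {"c": 3, "x": 0, "y": 1, "z": 2}

print(is_proper_odd(star, same_leaves))                # True
print(is_proper_conflict_free(star, same_leaves))      # False
print(is_proper_conflict_free(star, distinct_leaves))  # True
```

With all leaves sharing one color, the center sees that color three times: an odd count, but not exactly once. This shows that an odd coloring need not be conflict-free, while every proper conflict-free coloring is also a proper odd coloring.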
Submitted 9 February, 2024; v1 submitted 31 July, 2023;
originally announced August 2023.
-
ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code
Authors:
Kazuaki Matsumura,
Simon Garcia De Gonzalo,
Antonio J. Peña
Abstract:
Automatic code optimization is a complex process that typically involves the application of multiple discrete algorithms that modify the program structure irreversibly. However, the design of these algorithms is often monolithic, and they require repetitive implementation to perform similar analyses due to the lack of cooperation. To address this issue, modern optimization techniques, such as equality saturation, allow for exhaustive term rewriting at various levels of inputs, thereby simplifying compiler design.
In this paper, we propose equality saturation to optimize sequential codes utilized in directive-based programming for GPUs. Our approach realizes less computation, less memory access, and high memory throughput simultaneously. Our fully-automated framework constructs single-assignment forms from inputs to be entirely rewritten while keeping dependencies and extracts optimal cases. Through practical benchmarks, we demonstrate a significant performance improvement on several compilers. Furthermore, we highlight the advantages of computational reordering and emphasize the significance of memory-access order for modern GPUs.
Submitted 17 September, 2024; v1 submitted 22 June, 2023;
originally announced June 2023.
-
Alternative Measures of Direct and Indirect Effects
Authors:
Jose M. Peña
Abstract:
There are a number of measures of direct and indirect effects in the literature. They are suitable in some cases and unsuitable in others. We describe a case where the existing measures are unsuitable and propose new suitable ones. We also show that the new measures can partially handle unmeasured treatment-outcome confounding, and bound long-term effects by combining experimental and observational data.
Submitted 11 October, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
DeepAqua: Self-Supervised Semantic Segmentation of Wetland Surface Water Extent with SAR Images using Knowledge Distillation
Authors:
Francisco J. Peña,
Clara Hübinger,
Amir H. Payberah,
Fernando Jaramillo
Abstract:
Deep learning and remote sensing techniques have significantly advanced water monitoring abilities; however, the need for annotated data remains a challenge. This is particularly problematic in wetland detection, where water extent varies over time and space, demanding multiple annotations for the same area. In this paper, we present DeepAqua, a self-supervised deep learning model that leverages knowledge distillation (a.k.a. teacher-student model) to eliminate the need for manual annotations during the training phase. We utilize the Normalized Difference Water Index (NDWI) as a teacher model to train a Convolutional Neural Network (CNN) for segmenting water from Synthetic Aperture Radar (SAR) images, and to train the student model, we exploit cases where optical- and radar-based water masks coincide, enabling the detection of both open and vegetated water surfaces. DeepAqua represents a significant advancement in computer vision techniques by effectively training semantic segmentation models without any manually annotated data. Experimental results show that DeepAqua outperforms other unsupervised methods by improving accuracy by 7%, Intersection Over Union by 27%, and F1 score by 14%. This approach offers a practical solution for monitoring wetland water extent changes without needing ground truth data, making it highly adaptable and scalable for wetland conservation efforts.
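For readers unfamiliar with the teacher signal, the sketch below computes NDWI in its commonly used form, (Green - NIR) / (Green + NIR), and thresholds it into a water mask. The abstract does not state which band combination or threshold DeepAqua uses, so the formula variant, the threshold of 0, and the function names are all assumptions made for illustration.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index for one pixel, using the
    common (green - nir) / (green + nir) definition (an assumption;
    the paper's exact variant is not given in the abstract)."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Label a pixel as water when its NDWI exceeds the threshold.
    The default threshold of 0 is a hypothetical choice."""
    return [
        [ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


# Tiny made-up reflectance grids (2 x 2 pixels).
green = [[0.30, 0.10], [0.25, 0.05]]
nir = [[0.10, 0.30], [0.05, 0.40]]
mask = water_mask(green, nir)
print(mask)  # [[True, False], [True, False]]
```

A mask of this kind could serve as the optical-side label that the radar-based student is trained against, which is the role the abstract assigns to the teacher.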
Submitted 20 September, 2023; v1 submitted 2 May, 2023;
originally announced May 2023.
-
Cyclotron Radiation Emission Spectroscopy of Electrons from Tritium Beta Decay and $^{83\rm m}$Kr Internal Conversion
Authors:
Project 8 Collaboration,
A. Ashtari Esfahani,
S. Böser,
N. Buzinsky,
M. C. Carmona-Benitez,
C. Claessens,
L. de Viveiros,
P. J. Doe,
M. Fertl,
J. A. Formaggio,
J. K. Gaison,
L. Gladstone,
M. Guigue,
J. Hartse,
K. M. Heeger,
X. Huyan,
A. M. Jones,
K. Kazkaz,
B. H. LaRoque,
M. Li,
A. Lindman,
E. Machado,
A. Marsteller,
C. Matthé,
R. Mohiuddin,
et al. (32 additional authors not shown)
Abstract:
Project 8 has developed a novel technique, Cyclotron Radiation Emission Spectroscopy (CRES), for direct neutrino mass measurements. A CRES-based experiment on the beta spectrum of tritium has been carried out in a small-volume apparatus. We provide a detailed account of the experiment, focusing on systematic effects and analysis techniques. In a Bayesian (frequentist) analysis, we measure the tritium endpoint as $18553^{+18}_{-19}$ ($18548^{+19}_{-19}$) eV and set upper limits of 155 (152) eV (90% C.L.) on the neutrino mass. No background events are observed beyond the endpoint in 82 days of running. We also demonstrate an energy resolution of $1.66\pm0.19$ eV in a resolution-optimized magnetic trap configuration by measuring $^{83\rm m}$Kr 17.8-keV internal-conversion electrons. These measurements establish CRES as a low-background, high-resolution technique with the potential to advance neutrino mass sensitivity.
Submitted 23 December, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Accurate GW frontier orbital energies of 134 kilo molecules
Authors:
Artem Fediai,
Patrick Reiser,
Jorge Enrique Olivares Peña,
Pascal Friederich,
Wolfgang Wenzel
Abstract:
The QM9 dataset [Scientific Data, Vol. 1, 140022 (2014)] has become a standard dataset for benchmarking machine learning methods, especially on molecular graphs. It contains geometries as well as multiple computed molecular properties of 133,885 compounds at the B3LYP/6-31G(2df,p) level of theory, including frontier orbital (HOMO and LUMO) energies. However, the accuracy of HOMO/LUMO predictions from density functional theory, including hybrid methods such as B3LYP, is limited for many applications. In contrast, the GW method significantly improves HOMO/LUMO prediction accuracy, with mean unsigned errors in the GW100 benchmark dataset of 100 meV. In this work, we present a new dataset of HOMO/LUMO energies for the QM9 compounds, computed using the GW method. This database may serve as a benchmark of HOMO/LUMO prediction, delta-learning, and transfer learning, particularly for larger molecules where GW is the most accurate but still numerically feasible method. We expect this dataset to enable the development of more accurate machine learning models for predicting molecular properties.
Submitted 15 March, 2023;
originally announced March 2023.
-
Bounding the Probabilities of Benefit and Harm Through Sensitivity Parameters and Proxies
Authors:
Jose M. Peña
Abstract:
We present two methods for bounding the probabilities of benefit and harm under unmeasured confounding. The first method computes the (upper or lower) bound of either probability as a function of the observed data distribution and two intuitive sensitivity parameters which, then, can be presented to the analyst as a 2-D plot to assist her in decision making. The second method assumes the existence of a measured nondifferential proxy (i.e., direct effect) of the unmeasured confounder. Using this proxy, tighter bounds than the existing ones can be derived from just the observed data distribution.
Submitted 6 August, 2023; v1 submitted 8 March, 2023;
originally announced March 2023.
-
Q-ball-like solitons on the M2-brane with worldvolume fluxes
Authors:
Pedro García,
Maria Pilar Garcia del Moral,
Joselen M. Peña,
Reginaldo Prado-Fuentes
Abstract:
In this paper we obtain a family of analytic solutions to the nonlinear partial differential equations that describe the dynamics of the bosonic part of the mass operator of a M2-brane compactified on $M_9\times T^2$ in the LCG with worldvolume fluxes. Those fluxes can be induced by a constant and quantized supergravity 3-form. This sector of the theory, at supersymmetric level, has the interesting property of having a discrete spectrum. We have focused on the characterization of Q-ball-like (QBL) solitons on the M2-brane with worldvolume fluxes. Two scenarios are analysed: one in which the system is isotropic and the other anisotropic. In the isotropic case, we obtain analytic families of string-like solutions to the membrane equations of motion in the presence of a non-vanishing symplectic gauge field that satisfy all constraints. We explicitly show a localised family of QBL solutions. It is demonstrated that although the solutions generally exhibit dispersion, they also allow for dispersion-free solutions. In the non-isotropic case, we obtain full-fledged membrane QBL solutions by numerical methods. We characterize some other properties of the solutions found. The dynamics of the QBL solutions are also encountered. We analyze the Lorentz boosts and Galilean transformations. Since we work in the Light Cone Gauge, the Lorentz transformed solutions are not automatically solutions, rather some extra conditions must be imposed. Only a subset of the solutions remain. We discuss some examples. The QBL solitons of the M2-brane that have been discovered contain an interaction term between the Noether charge of the Q-ball and the topological monopole charge associated with the worldvolume flux. The monopole charge increases the stability of the analytic solutions against fission...
Submitted 3 March, 2024; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Enhanced Efficiency at Maximum Power in a Fock-Darwin Model Quantum Dot Engine
Authors:
Francisco J. Peña,
Nathan M. Myers,
Daniel Órdenes,
Francisco Albarrán-Arriagada,
Patricio Vargas
Abstract:
We study the performance of an endoreversible magnetic Otto cycle with a working substance composed of a single quantum dot described using the well-known Fock-Darwin model. We find that tuning the intensity of the parabolic trap (geometrical confinement) impacts the proposed cycle's performance, quantified by the power, work, efficiency, and parameter region where the cycle operates as an engine. We demonstrate that a parameter region exists where the efficiency at maximum output power exceeds the Curzon-Ahlborn efficiency, the efficiency at maximum power achieved by a classical working substance.
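As a point of reference for the benchmark named in the abstract, the Curzon-Ahlborn efficiency is $η_{CA} = 1 - \sqrt{T_c/T_h}$, compared with the Carnot limit $η_C = 1 - T_c/T_h$. The short Python sketch below evaluates both; the function names and the sample bath temperatures are assumptions for illustration, and the sketch does not model the quantum dot cycle itself.

```python
import math


def carnot_efficiency(t_cold, t_hot):
    """Carnot limit 1 - Tc/Th for reservoirs at Tc < Th (kelvin)."""
    return 1.0 - t_cold / t_hot


def curzon_ahlborn_efficiency(t_cold, t_hot):
    """Curzon-Ahlborn efficiency at maximum power, 1 - sqrt(Tc/Th)."""
    return 1.0 - math.sqrt(t_cold / t_hot)


# Hypothetical bath temperatures, chosen only to give round numbers.
t_cold, t_hot = 100.0, 400.0
print(curzon_ahlborn_efficiency(t_cold, t_hot))  # 0.5
print(carnot_efficiency(t_cold, t_hot))          # 0.75
```

Since $\sqrt{T_c/T_h} > T_c/T_h$ whenever $0 < T_c < T_h$, the Curzon-Ahlborn value always lies below the Carnot limit, so an efficiency at maximum power above $η_{CA}$ does not by itself contradict the Carnot bound.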
Submitted 9 February, 2023;
originally announced February 2023.
-
A Modified CTGAN-Plus-Features Based Method for Optimal Asset Allocation
Authors:
José-Manuel Peña,
Fernando Suárez,
Omar Larré,
Domingo Ramírez,
Arturo Cifuentes
Abstract:
We propose a new approach to portfolio optimization that utilizes a unique combination of synthetic data generation and a CVaR-constraint. We formulate the portfolio optimization problem as an asset allocation problem in which each asset class is accessed through a passive (index) fund. The asset-class weights are determined by solving an optimization problem which includes a CVaR-constraint. The optimization is carried out by means of a Modified CTGAN algorithm which incorporates features (contextual information) and is used to generate synthetic return scenarios, which, in turn, are fed into the optimization engine. For contextual information we rely on several points along the U.S. Treasury yield curve. The merits of this approach are demonstrated with an example based on ten asset classes (covering stocks, bonds, and commodities) over a fourteen-and-half year period (January 2008-June 2022). We also show that the synthetic generation process is able to capture well the key characteristics of the original data, and the optimization scheme results in portfolios that exhibit satisfactory out-of-sample performance. We also show that this approach outperforms the conventional equal-weights (1/N) asset allocation strategy and other optimization formulations based on historical data only.
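To make the CVaR constraint concrete, the sketch below estimates CVaR from a set of return scenarios as the mean of the worst $(1-α)$ fraction of portfolio losses. This is one common empirical estimator, used here as an assumption; the abstract does not specify the estimator, the confidence level, or the optimization routine, and all names and numbers below are hypothetical.

```python
import math


def portfolio_losses(weights, scenarios):
    """Loss of a fixed-weight portfolio in each return scenario
    (loss = negative portfolio return)."""
    return [-sum(w * r for w, r in zip(weights, s)) for s in scenarios]


def empirical_cvar(losses, alpha=0.95):
    """Mean of the worst (1 - alpha) fraction of losses.
    Rounding guards against floating-point noise in (1 - alpha) * n."""
    n = len(losses)
    k = max(1, math.ceil(round((1 - alpha) * n, 9)))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


# Made-up scenario losses 1..10: the worst 20% are 10 and 9.
print(empirical_cvar(list(range(1, 11)), alpha=0.8))  # 9.5

# A CVaR constraint then reads: empirical_cvar(losses) <= limit.
weights = [0.5, 0.5]
scenarios = [[0.02, 0.00], [-0.10, -0.10]]
losses = portfolio_losses(weights, scenarios)
print(empirical_cvar(losses, alpha=0.5) <= 0.15)  # True
```

In a scenario-based setting like the one described, the scenarios would come from the synthetic generator, and the weights would be chosen by an optimizer subject to this kind of bound.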
Submitted 15 May, 2024; v1 submitted 4 February, 2023;
originally announced February 2023.
-
An easily computable upper bound on the Hoffman constant for homogeneous inequality systems
Authors:
Javier Peña
Abstract:
Let $A\in \mathbb{R}^{m\times n}\setminus \{0\}$ and $P:=\{x:Ax\le 0\}$. This paper provides a procedure to compute an upper bound on the following homogeneous Hoffman constant: \[ H_0(A) := \sup_{u\in \mathbb{R}^n \setminus P} \frac{\text{dist}(u,P)}{\text{dist}(Au, \mathbb{R}^m_-)}. \] In sharp contrast to the intractability of computing more general Hoffman constants, the procedure described in this paper is entirely tractable and easily implementable.
Submitted 14 July, 2023; v1 submitted 4 February, 2023;
originally announced February 2023.
-
A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code
Authors:
Kazuaki Matsumura,
Simon Garcia De Gonzalo,
Antonio J. Peña
Abstract:
Various kinds of applications take advantage of GPUs through automation tools that attempt to automatically exploit the available performance of the GPU's parallel architecture. Directive-based programming models, such as OpenACC, are one such method that easily enables parallel computing by simply attaching annotations to code loops. Such abstract models, however, often prevent programmers from making additional low-level optimizations that take advantage of the advanced architectural features of GPUs, because the actual generated computation is hidden from the application developer.
This paper describes and implements a novel, flexible optimization technique that operates by inserting a code emulator phase at the tail end of the compilation pipeline. Our tool emulates the generated code using symbolic analysis by substituting dynamic information, thus allowing further low-level code optimizations to be applied. We implement our tool to support both CUDA and OpenACC directives as the frontend of the compilation pipeline, thus enabling low-level GPU optimizations for OpenACC that were not previously possible. We demonstrate the capabilities of our tool by automating warp-level shuffle instructions, which are difficult to use even for advanced GPU programmers. Lastly, evaluating our tool with a benchmark suite and complex application code, we provide a detailed study to assess the benefits of shuffle instructions across four generations of GPU architectures.
Submitted 26 January, 2023;
originally announced January 2023.
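To make the warp-level shuffle instructions mentioned in the abstract above more concrete, here is a small host-side sketch that emulates a shuffle-down reduction across the lanes of one warp. It is plain Python, not PTX or CUDA, and is unrelated to the authors' symbolic emulator. The helper names `shfl_down` and `warp_reduce_sum` are hypothetical, and the lane count is assumed to be a power of two.

```python
WARP_SIZE = 32  # assumed number of lanes per warp


def shfl_down(values, delta):
    """Emulate a shuffle-down: lane i reads the value held by lane i + delta.

    Lanes whose source would fall outside the warp keep their own value,
    mirroring the behaviour of CUDA's __shfl_down_sync.
    """
    n = len(values)
    return [values[i + delta] if i + delta < n else values[i] for i in range(n)]


def warp_reduce_sum(values):
    """Tree reduction over one warp; lane 0 ends up holding the total."""
    vals = list(values)
    offset = len(vals) // 2
    while offset > 0:
        shifted = shfl_down(vals, offset)
        vals = [own + other for own, other in zip(vals, shifted)]
        offset //= 2
    return vals[0]


if __name__ == "__main__":
    print(warp_reduce_sum(range(WARP_SIZE)))
```

The point of the pattern is that each step exchanges register values between lanes directly, so no shared-memory staging is needed; only lane 0's result is meaningful at the end.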
-
Teaching labs for blind students: equipment to measure the inertia of simple objects
Authors:
A. Lisboa,
Francisco J. Peña
Abstract:
This article explains and illustrates the design of a laboratory experience for blind students to measure the inertia of simple objects, in this case that of a disc around its axis of symmetry. Our adaptation consisted of modifying the data collection process, using an open-source electronic platform to convert visual signals into acoustic signals. This allows one of the blind students at our University to participate alongside their classmates in the laboratory session corresponding to the mechanics unit of a standard engineering course.
Submitted 17 January, 2023;
originally announced January 2023.
-
SYNCA: A Synthetic Cyclotron Antenna for the Project 8 Collaboration
Authors:
A. Ashtari Esfahani,
S. Böser,
N. Buzinsky,
M. C. Carmona-Benitez,
C. Claessens,
L. de Viveiros,
M. Fertl,
J. A. Formaggio,
L. Gladstone,
M. Grando,
J. Hartse,
K. M. Heeger,
X. Huyan,
A. M. Jones,
K. Kazkaz,
M. Li,
A. Lindman,
C. Matthé,
R. Mohiuddin,
B. Monreal,
R. Mueller,
J. A. Nikkel,
E. Novitski,
N. S. Oblath,
J. I. Peña
, et al. (20 additional authors not shown)
Abstract:
Cyclotron Radiation Emission Spectroscopy (CRES) is a technique for measuring the kinetic energy of charged particles through a precision measurement of the frequency of the cyclotron radiation generated by the particle's motion in a magnetic field. The Project 8 collaboration is developing a next-generation neutrino mass measurement experiment based on CRES. One approach is to use a phased antenna array, which surrounds a volume of tritium gas, to detect and measure the cyclotron radiation of the resulting $β$-decay electrons. To validate the feasibility of this method, Project 8 has designed a test stand to benchmark the performance of an antenna array at reconstructing signals that mimic those of genuine CRES events. To generate synthetic CRES events, a novel probe antenna has been developed, which emits radiation with characteristics similar to the cyclotron radiation produced by charged particles in magnetic fields. This paper outlines the design, construction, and characterization of this Synthetic Cyclotron Antenna (SYNCA). Furthermore, we perform a series of measurements that use the SYNCA to test the position reconstruction capabilities of the digital beamforming reconstruction technique. We find that the SYNCA produces radiation with characteristics closely matching those expected for cyclotron radiation and reproduces experimentally the phenomenology of digital beamforming simulations of true CRES signals.
Submitted 15 December, 2022;
originally announced December 2022.
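The link between cyclotron frequency and kinetic energy that CRES relies on, as described in the abstract above, can be sketched with the standard relativistic cyclotron relation f = eB / (2π γ m_e), where γ = 1 + E_kin / (m_e c²). This is a textbook formula evaluated for an idealized electron in a uniform field, not the Project 8 analysis chain; the function name `cyclotron_frequency_hz` is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19      # elementary charge in coulombs
M_ELECTRON = 9.1093837015e-31   # electron rest mass in kilograms
C_LIGHT = 299792458.0           # speed of light in metres per second


def cyclotron_frequency_hz(kinetic_energy_ev, b_tesla):
    """Relativistic cyclotron frequency of an electron in a uniform field."""
    kinetic_joule = kinetic_energy_ev * E_CHARGE
    gamma = 1.0 + kinetic_joule / (M_ELECTRON * C_LIGHT ** 2)
    return E_CHARGE * b_tesla / (2.0 * math.pi * gamma * M_ELECTRON)


if __name__ == "__main__":
    # Electron near the tritium endpoint (about 18.6 keV) in an assumed 1 T field.
    print(cyclotron_frequency_hz(18600.0, 1.0))
```

Because γ grows with kinetic energy, a higher-energy electron radiates at a lower frequency, which is why a precise frequency measurement translates into an energy measurement.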
-
Tritium Beta Spectrum and Neutrino Mass Limit from Cyclotron Radiation Emission Spectroscopy
Authors:
Project 8 Collaboration,
A. Ashtari Esfahani,
S. Böser,
N. Buzinsky,
M. C. Carmona-Benitez,
C. Claessens,
L. de Viveiros,
P. J. Doe,
M. Fertl,
J. A. Formaggio,
J. K. Gaison,
L. Gladstone,
M. Grando,
M. Guigue,
J. Hartse,
K. M. Heeger,
X. Huyan,
J. Johnston,
A. M. Jones,
K. Kazkaz,
B. H. LaRoque,
M. Li,
A. Lindman,
E. Machado,
A. Marsteller
, et al. (34 additional authors not shown)
Abstract:
The absolute scale of the neutrino mass plays a critical role in physics at every scale, from the particle to the cosmological. Measurements of the tritium endpoint spectrum have provided the most precise direct limit on the neutrino mass scale. In this Letter, we present advances by Project 8 to the Cyclotron Radiation Emission Spectroscopy (CRES) technique culminating in the first frequency-based neutrino mass limit. With only a cm$^3$-scale physical detection volume, a limit of $m_β{<}$155 eV ($152$ eV) is extracted from the background-free measurement of the continuous tritium beta spectrum in a Bayesian (frequentist) analysis. Using $^{83{\rm m}}$Kr calibration data, an improved resolution of 1.66${\pm}$0.19 eV (FWHM) is measured, the detector response model is validated, and the efficiency is characterized over the multi-keV tritium analysis window. These measurements establish the potential of CRES for a high-sensitivity next-generation direct neutrino mass experiment featuring low background and high resolution.
Submitted 17 March, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Multilayer Graphene as an Endoreversible Otto Engine
Authors:
Nathan M Myers,
Francisco J. Peña,
Natalia Cortés,
Patricio Vargas
Abstract:
Graphene is perhaps the most prominent "Dirac material," a class of systems whose electronic structure gives rise to charge carriers that behave as relativistic fermions. In multilayer graphene several crystal sheets are stacked such that the honeycomb lattice of each layer is displaced along one of the lattice edges. When subject to an external magnetic field, the scaling of the multilayer energy spectrum with the magnetic field, and thus the system's thermodynamic behavior, depends strongly on the number of layers. With this in mind, we examine the performance of a finite-time endoreversible Otto cycle with multilayer graphene as its working medium. We show that there exists a simple relationship between the engine efficiency and the number of layers, and that the efficiency at maximum power can exceed that of a classical endoreversible Otto cycle.
Submitted 16 December, 2022; v1 submitted 6 December, 2022;
originally announced December 2022.
-
A validation study of normoglycemia and dysglycemia indices as a diabetes risk model
Authors:
Paola Vargas,
Miguel Angel Moreles,
Joaquin Peña,
Adriana Monroy
Abstract:
In this work, we test the performance of peak glucose concentration ($A$) and the average of glucose removal rates ($α$) as normoglycemia and dysglycemia indices on a population monitored at the Mexico General Hospital between 2017 and 2019. A total of 1911 volunteer patients are considered: 1282 female patients aged 17 to 80 years and 629 male patients aged 18 to 79 years. For each volunteer, OGTT data are gathered and the indices are estimated with Ackerman's model. A binary separation of normoglycemic and dysglycemic patients is carried out using a Support Vector Machine with a linear kernel. The classification indices are successful in 83\% of cases. Population clusters by diabetic condition and the progression from normoglycemia to T2DM may be inferred. The classification indices $A$ and $α$ may be regarded as patient indices and used to detect diabetes risk. Also, criteria for the applicability of glucose-insulin regulation models are introduced, and the performance of Ackerman's model is shown.
Submitted 29 November, 2022;
originally announced November 2022.
-
Bidiagonal Decompositions of Vandermonde-Type Matrices of Arbitrary Rank
Authors:
Jorge Delgado,
Plamen Koev,
Ana Marco,
Jose-Javier Martinez,
Juan Manuel Pena,
Per-Olof Persson,
Steven Spasov
Abstract:
We present a method to derive new explicit expressions for bidiagonal decompositions of Vandermonde and related matrices such as the (q-, h-) Bernstein-Vandermonde ones, among others. These results generalize the existing expressions for nonsingular matrices to matrices of arbitrary rank. For totally nonnegative matrices of the above classes, the new decompositions can be computed efficiently and to high relative accuracy componentwise in floating point arithmetic. In turn, matrix computations (e.g., eigenvalue computation) can also be performed efficiently and to high relative accuracy.
Submitted 18 October, 2022;
originally announced October 2022.
-
$ρ$-GNF: A Copula-based Sensitivity Analysis to Unobserved Confounding Using Normalizing Flows
Authors:
Sourabh Balgi,
Jose M. Peña,
Adel Daoud
Abstract:
We propose a novel sensitivity analysis to unobserved confounding in observational studies using copulas and normalizing flows. Using the idea of interventional equivalence of structural causal models, we develop $ρ$-GNF ($ρ$-graphical normalizing flow), where $ρ{\in}[-1,+1]$ is a bounded sensitivity parameter. This parameter represents the back-door non-causal association due to unobserved confounding, which is encoded with a Gaussian copula. In other words, the $ρ$-GNF enables scholars to estimate the average causal effect (ACE) as a function of $ρ$, while accounting for various assumed strengths of the unobserved confounding. The output of the $ρ$-GNF is what we denote as the $ρ_{curve}$, which provides the bounds for the ACE given an interval of assumed $ρ$ values. In particular, the $ρ_{curve}$ enables scholars to identify the confounding strength required to nullify the ACE, similar to other sensitivity analysis methods (e.g., the E-value). Drawing on experiments with simulated and real-world data, we show the benefits of $ρ$-GNF. One benefit is that the $ρ$-GNF uses a Gaussian copula to encode the distribution of the unobserved causes, which is commonly used in many applied settings. This distributional assumption produces narrower ACE bounds compared to other popular sensitivity analysis methods.
Submitted 22 August, 2024; v1 submitted 15 September, 2022;
originally announced September 2022.
-
Colors of Irregular Satellites of Saturn with DECam
Authors:
J. Peña,
C. Fuentes
Abstract:
We report new g-r and r-i colors for 21 irregular satellites of Saturn, 4 of them previously unreported. This is the highest number of Saturn irregular satellites reported in a single survey. These satellites were measured by "stacking" their observations to increase their signal without trailing. This work describes a novel processing algorithm that enables the detection of faint sources under significant background noise and in front of a severely crowded field.
Our survey shows that these new color measurements of Saturn's irregular satellites are consistent with other irregular satellite populations found in previous works, and they reinforce the observation that the lack of ultra-red objects among the irregular satellites is a real feature that separates them from the trans-Neptunian objects (their posited source population).
Submitted 18 April, 2022;
originally announced April 2022.
-
The Project 8 Neutrino Mass Experiment
Authors:
Project 8 Collaboration,
A. Ashtari Esfahani,
S. Böser,
N. Buzinsky,
M. C. Carmona-Benitez,
C. Claessens,
L. de Viveiros,
P. J. Doe,
S. Enomoto,
M. Fertl,
J. A. Formaggio,
J. K. Gaison,
M. Grando,
K. M. Heeger,
X. Huyan,
A. M. Jones,
K. Kazkaz,
M. Li,
A. Lindman,
C. Matthé,
R. Mohiuddin,
B. Monreal,
R. Mueller,
J. A. Nikkel,
E. Novitski
, et al. (23 additional authors not shown)
Abstract:
Measurements of the $β^-$ spectrum of tritium give the most precise direct limits on neutrino mass. Project 8 will investigate neutrino mass using Cyclotron Radiation Emission Spectroscopy (CRES) with an atomic tritium source. CRES is a new experimental technique that has the potential to surmount the systematic and statistical limitations of current-generation direct measurement methods. Atomic tritium avoids an irreducible systematic uncertainty associated with the final states populated by the decay of molecular tritium. Project 8 will proceed in a phased approach toward a goal of 40 meV/c$^2$ neutrino-mass sensitivity.
Submitted 14 March, 2022;
originally announced March 2022.
-
Counterfactual Analysis of the Impact of the IMF Program on Child Poverty in the Global-South Region using Causal-Graphical Normalizing Flows
Authors:
Sourabh Balgi,
Jose M. Peña,
Adel Daoud
Abstract:
This work demonstrates the application of a particular branch of causal inference and deep learning models: \emph{causal-Graphical Normalizing Flows (c-GNFs)}. In a recent contribution, scholars showed that normalizing flows carry certain properties that make them particularly suitable for causal and counterfactual analysis. However, c-GNFs have only been tested in a simulated data setting, and no contribution to date has evaluated the application of c-GNFs on large-scale real-world data. Focusing on \emph{AI for social good}, our study provides a counterfactual analysis of the impact of the International Monetary Fund (IMF) program on child poverty using c-GNFs. The analysis relies on large-scale real-world observational data: 1,941,734 children under the age of 18, cared for by 567,344 families residing in 67 countries of the Global South. While the primary objective of the IMF is to support governments in achieving economic stability, our results show that an IMF program reduces child poverty as a positive side-effect by about 1.2$\pm$0.24 degree (`0' equals no poverty and `7' is maximum poverty). Thus, our article shows how c-GNFs further the use of deep learning and causal inference in AI for social good. It shows how learning algorithms can be used to address the untapped potential for significant social impact through counterfactual inference at the population level (ACE), sub-population level (CACE), and individual level (ICE). In contrast to most works that model ACE or CACE but not ICE, c-GNFs enable personalization using \emph{`The First Law of Causal Inference'}.
Submitted 17 February, 2022;
originally announced February 2022.
-
Personalized Public Policy Analysis in Social Sciences using Causal-Graphical Normalizing Flows
Authors:
Sourabh Balgi,
Jose M. Pena,
Adel Daoud
Abstract:
Structural Equation/Causal Models (SEMs/SCMs) are widely used in epidemiology and social sciences to identify and analyze the average causal effect (ACE) and conditional ACE (CACE). Traditional causal effect estimation methods such as Inverse Probability Weighting (IPW) and more recently Regression-With-Residuals (RWR) are widely used - as they avoid the challenging task of identifying the SCM parameters - to estimate ACE and CACE. However, much work remains before traditional estimation methods can be used for counterfactual inference, and for the benefit of Personalized Public Policy Analysis (P$^3$A) in the social sciences. While doctors rely on personalized medicine to tailor treatments to patients in laboratory settings (relatively closed systems), P$^3$A draws inspiration from such tailoring but adapts it for open social systems. In this article, we develop a method for counterfactual inference that we name causal-Graphical Normalizing Flow (c-GNF), facilitating P$^3$A. First, we show how c-GNF captures the underlying SCM without making any assumption about functional forms. Second, we propose a novel dequantization trick to deal with discrete variables, which is a limitation of normalizing flows in general. Third, we demonstrate in experiments that c-GNF performs on par with IPW and RWR in terms of bias and variance for estimating the ATE when the true functional forms are known, and better when they are unknown. Fourth and most importantly, we conduct counterfactual inference with c-GNFs, demonstrating promising empirical performance. Because IPW and RWR, like other traditional methods, lack the capability of counterfactual inference, c-GNFs will likely play a major role in tailoring personalized treatment, facilitating P$^3$A, and optimizing social interventions - in contrast to the current `one-size-fits-all' approach of existing methods.
Submitted 30 April, 2022; v1 submitted 7 February, 2022;
originally announced February 2022.
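For readers unfamiliar with the Inverse Probability Weighting baseline named in the abstract above, a minimal sketch of the basic IPW estimate of the average causal effect follows. It assumes a binary treatment and propensity scores that are already known or estimated elsewhere; it is not the c-GNF method and not the authors' experimental setup. The function name `ipw_ace` and the record layout are hypothetical.

```python
def ipw_ace(records):
    """Inverse-probability-weighted estimate of the average causal effect.

    `records` is an iterable of (treated, outcome, propensity) tuples, where
    `treated` is 0 or 1 and `propensity` is P(treated = 1 | covariates),
    assumed to lie strictly between 0 and 1.
    """
    n = 0
    treated_sum = 0.0
    control_sum = 0.0
    for treated, outcome, propensity in records:
        if not 0.0 < propensity < 1.0:
            raise ValueError("propensity must lie strictly between 0 and 1")
        n += 1
        if treated:
            treated_sum += outcome / propensity          # reweight treated units
        else:
            control_sum += outcome / (1.0 - propensity)  # reweight control units
    if n == 0:
        raise ValueError("at least one record is required")
    return treated_sum / n - control_sum / n
```

The estimate is a population-level average, which is the limitation the abstract points to: it gives no counterfactual statement about any single individual.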
-
Portfolio Choice with Indivisible and Illiquid Housing Assets: The Case of Spain
Authors:
Sergio Mayordomo,
María Rodriguez-Moreno,
Juan Ignacio Peña
Abstract:
This paper studies the investment decision of the Spanish households using a unique data set, the Spanish Survey of Household Finance (EFF). We propose a theoretical model in which households, given a fixed investment in housing, allocate their net wealth across bank time deposits, stocks, and mortgage. Besides considering housing as an indivisible and illiquid asset that restricts the portfolio c…
▽ More
This paper studies the investment decision of the Spanish households using a unique data set, the Spanish Survey of Household Finance (EFF). We propose a theoretical model in which households, given a fixed investment in housing, allocate their net wealth across bank time deposits, stocks, and mortgage. Besides considering housing as an indivisible and illiquid asset that restricts the portfolio choice decision, we take into account the financial constraints that households face when they apply for external funding. For every representative household in the EFF we solve this theoretical problem and obtain the theoretically optimal portfolio that is compared with households' actual choices. We find that households significantly underinvest in stocks and deposits while the optimal and actual mortgage investments are alike. Considering the three types of financial assets at once, we find that the households headed by highly financially sophisticated, older, retired, richer, and unconstrained persons are the ones investing more efficiently.
△ Less
Submitted 4 February, 2022;
originally announced February 2022.