-
Interpretability Indices and Soft Constraints for Factor Models
Authors:
Justin Philip Tuazon,
Gia Mizrane Abubo,
Joemari Olea
Abstract:
Factor analysis is a way to characterize the relationships between many (observable) variables in terms of a smaller number of unobservable random variables which are called factors. However, the application of factor models and its success can be subjective or difficult to gauge, since infinitely many factor models that produce the same correlation matrix can be fit given sample data. Thus, there is a need to operationalize a criterion that measures how meaningful or "interpretable" a factor model is in order to select the best among many factor models. While there are already techniques that aim to measure and enhance interpretability, new indices, as well as rotation methods via mathematical optimization based on them, are proposed to measure interpretability. The proposed methods directly incorporate semantics with the help of natural language processing and are generalized to incorporate any "prior information". Moreover, the indices allow for complete or partial specification of relationships at a pairwise level. Aside from these, two other main benefits of the proposed methods are that they do not require the estimation of factor scores, which avoids the factor score indeterminacy problem, and that no additional explanatory variables are necessary. The implementation of the proposed methods is written in Python 3 and is made available together with several helper functions through the package interpretablefa on the Python Package Index. The methods' application is demonstrated here using data on the Experiences in Close Relationships Scale, obtained from the Open-Source Psychometrics Project.
Submitted 22 September, 2024; v1 submitted 17 September, 2024;
originally announced September 2024.
-
Robust Bayes Treatment Choice with Partial Identification
Authors:
Andrés Aradillas Fernández,
José Luis Montiel Olea,
Chen Qiu,
Jörg Stoye,
Serdil Tinda
Abstract:
We study a class of binary treatment choice problems with partial identification, through the lens of robust (multiple prior) Bayesian analysis. We use a convenient set of prior distributions to derive ex-ante and ex-post robust Bayes decision rules, both for decision makers who can randomize and for decision makers who cannot.
Our main messages are as follows: First, ex-ante and ex-post robust Bayes decision rules do not tend to agree in general, whether or not randomized rules are allowed. Second, randomized treatment assignment for some data realizations can be optimal in both ex-ante and, perhaps more surprisingly, ex-post problems. Therefore, it is usually with loss of generality to exclude randomized rules from consideration, even when regret is evaluated ex-post.
We apply our results to a stylized problem where a policy maker uses experimental data to choose whether to implement a new policy in a population of interest, but is concerned about the external validity of the experiment at hand (Stoye, 2012); and to the aggregation of data generated by multiple randomized control trials in different sites to make a policy choice in a population for which no experimental data are available (Manski, 2020; Ishihara and Kitagawa, 2021).
Submitted 21 August, 2024;
originally announced August 2024.
-
Externally Valid Selection of Experimental Sites via the k-Median Problem
Authors:
José Luis Montiel Olea,
Brenda Prallon,
Chen Qiu,
Jörg Stoye,
Yiwei Sun
Abstract:
We present a decision-theoretic justification for viewing the question of how to best choose where to experiment in order to optimize external validity as a k-median (clustering) problem, a popular problem in computer science and operations research. We present conditions under which minimizing the worst-case, welfare-based regret among all nonrandom schemes that select k sites to experiment is approximately equal - and sometimes exactly equal - to finding the k most central vectors of baseline site-level covariates. The k-median problem can be formulated as a linear integer program. Two empirical applications illustrate the theoretical and computational benefits of the suggested procedure.
Submitted 17 August, 2024;
originally announced August 2024.
-
Double Robustness of Local Projections and Some Unpleasant VARithmetic
Authors:
José Luis Montiel Olea,
Mikkel Plagborg-Møller,
Eric Qian,
Christian K. Wolf
Abstract:
We consider impulse response inference in a locally misspecified vector autoregression (VAR) model. The conventional local projection (LP) confidence interval has correct coverage even when the misspecification is so large that it can be detected with probability approaching 1. This result follows from a "double robustness" property analogous to that of popular partially linear regression estimators. In contrast, the conventional VAR confidence interval with short-to-moderate lag length can severely undercover, even for misspecification that is small, economically plausible, and difficult to detect statistically. There is no free lunch: the VAR confidence interval has robust coverage only if the lag length is so large that the interval is as wide as the LP interval.
Submitted 5 August, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
Decision Theory for Treatment Choice Problems with Partial Identification
Authors:
José Luis Montiel Olea,
Chen Qiu,
Jörg Stoye
Abstract:
We apply classical statistical decision theory to a large class of treatment choice problems with partial identification, revealing important theoretical and practical challenges but also interesting research opportunities. The challenges are: In a general class of problems with Gaussian likelihood, all decision rules are admissible; it is maximin-welfare optimal to ignore all data; and, for severe enough partial identification, there are infinitely many minimax-regret optimal decision rules, all of which sometimes randomize the policy recommendation. The opportunities are: We introduce a profiled regret criterion that can reveal important differences between rules and render some of them inadmissible; and we uniquely characterize the minimax-regret optimal rule that least frequently randomizes. We apply our results to aggregation of experimental estimates for policy adoption, to extrapolation of Local Average Treatment Effects, and to policy making in the presence of omitted variable bias.
Submitted 7 August, 2024; v1 submitted 29 December, 2023;
originally announced December 2023.
-
The out-of-sample prediction error of the square-root-LASSO and related estimators
Authors:
José Luis Montiel Olea,
Cynthia Rush,
Amilcar Velez,
Johannes Wiesel
Abstract:
We study the classical problem of predicting an outcome variable, $Y$, using a linear combination of a $d$-dimensional covariate vector, $\mathbf{X}$. We are interested in linear predictors whose coefficients solve:
\begin{align*} \inf_{\boldsymbol{β} \in \mathbb{R}^d} \left( \mathbb{E}_{\mathbb{P}_n} \left[ \left(Y-\mathbf{X}^{\top}\boldsymbol{β}\right)^r \right] \right)^{1/r} + δ\, ρ\left(\boldsymbol{β}\right), \end{align*}
where $δ>0$ is a regularization parameter, $ρ:\mathbb{R}^d\to \mathbb{R}_+$ is a convex penalty function, $\mathbb{P}_n$ is the empirical distribution of the data, and $r\geq 1$. We present three sets of new results. First, we provide conditions under which linear predictors based on these estimators solve a distributionally robust optimization problem: they minimize the worst-case prediction error over distributions that are close to each other in a type of max-sliced Wasserstein metric. Second, we provide a detailed finite-sample and asymptotic analysis of the statistical properties of the balls of distributions over which the worst-case prediction error is analyzed. Third, we use the distributionally robust optimality and our statistical analysis to present i) an oracle recommendation for the choice of regularization parameter, $δ$, that guarantees good out-of-sample prediction error; and ii) a test statistic to rank the out-of-sample performance of two different linear estimators. None of our results rely on sparsity assumptions about the true data generating process; thus, they broaden the scope of use of the square-root lasso and related estimators in prediction problems.
Submitted 8 April, 2024; v1 submitted 14 November, 2022;
originally announced November 2022.
-
On the Robustness to Misspecification of $α$-Posteriors and Their Variational Approximations
Authors:
Marco Avella Medina,
José Luis Montiel Olea,
Cynthia Rush,
Amilcar Velez
Abstract:
$α$-posteriors and their variational approximations distort standard posterior inference by downweighting the likelihood and introducing variational approximation errors. We show that such distortions, if tuned appropriately, reduce the Kullback-Leibler (KL) divergence from the true, but perhaps infeasible, posterior distribution when there is potential parametric model misspecification. To make this point, we derive a Bernstein-von Mises theorem showing convergence in total variation distance of $α$-posteriors and their variational approximations to limiting Gaussian distributions. We use these distributions to evaluate the KL divergence between true and reported posteriors. We show this divergence is minimized by choosing $α$ strictly smaller than one, assuming there is a vanishingly small probability of model misspecification. The optimized value becomes smaller as the misspecification becomes more severe. The optimized KL divergence increases logarithmically in the degree of misspecification and not linearly as with the usual posterior.
Submitted 16 April, 2021;
originally announced April 2021.
-
Machine Learning's Dropout Training is Distributionally Robust Optimal
Authors:
Jose Blanchet,
Yang Kang,
Jose Luis Montiel Olea,
Viet Anh Nguyen,
Xuhui Zhang
Abstract:
This paper shows that dropout training in Generalized Linear Models is the minimax solution of a two-player, zero-sum game where an adversarial nature corrupts a statistician's covariates using a multiplicative nonparametric errors-in-variables model. In this game, nature's least favorable distribution is dropout noise, where nature independently deletes entries of the covariate vector with some fixed probability $δ$. This result implies that dropout training indeed provides out-of-sample expected loss guarantees for distributions that arise from multiplicative perturbations of in-sample data. In addition to the decision-theoretic analysis, the paper makes two more contributions. First, there is a concrete recommendation on how to select the tuning parameter $δ$ to guarantee that, as the sample size grows large, the in-sample loss after dropout training exceeds the true population loss with some pre-specified probability. Second, the paper provides a novel, parallelizable, Unbiased Multi-Level Monte Carlo algorithm to speed up the implementation of dropout training. Our algorithm has a much smaller computational cost compared to the naive implementation of dropout, provided the number of data points is much smaller than the dimension of the covariate vector.
Submitted 14 April, 2021; v1 submitted 13 September, 2020;
originally announced September 2020.
-
Local Projection Inference is Simpler and More Robust Than You Think
Authors:
José Luis Montiel Olea,
Mikkel Plagborg-Møller
Abstract:
Applied macroeconomists often compute confidence intervals for impulse responses using local projections, i.e., direct linear regressions of future outcomes on current covariates. This paper proves that local projection inference robustly handles two issues that commonly arise in applications: highly persistent data and the estimation of impulse responses at long horizons. We consider local projections that control for lags of the variables in the regression. We show that lag-augmented local projections with normal critical values are asymptotically valid uniformly over (i) both stationary and non-stationary data, and also over (ii) a wide range of response horizons. Moreover, lag augmentation obviates the need to correct standard errors for serial correlation in the regression residuals. Hence, local projection inference is arguably both simpler than previously thought and more robust than standard autoregressive inference, whose validity is known to depend sensitively on the persistence of the data and on the length of the horizon.
Submitted 21 December, 2022; v1 submitted 27 July, 2020;
originally announced July 2020.
-
High quality Al$_{0.37}$In$_{0.63}$N layers grown at low temperature (<300$^\circ$C) by radio-frequency sputtering
Authors:
A. Núñez-Cascajero,
R. Blasco,
S. Valdueza-Felip,
D. Montero,
J. Olea,
F. B. Naranjo
Abstract:
High-quality Al$_{0.37}$In$_{0.63}$N layers have been grown by reactive radio-frequency (RF) sputtering on sapphire, glass and Si (111) at low substrate temperature (from room temperature to 300°C). Their structural, chemical and optical properties are investigated as a function of the growth temperature and type of substrate. X-ray diffraction measurements reveal that all samples have a wurtzite crystallographic structure oriented with the c-axis perpendicular to the substrate surface, without parasitic orientations. The layers preserve their Al content at 37% over the whole range of growth temperatures studied. The samples grown at low temperatures (RT and 100°C) are almost fully relaxed, showing a closely packed columnar-like morphology with an RMS surface roughness below 3 nm. The optical band gap energy estimated for layers grown at RT and 100°C on sapphire and glass substrates is ~2.4 eV, while it redshifts to ~2.03 eV at 300°C. The feasibility of growing high crystalline quality AlInN at low growth temperature, even on amorphous substrates, opens new application fields for this material, such as surface plasmon resonance sensors developed directly on optical fibers and other applications where temperature is a handicap and the material cannot be heated.
Submitted 26 September, 2019;
originally announced September 2019.
-
Competing Models
Authors:
Jose Luis Montiel Olea,
Pietro Ortoleva,
Mallesh M Pai,
Andrea Prat
Abstract:
Different agents need to make a prediction. They observe identical data, but have different models: they predict using different explanatory variables. We study which agent believes they have the best predictive ability -- as measured by the smallest subjective posterior mean squared prediction error -- and show how it depends on the sample size. With small samples, we present results suggesting it is an agent using a low-dimensional model. With large samples, it is generally an agent with a high-dimensional model, possibly including irrelevant variables, but never excluding relevant ones. We apply our results to characterize the winning model in an auction of productive assets, to argue that entrepreneurs and investors with simple models will be over-represented in new sectors, and to understand the proliferation of "factors" that explain the cross-sectional variation of expected stock returns in the asset-pricing literature.
Submitted 11 November, 2021; v1 submitted 8 July, 2019;
originally announced July 2019.
-
Influence of the AlN interlayer thickness on the photovoltaic properties of In-rich AlInN on Si heterojunctions deposited by RF sputtering
Authors:
S. Valdueza-Felip,
A. Núñez-Cascajero,
R. Blasco,
D. Montero,
L. Grenet,
M. de la Mata,
S. Fernández,
L. Rodríguez-De Marcos,
S. I. Molina,
J. Olea,
F. B. Naranjo
Abstract:
We report the influence of the AlN interlayer thickness (0-15 nm) on the photovoltaic properties of Al$_{0.37}$In$_{0.63}$N on Si heterojunction solar cells deposited by radio frequency sputtering. The poor junction band alignment and the presence of a 2-3 nm thick amorphous layer at the interface mitigate the response in devices fabricated by direct deposition of n-AlInN on p-Si(111). Adding a 4-nm-thick AlN buffer layer improves the AlInN crystalline quality and the interface alignment, leading to devices with a conversion efficiency of 1.5% under 1-sun AM1.5G illumination. For thicker buffers the performance lessens due to inefficient tunnel transport through the AlN. These results demonstrate the feasibility of using In-rich AlInN alloys deposited by radio frequency sputtering as novel electron-selective contacts to Si-heterojunction solar cells.
Submitted 3 August, 2018;
originally announced August 2018.