Interpretability Indices and Soft Constraints for Factor Models

Authors: Justin Philip Tuazon, Gia Mizrane Abubo, Joemari Olea

Abstract: Factor analysis is a way to characterize the relationships between many (observable) variables in terms of a smaller number of unobservable random variables which are called factors. However, the application of factor models and its success can be subjective or difficult to gauge, since infinitely many factor models that produce the same correlation matrix can be fit given sample data. Thus, there is a need to operationalize a criterion that measures how meaningful or "interpretable" a factor model is in order to select the best among many factor models. While there are already techniques that aim to measure and enhance interpretability, new indices, as well as rotation methods via mathematical optimization based on them, are proposed to measure interpretability. The proposed methods directly incorporate semantics with the help of natural language processing and are generalized to incorporate any "prior information". Moreover, the indices allow for complete or partial specification of relationships at a pairwise level. Aside from these, two other main benefits of the proposed methods are that they do not require the estimation of factor scores, which avoids the factor score indeterminacy problem, and that no additional explanatory variables are necessary. The implementation of the proposed methods is written in Python 3 and is made available together with several helper functions through the package interpretablefa on the Python Package Index. The methods' application is demonstrated here using data on the Experiences in Close Relationships Scale, obtained from the Open-Source Psychometrics Project.

Submitted 22 September, 2024; v1 submitted 17 September, 2024; originally announced September 2024.

arXiv:2408.11621 [pdf, other]

Robust Bayes Treatment Choice with Partial Identification

Authors: Andrés Aradillas Fernández, José Luis Montiel Olea, Chen Qiu, Jörg Stoye, Serdil Tinda

Abstract: We study a class of binary treatment choice problems with partial identification, through the lens of robust (multiple prior) Bayesian analysis. We use a convenient set of prior distributions to derive ex-ante and ex-post robust Bayes decision rules, both for decision makers who can randomize and for decision makers who cannot. Our main messages are as follows: First, ex-ante and ex-post robust Bayes decision rules do not tend to agree in general, whether or not randomized rules are allowed. Second, randomized treatment assignment for some data realizations can be optimal in both ex-ante and, perhaps more surprisingly, ex-post problems. Therefore, it is usually with loss of generality to exclude randomized rules from consideration, even when regret is evaluated ex-post. We apply our results to a stylized problem where a policy maker uses experimental data to choose whether to implement a new policy in a population of interest, but is concerned about the external validity of the experiment at hand (Stoye, 2012); and to the aggregation of data generated by multiple randomized control trials in different sites to make a policy choice in a population for which no experimental data are available (Manski, 2020; Ishihara and Kitagawa, 2021).

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.09187 [pdf, other]

Externally Valid Selection of Experimental Sites via the k-Median Problem

Authors: José Luis Montiel Olea, Brenda Prallon, Chen Qiu, Jörg Stoye, Yiwei Sun

Abstract: We present a decision-theoretic justification for viewing the question of how to best choose where to experiment in order to optimize external validity as a k-median (clustering) problem, a popular problem in computer science and operations research. We present conditions under which minimizing the worst-case, welfare-based regret among all nonrandom schemes that select k sites to experiment is approximately equal - and sometimes exactly equal - to finding the k most central vectors of baseline site-level covariates. The k-median problem can be formulated as a linear integer program. Two empirical applications illustrate the theoretical and computational benefits of the suggested procedure.
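
To make the k-median objective mentioned in the abstract above concrete, here is a minimal sketch that picks the k "most central" covariate vectors by exhaustive search. It is not the authors' procedure: the function name, the toy covariates, and the choice of the L1 distance are all assumptions made for illustration, and exhaustive search only scales to small problems (the abstract points to a linear integer program for the general case).

```python
from itertools import combinations


def k_median_brute_force(points, k):
    """Return (best_cost, best_sites) by checking every size-k subset of indices.

    Hypothetical helper for illustration only. The cost of a candidate set of
    sites is the sum, over all points, of the distance to the nearest chosen
    site. The L1 distance used here is an assumption, not taken from the paper.
    """
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    best_cost, best_sites = float("inf"), None
    for sites in combinations(range(len(points)), k):
        cost = sum(min(l1(p, points[s]) for s in sites) for p in points)
        if cost < best_cost:  # strict inequality keeps the first optimum found
            best_cost, best_sites = cost, sites
    return best_cost, best_sites


# Toy site-level covariates (made up): two well-separated groups.
covariates = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
cost, sites = k_median_brute_force(covariates, 2)
print(cost, sites)
```

With these toy covariates the search selects one site from each group, which is the intuition behind choosing experimental sites that are representative of the remaining ones.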

Submitted 17 August, 2024; originally announced August 2024.

arXiv:2405.09509 [pdf, other]

Double Robustness of Local Projections and Some Unpleasant VARithmetic

Authors: José Luis Montiel Olea, Mikkel Plagborg-Møller, Eric Qian, Christian K. Wolf

Abstract: We consider impulse response inference in a locally misspecified vector autoregression (VAR) model. The conventional local projection (LP) confidence interval has correct coverage even when the misspecification is so large that it can be detected with probability approaching 1. This result follows from a "double robustness" property analogous to that of popular partially linear regression estimators. In contrast, the conventional VAR confidence interval with short-to-moderate lag length can severely undercover, even for misspecification that is small, economically plausible, and difficult to detect statistically. There is no free lunch: the VAR confidence interval has robust coverage only if the lag length is so large that the interval is as wide as the LP interval.

Submitted 5 August, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

arXiv:2312.17623 [pdf, other]

Decision Theory for Treatment Choice Problems with Partial Identification

Authors: José Luis Montiel Olea, Chen Qiu, Jörg Stoye

Abstract: We apply classical statistical decision theory to a large class of treatment choice problems with partial identification, revealing important theoretical and practical challenges but also interesting research opportunities. The challenges are: In a general class of problems with Gaussian likelihood, all decision rules are admissible; it is maximin-welfare optimal to ignore all data; and, for severe enough partial identification, there are infinitely many minimax-regret optimal decision rules, all of which sometimes randomize the policy recommendation. The opportunities are: We introduce a profiled regret criterion that can reveal important differences between rules and render some of them inadmissible; and we uniquely characterize the minimax-regret optimal rule that least frequently randomizes. We apply our results to aggregation of experimental estimates for policy adoption, to extrapolation of Local Average Treatment Effects, and to policy making in the presence of omitted variable bias.

Submitted 7 August, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

arXiv:2211.07608 [pdf, ps, other]

The out-of-sample prediction error of the square-root-LASSO and related estimators

Authors: José Luis Montiel Olea, Cynthia Rush, Amilcar Velez, Johannes Wiesel

Abstract: We study the classical problem of predicting an outcome variable, $Y$, using a linear combination of a $d$-dimensional covariate vector, $\mathbf{X}$. We are interested in linear predictors whose coefficients solve: $$\inf_{\boldsymbol{β} \in \mathbb{R}^d} \left( \mathbb{E}_{\mathbb{P}_n} \left[ \left(Y-\mathbf{X}^{\top}\boldsymbol{β}\right)^r \right] \right)^{1/r} + δ\, ρ\left(\boldsymbol{β}\right),$$ where $δ>0$ is a regularization parameter, $ρ:\mathbb{R}^d\to \mathbb{R}_+$ is a convex penalty function, $\mathbb{P}_n$ is the empirical distribution of the data, and $r\geq 1$. We present three sets of new results. First, we provide conditions under which linear predictors based on these estimators solve a distributionally robust optimization problem: they minimize the worst-case prediction error over distributions that are close to each other in a type of max-sliced Wasserstein metric. Second, we provide a detailed finite-sample and asymptotic analysis of the statistical properties of the balls of distributions over which the worst-case prediction error is analyzed. Third, we use the distributionally robust optimality and our statistical analysis to present i) an oracle recommendation for the choice of regularization parameter, $δ$, that guarantees good out-of-sample prediction error; and ii) a test-statistic to rank the out-of-sample performance of two different linear estimators. None of our results rely on sparsity assumptions about the true data generating process; thus, they broaden the scope of use of the square-root lasso and related estimators in prediction problems.
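
As a rough illustration of the penalized objective in the abstract above, the sketch below evaluates it for a given coefficient vector on a toy sample. It is only a sketch under stated assumptions: the function name and data are made up, the penalty $ρ$ is taken to be the L1 norm (the square-root LASSO case with $r = 2$), and the residual is wrapped in an absolute value so the expression is well defined for any $r \geq 1$ (for $r = 2$ this matches the displayed formula).

```python
def penalized_objective(beta, xs, ys, delta, r=2):
    """Evaluate (mean |y - x'beta|^r)^(1/r) + delta * ||beta||_1 on a sample.

    Hypothetical helper for illustration: the L1 penalty and the absolute
    value around the residual are assumptions, not taken from the listing.
    """
    # Empirical r-th moment of the absolute prediction residuals.
    moment = sum(
        abs(y - sum(b * x for b, x in zip(beta, row))) ** r
        for row, y in zip(xs, ys)
    ) / len(ys)
    penalty = sum(abs(b) for b in beta)
    return moment ** (1.0 / r) + delta * penalty


# Toy data (made up): two observations, two covariates.
xs = [(1.0, 0.0), (0.0, 1.0)]
ys = [3.0, 4.0]

no_fit = penalized_objective((0.0, 0.0), xs, ys, delta=0.5)       # pure fit term
perfect_fit = penalized_objective((3.0, 4.0), xs, ys, delta=0.5)  # pure penalty term
print(no_fit, perfect_fit)
```

The two evaluations show the trade-off the regularization parameter $δ$ controls: the zero vector pays only the fit term, while the interpolating vector pays only the penalty.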

Submitted 8 April, 2024; v1 submitted 14 November, 2022; originally announced November 2022.

arXiv:2104.08324 [pdf, ps, other]

On the Robustness to Misspecification of $α$-Posteriors and Their Variational Approximations

Authors: Marco Avella Medina, José Luis Montiel Olea, Cynthia Rush, Amilcar Velez

Abstract: $α$-posteriors and their variational approximations distort standard posterior inference by downweighting the likelihood and introducing variational approximation errors. We show that such distortions, if tuned appropriately, reduce the Kullback-Leibler (KL) divergence from the true, but perhaps infeasible, posterior distribution when there is potential parametric model misspecification. To make this point, we derive a Bernstein-von Mises theorem showing convergence in total variation distance of $α$-posteriors and their variational approximations to limiting Gaussian distributions. We use these distributions to evaluate the KL divergence between true and reported posteriors. We show this divergence is minimized by choosing $α$ strictly smaller than one, assuming there is a vanishingly small probability of model misspecification. The optimized value becomes smaller as the misspecification becomes more severe. The optimized KL divergence increases logarithmically in the degree of misspecification and not linearly as with the usual posterior.

Submitted 16 April, 2021; originally announced April 2021.

arXiv:2009.06111 [pdf, other]

Machine Learning's Dropout Training is Distributionally Robust Optimal

Authors: Jose Blanchet, Yang Kang, Jose Luis Montiel Olea, Viet Anh Nguyen, Xuhui Zhang

Abstract: This paper shows that dropout training in Generalized Linear Models is the minimax solution of a two-player, zero-sum game where an adversarial nature corrupts a statistician's covariates using a multiplicative nonparametric errors-in-variables model. In this game, nature's least favorable distribution is dropout noise, where nature independently deletes entries of the covariate vector with some fixed probability $δ$. This result implies that dropout training indeed provides out-of-sample expected loss guarantees for distributions that arise from multiplicative perturbations of in-sample data. In addition to the decision-theoretic analysis, the paper makes two more contributions. First, there is a concrete recommendation on how to select the tuning parameter $δ$ to guarantee that, as the sample size grows large, the in-sample loss after dropout training exceeds the true population loss with some pre-specified probability. Second, the paper provides a novel, parallelizable, Unbiased Multi-Level Monte Carlo algorithm to speed up the implementation of dropout training. Our algorithm has a much smaller computational cost compared to the naive implementation of dropout, provided the number of data points is much smaller than the dimension of the covariate vector.

Submitted 14 April, 2021; v1 submitted 13 September, 2020; originally announced September 2020.

arXiv:2007.13888 [pdf, ps, other]

doi 10.3982/ECTA18756

Local Projection Inference is Simpler and More Robust Than You Think

Authors: José Luis Montiel Olea, Mikkel Plagborg-Møller

Abstract: Applied macroeconomists often compute confidence intervals for impulse responses using local projections, i.e., direct linear regressions of future outcomes on current covariates. This paper proves that local projection inference robustly handles two issues that commonly arise in applications: highly persistent data and the estimation of impulse responses at long horizons. We consider local projections that control for lags of the variables in the regression. We show that lag-augmented local projections with normal critical values are asymptotically valid uniformly over (i) both stationary and non-stationary data, and also over (ii) a wide range of response horizons. Moreover, lag augmentation obviates the need to correct standard errors for serial correlation in the regression residuals. Hence, local projection inference is arguably both simpler than previously thought and more robust than standard autoregressive inference, whose validity is known to depend sensitively on the persistence of the data and on the length of the horizon.

Submitted 21 December, 2022; v1 submitted 27 July, 2020; originally announced July 2020.

Journal ref: Econometrica, July 2021, Volume 89, Issue 4, Pages 1789-1823

arXiv:1909.12068 [pdf]

doi 10.1016/j.mssp.2019.04.029

High quality Al$_{0.37}$In$_{0.63}$N layers grown at low temperature (<300$^\circ$C) by radio-frequency sputtering

Authors: A Núñez-Cascajero, R. Blasco, S Valdueza-Felip, D. Montero, J. Olea, F. B. Naranjo

Abstract: High-quality Al$_{0.37}$In$_{0.63}$N layers have been grown by reactive radio-frequency (RF) sputtering on sapphire, glass and Si (111) at low substrate temperature (from room temperature to 300°C). Their structural, chemical and optical properties are investigated as a function of the growth temperature and type of substrate. X-ray diffraction measurements reveal that all samples have a wurtzite crystallographic structure oriented with the c-axis perpendicular to the substrate surface, without parasitic orientations. The layers preserve their Al content at 37 % for the whole range of studied growth temperature. The samples grown at low temperatures (RT and 100°C) are almost fully relaxed, showing a closely-packed columnar-like morphology with an RMS surface roughness below 3 nm. The optical band gap energy estimated for layers grown at RT and 100°C on sapphire and glass substrates is ~2.4 eV, while it red-shifts to ~2.03 eV at 300°C. The feasibility of growing high crystalline quality AlInN at low growth temperature, even on amorphous substrates, opens new application fields for this material, such as surface plasmon resonance sensors developed directly on optical fibers and other applications where temperature is a handicap and the material cannot be heated.

Submitted 26 September, 2019; originally announced September 2019.

Journal ref: Materials Science in Semiconductor Processing, Volume 100, September 2019, Pages 8-14

arXiv:1907.03809 [pdf, other]

doi 10.1093/qje/qjac015

Competing Models

Authors: Jose Luis Montiel Olea, Pietro Ortoleva, Mallesh M Pai, Andrea Prat

Abstract: Different agents need to make a prediction. They observe identical data, but have different models: they predict using different explanatory variables. We study which agent believes they have the best predictive ability -- as measured by the smallest subjective posterior mean squared prediction error -- and show how it depends on the sample size. With small samples, we present results suggesting it is an agent using a low-dimensional model. With large samples, it is generally an agent with a high-dimensional model, possibly including irrelevant variables, but never excluding relevant ones. We apply our results to characterize the winning model in an auction of productive assets, to argue that entrepreneurs and investors with simple models will be over-represented in new sectors, and to understand the proliferation of "factors" that explain the cross-sectional variation of expected stock returns in the asset-pricing literature.

Submitted 11 November, 2021; v1 submitted 8 July, 2019; originally announced July 2019.

MSC Class: 62J99; 91B26;

arXiv:1808.01117 [pdf]

Influence of the AlN interlayer thickness on the photovoltaic properties of In-rich AlInN on Si heterojunctions deposited by RF sputtering

Authors: S. Valdueza-Felip, A. Núñez-Cascajero, R. Blasco, D. Montero, L. Grenet, M. de la Mata, S. Fernández, L. Rodríguez-De Marcos, S. I. Molina, J. Olea, F. B. Naranjo

Abstract: We report the influence of the AlN interlayer thickness (0-15 nm) on the photovoltaic properties of Al0.37In0.63N on Si heterojunction solar cells deposited by radio frequency sputtering. The poor junction band alignment and the presence of a 2-3 nm thick amorphous layer at the interface mitigate the response in devices fabricated by direct deposition of n-AlInN on p-Si(111). Adding a 4-nm-thick AlN buffer layer improves the AlInN crystalline quality and the interface alignment, leading to devices with a conversion efficiency of 1.5% under 1-sun AM1.5G illumination. For thicker buffers the performance lessens due to inefficient tunnel transport through the AlN. These results demonstrate the feasibility of using In-rich AlInN alloys deposited by radio frequency sputtering as novel electron-selective contacts to Si-heterojunction solar cells.

Submitted 3 August, 2018; originally announced August 2018.

Showing 1–12 of 12 results for author: Olea, J