Tuning parameter selection in econometrics

Abstract: I review some of the main methods for selecting tuning parameters in nonparametric and $\ell_1$-penalized estimation. For the nonparametric estimation, I consider the methods of Mallows, Stein, Lepski, cross-validation, penalization, and aggregation in the context of series estimation. For the $\ell_1$-penalized estimation, I consider the methods based on the theory of self-normalized moderate deviations, bootstrap, Stein's unbiased risk estimation, and cross-validation in the context of Lasso estimation. I explain the intuition behind each of the methods and discuss their comparative advantages. I also give some extensions.

Submitted 5 May, 2024; originally announced May 2024.

Comments: 41 pages, 1 table

MSC Class: 62-02

arXiv:2404.08129 [pdf, other]

One Factor to Bind the Cross-Section of Returns

Authors: Nicola Borri, Denis Chetverikov, Yukun Liu, Aleh Tsyvinski

Abstract: We propose a new non-linear single-factor asset pricing model $r_{it}=h(f_{t}λ_{i})+ε_{it}$. Despite its parsimony, this model represents exactly any non-linear model with an arbitrary number of factors and loadings -- a consequence of the Kolmogorov-Arnold representation theorem. It features only one pricing component $h(f_{t}λ_{i})$, comprising a nonparametric link function of the time-dependent factor and factor loading that we jointly estimate with sieve-based estimators. Using 171 assets across major classes, our model delivers superior cross-sectional performance with a low-dimensional approximation of the link function. Most known finance and macro factors become insignificant controlling for our single-factor.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2401.15205 [pdf, other]

csranks: An R Package for Estimation and Inference Involving Ranks

Authors: Denis Chetverikov, Magne Mogstad, Pawel Morgen, Joseph Romano, Azeem Shaikh, Daniel Wilhelm

Abstract: This article introduces the R package csranks for estimation and inference involving ranks. First, we review methods for the construction of confidence sets for ranks, namely marginal and simultaneous confidence sets as well as confidence sets for the identities of the tau-best. Second, we review methods for estimation and inference in regressions involving ranks. Third, we describe the implementation of these methods in csranks and illustrate their usefulness in two examples: one about the quantification of uncertainty in the PISA ranking of countries and one about the measurement of intergenerational mobility using rank-rank regressions.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2312.10333 [pdf, ps, other]

Logit-based alternatives to two-stage least squares

Authors: Denis Chetverikov, Jinyong Hahn, Zhipeng Liao, Shuyang Sheng

Abstract: We propose logit-based IV and augmented logit-based IV estimators that serve as alternatives to the traditionally used 2SLS estimator in the model where both the endogenous treatment variable and the corresponding instrument are binary. Our novel estimators are as easy to compute as the 2SLS estimator but have an advantage over the 2SLS estimator in terms of causal interpretability. In particular, in certain cases where the probability limits of both our estimators and the 2SLS estimator take the form of weighted-average treatment effects, our estimators are guaranteed to yield non-negative weights whereas the 2SLS estimator is not.

Submitted 16 December, 2023; originally announced December 2023.

Comments: 29 pages

arXiv:2310.15512 [pdf, other]

Inference for Rank-Rank Regressions

Authors: Denis Chetverikov, Daniel Wilhelm

Abstract: The slope coefficient in a rank-rank regression is a popular measure of intergenerational mobility. In this article, we first show that commonly used inference methods for this slope parameter are invalid. Second, when the underlying distribution is not continuous, the OLS estimator and its asymptotic distribution may be highly sensitive to how ties in the ranks are handled. Motivated by these findings, we develop a new asymptotic theory for the OLS estimator in a general class of rank-rank regression specifications without imposing any assumptions about the continuity of the underlying distribution. We then extend the asymptotic theory to other regressions involving ranks that have been used in empirical work. Finally, we apply our new inference methods to two empirical studies on intergenerational mobility, highlighting the practical implications of our theoretical findings.

Submitted 2 July, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

MSC Class: 62J99

arXiv:2303.10306 [pdf, ps, other]

Standard errors when a regressor is randomly assigned

Authors: Denis Chetverikov, Jinyong Hahn, Zhipeng Liao, Andres Santos

Abstract: We examine asymptotic properties of the OLS estimator when the values of the regressor of interest are assigned randomly and independently of other regressors. We find that the OLS variance formula in this case is often simplified, sometimes substantially. In particular, when the regressor of interest is independent not only of other regressors but also of the error term, the textbook homoscedastic variance formula is valid even if the error term and auxiliary regressors exhibit a general dependence structure. In the context of randomized controlled trials, this conclusion holds in completely randomized experiments with constant treatment effects. When the error term is heteroscedastic with respect to the regressor of interest, the variance formula has to be adjusted not only for heteroscedasticity but also for the correlation structure of the error term. However, even in the latter case, some simplifications are possible as only a part of the correlation structure of the error term should be taken into account. In the context of randomized controlled trials, this implies that the textbook homoscedastic variance formula is typically not valid if treatment effects are heterogeneous but heteroscedasticity-robust variance formulas are valid if treatment effects are independent across units, even if the error term exhibits a general dependence structure.
In addition, we extend the results to the case when the regressor of interest is assigned randomly at a group level, such as in randomized controlled trials with treatment assignment determined at a group (e.g., school/village) level.

Submitted 17 March, 2023; originally announced March 2023.

Comments: 27 pages

MSC Class: 62J20

arXiv:2212.13324 [pdf, ps, other]

Spectral and post-spectral estimators for grouped panel data models

Authors: Denis Chetverikov, Elena Manresa

Abstract: In this paper, we develop spectral and post-spectral estimators for grouped panel data models. Both estimators are consistent in the asymptotics where the number of observations $N$ and the number of time periods $T$ simultaneously grow large. In addition, the post-spectral estimator is $\sqrt{NT}$-consistent and asymptotically normal with mean zero under the assumption of well-separated groups even if $T$ is growing much slower than $N$. The post-spectral estimator has, therefore, theoretical properties that are comparable to those of the grouped fixed-effect estimator developed by Bonhomme and Manresa (2015). In contrast to the grouped fixed-effect estimator, however, our post-spectral estimator is computationally straightforward.

Submitted 29 December, 2022; v1 submitted 26 December, 2022; originally announced December 2022.

Comments: 61 pages

MSC Class: 62J02

arXiv:2205.09691 [pdf, other]

High-dimensional Data Bootstrap

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato, Yuta Koike

Abstract: This article reviews recent progress in high-dimensional bootstrap. We first review high-dimensional central limit theorems for distributions of sample mean vectors over the rectangles, bootstrap consistency results in high dimensions, and key techniques used to establish those results. We then review selected applications of high-dimensional bootstrap: construction of simultaneous confidence sets for high-dimensional vector parameters, multiple hypothesis testing via stepdown, post-selection inference, intersection bounds for partially identified parameters, and inference on best policies in policy evaluation. Finally, we also comment on a couple of future research directions.

Submitted 19 May, 2022; originally announced May 2022.

Comments: 27 pages; review article

arXiv:2203.03032 [pdf, other]

Weighted-average quantile regression

Authors: Denis Chetverikov, Yukun Liu, Aleh Tsyvinski

Abstract: In this paper, we introduce the weighted-average quantile regression framework, $\int_0^1 q_{Y|X}(u)ψ(u)du = X'β$, where $Y$ is a dependent variable, $X$ is a vector of covariates, $q_{Y|X}$ is the quantile function of the conditional distribution of $Y$ given $X$, $ψ$ is a weighting function, and $β$ is a vector of parameters. We argue that this framework is of interest in many applied settings and develop an estimator of the vector of parameters $β$. We show that our estimator is $\sqrt T$-consistent and asymptotically normal with mean zero and easily estimable covariance matrix, where $T$ is the size of the available sample. We demonstrate the usefulness of our estimator by applying it in two empirical settings. In the first setting, we focus on financial data and study the factor structures of the expected shortfalls of the industry portfolios. In the second setting, we focus on wage data and study inequality and social welfare dependence on commonly used individual characteristics.

Submitted 6 March, 2022; originally announced March 2022.

Comments: 69 pages

MSC Class: 62J02
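
The left-hand side of the framework above, $\int_0^1 q(u)ψ(u)du$, can be illustrated without covariates. The sketch below is only an illustration under stated assumptions, not the paper's estimator: it approximates the integral with the empirical quantile function at midpoints, and it assumes the lower-tail weight $ψ(u) = 1\{u \le α\}/α$, which is the standard expected-shortfall weighting. The function names are hypothetical.

```python
def weighted_average_quantile(sample, psi):
    """Approximate the integral of q(u) * psi(u) over (0, 1).

    The i-th order statistic stands in for the quantile at the
    midpoint u = (i + 0.5) / n, so the integral becomes a weighted mean.
    """
    ys = sorted(sample)
    n = len(ys)
    if n == 0:
        raise ValueError("sample must be non-empty")
    return sum(y * psi((i + 0.5) / n) for i, y in enumerate(ys)) / n


def expected_shortfall_weight(alpha):
    """Assumed weight psi(u) = 1{u <= alpha} / alpha (lower-tail average)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return lambda u: 1.0 / alpha if u <= alpha else 0.0


if __name__ == "__main__":
    data = [4.0, 1.0, 3.0, 2.0]
    # psi = 1 recovers the sample mean; the tail weight averages the lower half.
    print(weighted_average_quantile(data, lambda u: 1.0))
    print(weighted_average_quantile(data, expected_shortfall_weight(0.5)))
```

With a constant weight the functional reduces to the mean, which is a quick sanity check on any other choice of $ψ$.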

arXiv:2104.04716 [pdf, other]

Selecting Penalty Parameters of High-Dimensional M-Estimators using Bootstrapping after Cross-Validation

Authors: Denis Chetverikov, Jesper Riis-Vestergaard Sørensen

Abstract: We develop a new method for selecting the penalty parameter for $\ell_1$-penalized M-estimators in high dimensions, which we refer to as bootstrapping after cross-validation. We derive rates of convergence for the corresponding $\ell_1$-penalized M-estimator and also for the post-$\ell_1$-penalized M-estimator, which refits the non-zero parameters of the former estimator without penalty in the criterion function. We demonstrate via simulations that our method is not dominated by cross-validation in terms of estimation errors and outperforms cross-validation in terms of inference. As an illustration, we revisit Fryer Jr (2019), who investigated racial differences in police use of force, and confirm his findings.

Submitted 14 August, 2023; v1 submitted 10 April, 2021; originally announced April 2021.

Comments: 147 pages, 12 figures

arXiv:2012.09513 [pdf, ps, other]

Nearly optimal central limit theorem and bootstrap approximations in high dimensions

Authors: Victor Chernozhukov, Denis Chetverikov, Yuta Koike

Abstract: In this paper, we derive new, nearly optimal bounds for the Gaussian approximation to scaled averages of $n$ independent high-dimensional centered random vectors $X_1,\dots,X_n$ over the class of rectangles in the case when the covariance matrix of the scaled average is non-degenerate. In the case of bounded $X_i$'s, the implied bound for the Kolmogorov distance between the distribution of the scaled average and the Gaussian vector takes the form $$C (B^2_n \log^3 d/n)^{1/2} \log n,$$ where $d$ is the dimension of the vectors and $B_n$ is a uniform envelope constant on components of $X_i$'s. This bound is sharp in terms of $d$ and $B_n$, and is nearly (up to $\log n$) sharp in terms of the sample size $n$. In addition, we show that similar bounds hold for the multiplier and empirical bootstrap approximations. Moreover, we establish bounds that allow for unbounded $X_i$'s, formulated solely in terms of moments of $X_i$'s. Finally, we demonstrate that the bounds can be further improved in some special smooth and zero-skewness cases.

Submitted 12 May, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

Comments: 60 pages. We corrected a mistake in v1. Lemmas 6.1-6.3 are reformulated for general rectangles

MSC Class: 60F05; 62E17
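
To get a feel for how the bound $C (B^2_n \log^3 d/n)^{1/2} \log n$ scales, one can simply evaluate the expression. The sketch below does only that. The constant $C$ is not given in the abstract, so it is set to 1 purely as an assumption; the returned numbers show relative scaling in $n$, $d$, and $B_n$ and are not actual error bounds.

```python
import math


def gaussian_approx_rate(n, d, b_n, c=1.0):
    """Evaluate c * (b_n^2 * log(d)^3 / n)^(1/2) * log(n).

    c = 1.0 is a placeholder: the true constant is unspecified here.
    """
    if n < 2 or d < 2:
        raise ValueError("n and d must be at least 2")
    return c * math.sqrt(b_n ** 2 * math.log(d) ** 3 / n) * math.log(n)


if __name__ == "__main__":
    # The dimension enters only through log(d)^(3/2), so d may be far
    # larger than n while the expression still shrinks as n grows.
    for n in (10_000, 1_000_000):
        for d in (100, 1_000_000):
            print(n, d, round(gaussian_approx_rate(n, d, b_n=1.0), 4))
```

The expression is linear in $B_n$ and grows only polylogarithmically in $d$, which is why the high-dimensional regime $d \gg n$ is still covered.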

arXiv:1912.10529 [pdf, ps, other]

Improved Central Limit Theorem and bootstrap approximations in high dimensions

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato, Yuta Koike

Abstract: This paper deals with the Gaussian and bootstrap approximations to the distribution of the max statistic in high dimensions. This statistic takes the form of the maximum over components of the sum of independent random vectors and its distribution plays a key role in many high-dimensional econometric problems. Using a novel iterative randomized Lindeberg method, the paper derives new bounds for the distributional approximation errors. These new bounds substantially improve upon existing ones and simultaneously allow for a larger class of bootstrap methods.

Submitted 29 May, 2022; v1 submitted 22 December, 2019; originally announced December 2019.

Comments: 63 pages

arXiv:1812.05490 [pdf, other]

Deep Face Image Retrieval: a Comparative Study with Dictionary Learning

Authors: Ahmad S. Tarawneh, Ahmad B. A. Hassanat, Ceyhun Celik, Dmitry Chetverikov, M. Sohel Rahman, Chaman Verma

Abstract: Facial image retrieval is a challenging task since faces have many similar features (areas), which makes it difficult for the retrieval systems to distinguish faces of different people. With the advent of deep learning, deep networks are often applied to extract powerful features that are used in many areas of computer vision. This paper investigates the application of different deep learning models for face image retrieval, namely, Alexlayer6, Alexlayer7, VGG16layer6, VGG16layer7, VGG19layer6, and VGG19layer7, with two types of dictionary learning techniques, namely $K$-means and $K$-SVD. We also investigate some coefficient learning techniques such as the Homotopy, Lasso, Elastic Net and SSF and their effect on the face retrieval system. The comparative results of the experiments conducted on three standard face image datasets show that the best performers for face image retrieval are Alexlayer7 with $K$-means and SSF, Alexlayer6 with $K$-SVD and SSF, and Alexlayer6 with $K$-means and SSF. The APR and ARR of these methods were further compared to some of the state-of-the-art methods based on local descriptors. The experimental results show that deep learning outperforms most of those methods and therefore can be recommended for use in practice of face image retrieval.

Submitted 13 December, 2018; originally announced December 2018.

arXiv:1811.09681 [pdf, other]

doi 10.3233/IDA-180895

Detailed Investigation of Deep Features with Sparse Representation and Dimensionality Reduction in CBIR: A Comparative Study

Authors: Ahmad S. Tarawneh, Ceyhun Celik, Ahmad B. Hassanat, Dmitry Chetverikov

Abstract: Research on content-based image retrieval (CBIR) has been under development for decades, and numerous methods have been competing to extract the most discriminative features for improved representation of the image content. Recently, deep learning methods have gained attention in computer vision, including CBIR. In this paper, we present a comparative investigation of different features, including low-level and high-level features, for CBIR. We compare the performance of CBIR systems using different deep features with state-of-the-art low-level features such as SIFT, SURF, HOG, LBP, and LTP, using different dictionaries and coefficient learning techniques. Furthermore, we conduct comparisons with a set of primitive and popular features that have been used in this field, including colour histograms and Gabor features. We also investigate the discriminative power of deep features using certain similarity measures under different validation approaches. Furthermore, we investigate the effects of the dimensionality reduction of deep features on the performance of CBIR systems using principal component analysis, discrete wavelet transform, and discrete cosine transform. Unprecedentedly, the experimental results demonstrate high (95\% and 93\%) mean average precisions when using the VGG-16 FC7 deep features of Corel-1000 and Coil-20 datasets with 10-D and 20-D K-SVD, respectively.

Submitted 23 November, 2018; originally announced November 2018.

Journal ref: Intelligent Data Analysis, vol. 24, no. 1, 2020

arXiv:1806.01888 [pdf, other]

High-Dimensional Econometrics and Regularized GMM

Authors: Alexandre Belloni, Victor Chernozhukov, Denis Chetverikov, Christian Hansen, Kengo Kato

Abstract: This chapter presents key concepts and theoretical results for analyzing estimation and inference in high-dimensional models. High-dimensional models are characterized by having a number of unknown parameters that is not vanishingly small relative to the sample size. We first present results in a framework where estimators of parameters of interest may be represented directly as approximate means. Within this context, we review fundamental results including high-dimensional central limit theorems, bootstrap approximation of high-dimensional limit distributions, and moderate deviation theory. We also review key concepts underlying inference when many parameters are of interest such as multiple testing with family-wise error rate or false discovery rate control. We then turn to a general high-dimensional minimum distance framework with a special focus on generalized method of moments problems where we present results for estimation and inference about model parameters. The presented results cover a wide array of econometric applications, and we discuss several leading special cases including high-dimensional linear regression and linear instrumental variables models to illustrate the general results.

Submitted 10 June, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

Comments: 104 pages, 4 figures

arXiv:1804.04602 [pdf]

Pilot Comparative Study of Different Deep Features for Palmprint Identification in Low-Quality Images

Authors: A. S. Tarawneh, D. Chetverikov, A. B. Hassanat

Abstract: Deep Convolutional Neural Networks (CNNs) are widespread, efficient tools of visual recognition. In this paper, we present a comparative study of three popular pre-trained CNN models: AlexNet, VGG-16 and VGG-19. We address the problem of palmprint identification in low-quality imagery and apply Support Vector Machines (SVMs) with all of the compared models. For the comparison, we use the MOHI palmprint image database whose images are characterized by low contrast, shadows, and varying illumination, scale, translation and rotation. Another, high-quality database called COEP is also considered to study the recognition gap between high-quality and low-quality imagery. Our experiments show that the deeper pre-trained CNN models, e.g., VGG-16 and VGG-19, tend to extract highly distinguishable features that recognize low-quality palmprints more efficiently than the less deep networks such as AlexNet. Furthermore, our experiments on the two databases using various models demonstrate that the features extracted from lower-level fully connected layers provide higher recognition rates than higher-layer features. Our results indicate that different pre-trained models can be efficiently used in touchless identification systems with low-quality palmprint images.

Submitted 9 April, 2018; originally announced April 2018.

Comments: 5 pages, 5 figures, Ninth Hungarian Conference on Computer Graphics and Geometry, Budapest, 2018

Journal ref: Ninth Hungarian Conference on Computer Graphics and Geometry, Budapest, 2018

arXiv:1711.10696 [pdf, ps, other]

Detailed proof of Nazarov's inequality

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: The purpose of this note is to provide a detailed proof of Nazarov's inequality stated in Lemma A.1 in Chernozhukov, Chetverikov, and Kato (2017, Annals of Probability).

Submitted 29 November, 2017; originally announced November 2017.

Comments: This note is designated only for arXiv

arXiv:1701.08687 [pdf, ps, other]

Double/Debiased/Neyman Machine Learning of Treatment Effects

Authors: Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey

Abstract: Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, and Newey (2016) provide a generic double/de-biased machine learning (DML) approach for obtaining valid inferential statements about focal parameters, using Neyman-orthogonal scores and cross-fitting, in settings where nuisance parameters are estimated using a new generation of nonparametric fitting methods for high-dimensional data, called machine learning methods. In this note, we illustrate the application of this method in the context of estimating average treatment effects (ATE) and average treatment effects on the treated (ATTE) using observational data. A more general discussion and references to the existing literature are available in Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, and Newey (2016).

Submitted 30 January, 2017; originally announced January 2017.

Comments: Conference paper, forthcoming in American Economic Review, Papers and Proceedings, 2017. arXiv admin note: text overlap with arXiv:1608.00060

arXiv:1608.00060 [pdf, other]

Double/Debiased Machine Learning for Treatment and Causal Parameters

Authors: Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, James Robins

Abstract: Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, estimates of such causal parameters obtained via naively plugging ML estimators into estimating equations for such parameters can behave very poorly due to the regularization bias. Fortunately, this regularization bias can be removed by solving auxiliary prediction problems via ML tools. Specifically, we can form an orthogonal score for the target low-dimensional parameter by combining auxiliary and main ML predictions. The score is then used to build a de-biased estimator of the target parameter which typically will converge at the fastest possible $1/\sqrt{n}$ rate and be approximately unbiased and normal, and from which valid confidence intervals for these parameters of interest may be constructed. The resulting method thus could be called a "double ML" method because it relies on estimating primary and auxiliary predictive models. In order to avoid overfitting, our construction also makes use of the K-fold sample splitting, which we call cross-fitting.
This allows us to use a very broad set of ML predictive methods in solving the auxiliary and main prediction problems, such as random forest, lasso, ridge, deep neural nets, boosted trees, as well as various hybrids and aggregators of these methods.

Submitted 12 December, 2017; v1 submitted 29 July, 2016; originally announced August 2016.

Comments: 71 pages, 2 figures

MSC Class: 62G
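
The combination of an orthogonal score with K-fold cross-fitting described in the abstract above can be sketched in a few lines. This is a minimal illustration under assumptions that are not in the listing: a partially linear model $Y = θD + g(X) + U$ with scalar $X$ in $[0, 1)$, a residual-on-residual (partialling-out) score, simulated data, and a simple binned-mean regressor standing in for an ML learner. All function names are hypothetical.

```python
import math
import random


def fit_binned_mean(xs, ys, n_bins=20):
    """Binned-mean regressor on [0, 1): a stand-in for any ML learner."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)  # fallback for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]


def dml_partially_linear(y, d, x, n_folds=5, seed=0):
    """Cross-fitted residual-on-residual estimate of theta."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    num = den = 0.0
    for k in range(n_folds):
        # Nuisance functions are fitted on the other folds only.
        train = [i for j in range(n_folds) if j != k for i in folds[j]]
        l_hat = fit_binned_mean([x[i] for i in train], [y[i] for i in train])
        m_hat = fit_binned_mean([x[i] for i in train], [d[i] for i in train])
        for i in folds[k]:  # residuals on the held-out fold
            y_res = y[i] - l_hat(x[i])
            d_res = d[i] - m_hat(x[i])
            num += d_res * y_res
            den += d_res * d_res
    return num / den


def simulate(n=2000, theta=0.5, seed=1):
    """Simulated data: D = X^2 + V, Y = theta * D + sin(2*pi*X) + U."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    d = [xi ** 2 + rng.gauss(0, 1) for xi in x]
    y = [theta * di + math.sin(2 * math.pi * xi) + rng.gauss(0, 1)
         for di, xi in zip(d, x)]
    return y, d, x


if __name__ == "__main__":
    y, d, x = simulate()
    print(dml_partially_linear(y, d, x))  # should land near theta = 0.5
```

Because the nuisance fits never see the observations they are evaluated on, overfitting in the learner does not leak into the final residual regression, which is the role cross-fitting plays in the abstract.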

arXiv:1605.02214 [pdf, ps, other]

On cross-validated Lasso in high dimensions

Authors: Denis Chetverikov, Zhipeng Liao, Victor Chernozhukov

Abstract: In this paper, we derive non-asymptotic error bounds for the Lasso estimator when the penalty parameter for the estimator is chosen using $K$-fold cross-validation. Our bounds imply that the cross-validated Lasso estimator has nearly optimal rates of convergence in the prediction, $L^2$, and $L^1$ norms. For example, we show that in the model with the Gaussian noise and under fairly general assumptions on the candidate set of values of the penalty parameter, the estimation error of the cross-validated Lasso estimator converges to zero in the prediction norm with the $\sqrt{s\log p / n}\times \sqrt{\log(p n)}$ rate, where $n$ is the sample size of available data, $p$ is the number of covariates, and $s$ is the number of non-zero coefficients in the model. Thus, the cross-validated Lasso estimator achieves the fastest possible rate of convergence in the prediction norm up to a small logarithmic factor $\sqrt{\log(p n)}$, and similar conclusions apply for the convergence rate both in $L^2$ and in $L^1$ norms. Importantly, our results cover the case when $p$ is (potentially much) larger than $n$ and also allow for the case of non-Gaussian noise. Our paper therefore serves as a justification for the widespread practice of using cross-validation as a method to choose the penalty parameter for the Lasso estimator.

Submitted 6 February, 2020; v1 submitted 7 May, 2016; originally announced May 2016.

arXiv:1512.07619 [pdf, ps, other]

Uniformly Valid Post-Regularization Confidence Regions for Many Functional Parameters in Z-Estimation Framework

Authors: Alexandre Belloni, Victor Chernozhukov, Denis Chetverikov, Ying Wei

Abstract: In this paper we develop procedures to construct simultaneous confidence bands for $\tilde p$ potentially infinite-dimensional parameters after model selection for general moment condition models where $\tilde p$ is potentially much larger than the sample size of available data, $n$. This allows us to cover settings with functional response data where each of the $\tilde p$ parameters is a function. The procedure is based on the construction of score functions that satisfy a certain orthogonality condition. The proposed simultaneous confidence bands rely on uniform central limit theorems for high-dimensional vectors (and not on Donsker arguments as we allow for $\tilde p \gg n$). To construct the bands, we employ a multiplier bootstrap procedure which is computationally efficient as it only involves resampling the estimated score functions (and does not require resolving the high-dimensional optimization problems). We formally apply the general theory to inference on the regression coefficient process in the distribution regression model with a logistic link, where two implementations are analyzed in detail. Simulations and an application to real data are provided to help illustrate the applicability of the results.

Submitted 3 February, 2019; v1 submitted 23 December, 2015; originally announced December 2015.

Comments: 2 figures

arXiv:1507.05270 [pdf, other]

Nonparametric instrumental variable estimation under monotonicity

Authors: Denis Chetverikov, Daniel Wilhelm

Abstract: The ill-posedness of the inverse problem of recovering a regression function in a nonparametric instrumental variable model leads to estimators that may suffer from a very slow, logarithmic rate of convergence. In this paper, we show that restricting the problem to models with monotone regression functions and monotone instruments significantly weakens the ill-posedness of the problem. In stark contrast to the existing literature, the presence of a monotone instrument implies boundedness of our measure of ill-posedness when restricted to the space of monotone functions. Based on this result we derive a novel non-asymptotic error bound for the constrained estimator that imposes monotonicity of the regression function. For a given sample size, the bound is independent of the degree of ill-posedness as long as the regression function is not too steep. As an implication, the bound allows us to show that the constrained estimator converges at a fast, polynomial rate, independently of the degree of ill-posedness, in a large, but slowly shrinking neighborhood of constant functions. Our simulation study demonstrates significant finite-sample performance gains from imposing monotonicity even when the regression function is rather far from being a constant. We apply the constrained estimator to the problem of estimating gasoline demand functions from U.S. data.

Submitted 19 July, 2015; originally announced July 2015.

arXiv:1502.00352 [pdf, ps, other]

Empirical and multiplier bootstraps for suprema of empirical processes of increasing complexity, and related Gaussian couplings

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: We derive strong approximations to the supremum of the non-centered empirical process indexed by a possibly unbounded VC-type class of functions by the suprema of the Gaussian and bootstrap processes. The bounds of these approximations are non-asymptotic, which allows us to work with classes of functions whose complexity increases with the sample size. The construction of couplings is not of the Hungarian type and is instead based on the Slepian-Stein methods and Gaussian comparison inequalities. The increasing complexity of classes of functions and non-centrality of the processes make the results useful for applications in modern nonparametric statistics (Giné and Nickl, 2015), in particular allowing us to study the power properties of nonparametric tests using Gaussian and bootstrap approximations.

Submitted 6 September, 2015; v1 submitted 1 February, 2015; originally announced February 2015.

arXiv:1412.3661 [pdf, ps, other]

Central Limit Theorems and Bootstrap in High Dimensions

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: This paper derives central limit and bootstrap theorems for probabilities that sums of centered high-dimensional random vectors hit hyperrectangles and sparsely convex sets. Specifically, we derive Gaussian and bootstrap approximations for probabilities $\Pr(n^{-1/2}\sum_{i=1}^n X_i\in A)$ where $X_1,\dots,X_n$ are independent random vectors in $\mathbb{R}^p$ and $A$ is a hyperrectangle, or, more generally, a sparsely convex set, and show that the approximation error converges to zero even if $p=p_n\to \infty$ as $n \to \infty$ and $p \gg n$; in particular, $p$ can be as large as $O(e^{Cn^c})$ for some constants $c,C>0$. The result holds uniformly over all hyperrectangles, or more generally, sparsely convex sets, and does not require any restriction on the correlation structure among coordinates of $X_i$. Sparsely convex sets are sets that can be represented as intersections of many convex sets whose indicator functions depend only on a small subset of their arguments, with hyperrectangles being a special case.

Submitted 8 March, 2016; v1 submitted 11 December, 2014; originally announced December 2014.

Comments: 43 pages; minor revision of the previous version

arXiv:1312.7614 [pdf, ps, other]

Inference on causal and structural parameters using many moment inequalities

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: This paper considers the problem of testing many moment inequalities where the number of moment inequalities, denoted by $p$, is possibly much larger than the sample size $n$. There is a variety of economic applications where solving this problem allows one to carry out inference on causal and structural parameters; a notable example is the market structure model of Ciliberto and Tamer (2009) where $p=2^{m+1}$ with $m$ being the number of firms that could possibly enter the market. We consider the test statistic given by the maximum of $p$ Studentized (or $t$-type) inequality-specific statistics, and analyze various ways to compute critical values for the test statistic. Specifically, we consider critical values based upon (i) the union bound combined with a moderate deviation inequality for self-normalized sums, (ii) the multiplier and empirical bootstraps, and (iii) two-step and three-step variants of (i) and (ii) by incorporating the selection of uninformative inequalities that are far from being binding and a novel selection of weakly informative inequalities that are potentially binding but do not provide first order information. We prove validity of these methods, showing that under mild conditions, they lead to tests with the error in size decreasing polynomially in $n$ while allowing for $p$ being much larger than $n$; indeed, $p$ can be of order $\exp (n^{c})$ for some $c > 0$. Importantly, all these results hold without any restriction on the correlation structure between $p$ Studentized statistics, and also hold uniformly with respect to suitably large classes of underlying distributions. Moreover, in the online supplement, we show validity of a test based on the block multiplier bootstrap in the case of dependent data under some general mixing conditions.

Submitted 18 October, 2018; v1 submitted 29 December, 2013; originally announced December 2013.

Comments: This paper was previously circulated under the title "Testing many moment inequalities"

arXiv:1303.7152 [pdf, ps, other]

doi 10.1214/14-AOS1235

Anti-concentration and honest, adaptive confidence bands

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: Modern construction of uniform confidence bands for nonparametric densities (and other functions) often relies on the classical Smirnov-Bickel-Rosenblatt (SBR) condition; see, for example, Giné and Nickl [Probab. Theory Related Fields 143 (2009) 569-596]. This condition requires the existence of a limit distribution of an extreme value type for the supremum of a studentized empirical process (equivalently, for the supremum of a Gaussian process with the same covariance function as that of the studentized empirical process). The principal contribution of this paper is to remove the need for this classical condition. We show that a considerably weaker sufficient condition is derived from an anti-concentration property of the supremum of the approximating Gaussian process, and we derive an inequality leading to such a property for separable Gaussian processes. We refer to the new condition as a generalized SBR condition. Our new result shows that the supremum does not concentrate too fast around any value. We then apply this result to derive a Gaussian multiplier bootstrap procedure for constructing honest confidence bands for nonparametric density estimators (this result can be applied in other nonparametric problems as well). An essential advantage of our approach is that it applies generically even in those cases where the limit distribution of the supremum of the studentized empirical process does not exist (or is unknown). This is of particular importance in problems where resolution levels or other tuning parameters have been chosen in a data-driven fashion, which is needed for adaptive constructions of the confidence bands. Finally, of independent interest is our introduction of a new, practical version of Lepski's method, which computes the optimal, nonconservative resolution levels via a Gaussian multiplier bootstrap method.

Submitted 23 September, 2014; v1 submitted 28 March, 2013; originally announced March 2013.

Comments: Published at http://dx.doi.org/10.1214/14-AOS1235 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1235

Journal ref: Annals of Statistics 2014, Vol. 42, No. 5, 1787-1818

arXiv:1301.4807 [pdf, ps, other]

Comparison and anti-concentration bounds for maxima of Gaussian random vectors

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: Slepian and Sudakov-Fernique type inequalities, which compare expectations of maxima of Gaussian random vectors under certain restrictions on the covariance matrices, play an important role in probability theory, especially in empirical process and extreme value theories. Here we give explicit comparisons of expectations of smooth functions and distribution functions of maxima of Gaussian random vectors without any restriction on the covariance matrices. We also establish an anti-concentration inequality for the maximum of a Gaussian random vector, which derives a useful upper bound on the Lévy concentration function for the Gaussian maximum. The bound is dimension-free and applies to vectors with arbitrary covariance matrices. This anti-concentration inequality plays a crucial role in establishing bounds on the Kolmogorov distance between maxima of Gaussian random vectors. These results have immediate applications in mathematical statistics. As an example of application, we establish a conditional multiplier central limit theorem for maxima of sums of independent random vectors where the dimension of the vectors is possibly much larger than the sample size.

Submitted 12 April, 2014; v1 submitted 21 January, 2013; originally announced January 2013.

Comments: 22 pages; discussions and references updated

MSC Class: 60G15; 60E15; 62E20

arXiv:1212.6906 [pdf, ps, other]

doi 10.1214/13-AOS1161

Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: We derive a Gaussian approximation result for the maximum of a sum of high-dimensional random vectors. Specifically, we establish conditions under which the distribution of the maximum is approximated by that of the maximum of a sum of the Gaussian random vectors with the same covariance matrices as the original vectors. This result applies when the dimension of random vectors ($p$) is large compared to the sample size ($n$); in fact, $p$ can be much larger than $n$, without restricting correlations of the coordinates of these vectors. We also show that the distribution of the maximum of a sum of the random vectors with unknown covariance matrices can be consistently estimated by the distribution of the maximum of a sum of the conditional Gaussian random vectors obtained by multiplying the original vectors with i.i.d. Gaussian multipliers. This is the Gaussian multiplier (or wild) bootstrap procedure. Here too, $p$ can be large or even much larger than $n$. These distributional approximations, either Gaussian or conditional Gaussian, yield a high-quality approximation to the distribution of the original maximum, often with approximation error decreasing polynomially in the sample size, and hence are of interest in many applications. We demonstrate how our Gaussian approximations and the multiplier bootstrap can be used for modern high-dimensional estimation, multiple hypothesis testing, and adaptive specification testing. All these results contain nonasymptotic bounds on approximation errors.

Submitted 22 January, 2018; v1 submitted 31 December, 2012; originally announced December 2012.

Comments: A minor typo has been corrected (last line, page 22, where \max_{j \in w} was missing)

Report number: IMS-AOS-AOS1161

Journal ref: Annals of Statistics 2013, Vol. 41, No. 6, 2786-2819

arXiv:1212.6885 [pdf, ps, other]

doi 10.1214/14-AOS1230

Gaussian approximation of suprema of empirical processes

Authors: Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: This paper develops a new direct approach to approximating suprema of general empirical processes by a sequence of suprema of Gaussian processes, without taking the route of approximating whole empirical processes in the sup-norm. We prove an abstract approximation theorem applicable to a wide variety of statistical problems, such as construction of uniform confidence bands for functions. Notably, the bound in the main approximation theorem is nonasymptotic and the theorem allows for functions that index the empirical process to be unbounded and have entropy divergent with the sample size. The proof of the approximation theorem builds on a new coupling inequality for maxima of sums of random vectors, the proof of which depends on an effective use of Stein's method for normal approximation, and some new empirical process techniques. We study applications of this approximation theorem to local and series empirical processes arising in nonparametric estimation via kernel and series methods, where the classes of functions change with the sample size and are non-Donsker. Importantly, our new technique is able to prove the Gaussian approximation for the supremum type statistics under weak regularity conditions, especially concerning the bandwidth and the number of series functions, in those examples.

Submitted 17 August, 2014; v1 submitted 31 December, 2012; originally announced December 2012.

Comments: This is the full version of the paper published at http://dx.doi.org/10.1214/14-AOS1230 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1230

Journal ref: Annals of Statistics 2014, Vol. 42, No. 4, 1564-1597

arXiv:1212.6757 [pdf, other]

doi 10.1017/S0266466618000282

Testing Regression Monotonicity in Econometric Models

Authors: Denis Chetverikov

Abstract: Monotonicity is a key qualitative prediction of a wide array of economic models derived via robust comparative statics. It is therefore important to design effective and practical econometric methods for testing this prediction in empirical analysis. This paper develops a general nonparametric framework for testing monotonicity of a regression function. Using this framework, a broad class of new tests is introduced, which gives an empirical researcher a lot of flexibility to incorporate ex ante information she might have. The paper also develops new methods for simulating critical values, which are based on the combination of a bootstrap procedure and new selection algorithms. These methods yield tests that have correct asymptotic size and are asymptotically nonconservative. It is also shown how to obtain an adaptive rate optimal test that has the best attainable rate of uniform consistency against models whose regression function has Lipschitz-continuous first-order derivatives and that automatically adapts to the unknown smoothness of the regression function. Simulations show that the power of the new tests in many cases significantly exceeds that of some prior tests, e.g. that of Ghosal, Sen, and Van der Vaart (2000). An application of the developed procedures to the dataset of Ellison and Ellison (2011) shows that there is some evidence of strategic entry deterrence in the pharmaceutical industry, where incumbents may use strategic investment to prevent generic entries when their patents expire.

Submitted 3 December, 2013; v1 submitted 30 December, 2012; originally announced December 2012.

Journal ref: Econom. Theory 35 (2019) 729-776

arXiv:1212.0442 [pdf, ps, other]

Some New Asymptotic Theory for Least Squares Series: Pointwise and Uniform Results

Authors: Alexandre Belloni, Victor Chernozhukov, Denis Chetverikov, Kengo Kato

Abstract: In applications it is common that the exact form of a conditional expectation is unknown and having flexible functional forms can lead to improvements. The series method offers this flexibility by approximating the unknown function based on $k$ basis functions, where $k$ is allowed to grow with the sample size $n$. We consider series estimators for the conditional mean in light of: (i) sharp LLNs for matrices derived from the noncommutative Khinchin inequalities, (ii) bounds on the Lebesgue factor that controls the ratio between the $L^\infty$ and $L_2$-norms of approximation errors, (iii) maximal inequalities for processes whose entropy integrals diverge, and (iv) strong approximations to series-type processes. These technical tools allow us to contribute to the series literature, specifically the seminal work of Newey (1997), as follows. First, we weaken the condition on the number $k$ of approximating functions used in series estimation from the typical $k^2/n \to 0$ to $k/n \to 0$, up to log factors, which was available only for spline series before. Second, we derive $L_2$ rates and pointwise central limit theorem results when the approximation error vanishes. Under an incorrectly specified model, i.e. when the approximation error does not vanish, analogous results are also shown. Third, under stronger conditions we derive uniform rates and functional central limit theorems that hold whether or not the approximation error vanishes. That is, we derive the strong approximation for the entire estimate of the nonparametric function. We derive uniform rates, Gaussian approximations, and uniform confidence bands for a wide collection of linear functionals of the conditional expectation function.

Submitted 17 June, 2015; v1 submitted 3 December, 2012; originally announced December 2012.

Journal ref: Journal of Econometrics 186 (2015) 345-366

arXiv:1201.0167 [pdf, ps, other]

Adaptive Test of Conditional Moment Inequalities

Authors: Denis Chetverikov

Abstract: In this paper, I construct a new test of conditional moment inequalities, which is based on studentized kernel estimates of moment functions with many different values of the bandwidth parameter. The test automatically adapts to the unknown smoothness of moment functions and has uniformly correct asymptotic size. The test has high power in a large class of models with conditional moment inequalities. Some existing tests have nontrivial power against $n^{-1/2}$-local alternatives in a certain class of these models whereas my method only allows for nontrivial testing against $(n/\log n)^{-1/2}$-local alternatives in this class. There exist, however, other classes of models with conditional moment inequalities where the mentioned tests have much lower power in comparison with the test developed in this paper.

Submitted 5 January, 2012; v1 submitted 30 December, 2011; originally announced January 2012.

arXiv:1105.6154 [pdf, other]

Conditional Quantile Processes based on Series or Many Regressors

Authors: Alexandre Belloni, Victor Chernozhukov, Denis Chetverikov, Iván Fernández-Val

Abstract: Quantile regression (QR) is a principal regression method for analyzing the impact of covariates on outcomes. The impact is described by the conditional quantile function and its functionals. In this paper we develop the nonparametric QR-series framework, covering many regressors as a special case, for performing inference on the entire conditional quantile function and its linear functionals. In this framework, we approximate the entire conditional quantile function by a linear combination of series terms with quantile-specific coefficients and estimate the function-valued coefficients from the data. We develop large sample theory for the QR-series coefficient process, namely we obtain uniform strong approximations to the QR-series coefficient process by conditionally pivotal and Gaussian processes. Based on these strong approximations, or couplings, we develop four resampling methods (pivotal, gradient bootstrap, Gaussian, and weighted bootstrap) that can be used for inference on the entire QR-series coefficient function. We apply these results to obtain estimation and inference methods for linear functionals of the conditional quantile function, such as the conditional quantile function itself, its partial derivatives, average partial derivatives, and conditional average partial derivatives. Specifically, we obtain uniform rates of convergence and show how to use the four resampling methods mentioned above for inference on the functionals. All of the above results are for function-valued parameters, holding uniformly in both the quantile index and the covariate value, and covering the pointwise case as a by-product. We demonstrate the practical utility of these results with an example, where we estimate the price elasticity function and test the Slutsky condition of the individual demand for gasoline, as indexed by the individual unobserved propensity for gasoline consumption.

Submitted 9 August, 2018; v1 submitted 30 May, 2011; originally announced May 2011.

Comments: 131 pages, 2 tables, 4 figures

Showing 1–33 of 33 results for author: Chetverikov, D