
Out-of-sample error estimation for M-estimators with convex penalty. (English) Zbl 07858900

Summary: A generic out-of-sample error estimate is proposed for \(M\)-estimators regularized with a convex penalty in high-dimensional linear regression, where \((\boldsymbol{X}, \boldsymbol{y})\) is observed and the dimension \(p\) and sample size \(n\) are of the same order. The out-of-sample error estimate enjoys a relative error of order \(n^{-1/2}\) in a linear model with Gaussian covariates and independent noise, either non-asymptotically when \(p/n \leq \gamma\) or asymptotically in the high-dimensional regime \(p/n\to\gamma^\prime\in(0, \infty)\). General differentiable loss functions \(\rho\) are allowed provided that the derivative of the loss is \(1\)-Lipschitz; this includes the least-squares loss as well as robust losses such as the Huber loss and its smoothed versions. The validity of the out-of-sample error estimate holds either under a strong convexity assumption, or for the \(\ell_1\)-penalized Huber \(M\)-estimator and the Lasso under a sparsity assumption and a bound on the number of contaminated observations. For the square loss and in the absence of corruption in the response, the results additionally yield \(n^{-1/2}\)-consistent estimates of the noise variance and of the generalization error. This generalizes, to arbitrary convex penalties and arbitrary covariance structures, estimates that were previously known for the Lasso.
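To make the Lasso special case concrete, the following sketch illustrates a degrees-of-freedom adjusted risk estimate of the kind previously known for the Lasso with isotropic Gaussian design, which the result summarized above extends to arbitrary convex penalties and covariances. The data-generating parameters, the regularization level `alpha`, and the specific adjusted-residual formula \(\hat{r}^2 = \|\boldsymbol{y} - \boldsymbol{X}\hat{\boldsymbol{\beta}}\|_2^2 / \big(n(1-\hat{\mathsf{df}}/n)^2\big)\) with \(\hat{\mathsf{df}} = \|\hat{\boldsymbol{\beta}}\|_0\) are illustrative assumptions, not the paper's general construction.

```python
# Minimal sketch (assumed form of the previously known Lasso-specific estimate,
# not the paper's general construction for arbitrary convex penalties).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s, sigma = 1000, 500, 20, 1.0            # sample size, dimension, sparsity, noise level

X = rng.standard_normal((n, p))                # isotropic Gaussian covariates
beta = np.zeros(p)
beta[:s] = 1.0                                 # s-sparse signal
y = X @ beta + sigma * rng.standard_normal(n)  # linear model with independent noise

lasso = Lasso(alpha=0.1, fit_intercept=False)  # alpha chosen for illustration only
lasso.fit(X, y)
beta_hat = lasso.coef_

residual_mse = np.sum((y - X @ beta_hat) ** 2) / n       # in-sample (training) error
df_hat = np.count_nonzero(beta_hat)                      # degrees of freedom of the Lasso
risk_estimate = residual_mse / (1.0 - df_hat / n) ** 2   # df-adjusted out-of-sample estimate

# Out-of-sample error at a fresh observation when Sigma = I: sigma^2 + ||beta_hat - beta||^2
true_risk = sigma ** 2 + np.sum((beta_hat - beta) ** 2)

print(f"in-sample MSE:        {residual_mse:.3f}")
print(f"df-adjusted estimate: {risk_estimate:.3f}")
print(f"true out-of-sample:   {true_risk:.3f}")
```

The naive in-sample mean squared error underestimates the out-of-sample error when \(p/n\) is not small; the \((1-\hat{\mathsf{df}}/n)^{-2}\) adjustment corrects for this, and the summarized results establish analogous corrections for general convex penalties, general covariance, and robust losses.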

MSC:

62Hxx Multivariate analysis