A constrained minimum method for model selection. (English) Zbl 07851327

Summary: We propose a constrained minimum method for converting a hypothesis test into a model selection criterion that pursues consistency and sparsity of the selected model explicitly. The method achieves consistency by letting the significance level of the test go to zero at a certain speed depending on the sample size. It maximizes the sparsity by choosing the most sparse model among models not rejected by the test. The method may be used for model selection whenever a hypothesis test on the model parameter vector is available. We illustrate this method through its application to the best subset selection of linear models. Numerical comparisons with existing methods show that it has excellent accuracy and its selected model converges to the true model faster than the model chosen by the Bayesian information criterion.
{© 2021 John Wiley & Sons, Ltd.}
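
The selection rule described in the summary can be sketched generically as follows. This is a minimal illustration, not the authors' implementation: the names `significance_level`, `all_subsets`, and `constrained_minimum`, the schedule alpha_n = n^(-1/2), and the tie-break rule are all assumptions introduced here, and the p-values in the usage lines are made-up placeholders rather than results from the paper. In practice `p_value` would be supplied by whatever hypothesis test on the parameter vector is available.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, Optional

Model = FrozenSet[int]  # a model = set of indices of the included predictors


def significance_level(n: int, rate: float = 0.5) -> float:
    """Hypothetical schedule alpha_n = n**(-rate), which tends to 0 as n grows.

    The summary only says the level must vanish at a certain speed;
    this particular rate is an assumption made for illustration.
    """
    return n ** (-rate)


def all_subsets(p: int) -> Iterable[Model]:
    """Enumerate every subset of p predictors (the best subset search space)."""
    for k in range(p + 1):
        for idx in combinations(range(p), k):
            yield frozenset(idx)


def constrained_minimum(
    candidates: Iterable[Model],
    p_value: Callable[[Model], float],
    alpha: float,
) -> Optional[Model]:
    """Return the sparsest candidate that the test does not reject.

    A candidate is kept when p_value(model) > alpha. Ties in size are
    broken by the larger p-value (an assumed tie-break rule).
    Returns None if every candidate is rejected.
    """
    best, best_key = None, None
    for model in candidates:
        pv = p_value(model)
        if pv <= alpha:
            continue  # rejected by the test at level alpha
        key = (len(model), -pv)
        if best_key is None or key < best_key:
            best, best_key = model, key
    return best


# Made-up p-values for p = 2 predictors, purely to exercise the function.
toy_p_values = {
    frozenset(): 0.0001,
    frozenset({0}): 0.40,
    frozenset({1}): 0.002,
    frozenset({0, 1}): 1.0,
}
alpha = significance_level(100)  # level used for a sample of size 100
chosen = constrained_minimum(all_subsets(2), toy_p_values.get, alpha)
print(sorted(chosen))  # [0]
```

With these placeholder values, the empty model and the model containing only predictor 1 are rejected, so the single-predictor model {0} is returned ahead of the full model because it is the sparsest model that survives the test.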

MSC:

62-XX Statistics

Software:

ElemStatLearn; leaps

References:

[1] Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716-723. · Zbl 0314.62039
[2] Bertsimas, D., King, A., & Mazumder, R. (2016). Best subset selection via a modern optimization lens. Annals of Statistics, 44, 813-852. · Zbl 1335.62115
[3] Ding, J., Tarokh, V., & Yang, Y. (2018). Model selection techniques: An overview. IEEE Signal Processing Magazine, 35, 16-34.
[4] Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer Verlag. · Zbl 1273.62005
[5] Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical learning with sparsity (2nd ed.). CRC Press. · Zbl 1319.68003
[6] Hodges, J. L., & Lehmann, E. L. (1983). Hodges-Lehmann estimators. Encyclopedia of Statistical Sciences, 3, 463-465.
[7] Kadane, J. B., & Lazar, N. A. (2004). Methods and criteria for model selection. Journal of the American Statistical Association, 99, 279-290. · Zbl 1089.62501
[8] Knight, K., & Fu, W. (2000). Asymptotics for lasso‐type estimators. Annals of Statistics, 28, 1356-1378. · Zbl 1105.62357
[9] Lumley, T. (2020). R package ‘leaps’. Available at https://cran.r-project.org
[10] Mallows, C. L. (1973). Some comments on C_p. Technometrics, 15, 661-675. · Zbl 0269.62061
[11] Miller, A. J. (1990). Subset selection in regression. Chapman and Hall. · Zbl 0702.62057
[12] Schwarz, G. E. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464. · Zbl 0379.62005
[13] Stamey, T., Kabalin, J., McNeal, J., Johnstone, I., Freiha, F., Redwine, E., & Yang, N. (1989). Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate. II. Radical prostatectomy treated patients. Journal of Urology, 16, 1076-1083.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.