
Semi-automated simultaneous predictor selection for regression-SARIMA models. (English) Zbl 1452.62502

Summary: Deciding which predictors to use plays an integral role in deriving statistical models in a wide range of applications. Motivated by the challenges of predicting events across a telecommunications network, we propose a semi-automated, joint model-fitting and predictor selection procedure for linear regression models. Our approach models and accounts for serial correlation in the regression residuals, produces sparse and interpretable models, and can jointly select models for a group of related responses. This is achieved by fitting linear models under constraints on the number of nonzero coefficients, using a generalisation of a recently developed mixed integer quadratic optimisation approach. The resulting models achieve better predictive performance on the motivating telecommunications data than the methods currently used by industry.
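For orientation, the cardinality-constrained least-squares problem underlying this class of methods can be written as a mixed integer quadratic programme in the style of Bertsimas et al. [5]. The sketch below shows only this core single-response formulation, with binary indicators \(z_j\), a sparsity budget \(k\) and a big-M bound \(M\) (both assumed tuning quantities); the paper's generalisation to SARIMA errors and grouped responses is not reproduced here.
\[
\min_{\beta \in \mathbb{R}^p,\; z \in \{0,1\}^p} \; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2
\quad \text{subject to} \quad
\lvert \beta_j \rvert \le M z_j \;\; (j = 1,\dots,p), \qquad \sum_{j=1}^{p} z_j \le k.
\]
Solving this programme yields least-squares estimates in which at most \(k\) coefficients are nonzero, which is the mechanism by which the procedure produces sparse, interpretable models.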

MSC:

62J05 Linear regression; mixed models
62H12 Estimation in multivariate analysis
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)

References:

[1] Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) 2nd International Symposium on Information Theory, pp. 267-281. Akadémiai Kiadó, Budapest (1973) · Zbl 0283.62006
[2] Beale, EML, Note on procedures for variable selection in multiple regression, Technometrics, 12, 4, 909-914 (1970)
[3] Berk, KN, Comparing subset regression procedures, Technometrics, 20, 1, 1-6 (1978) · Zbl 0371.62095
[4] Bertsimas, D.; King, A., OR forum-an algorithmic approach to linear regression, Oper. Res., 64, 1, 2-16 (2016) · Zbl 1338.90272
[5] Bertsimas, D.; King, A.; Mazumder, R., Best subset selection via a modern optimisation lens, Ann. Stat., 44, 813-852 (2016) · Zbl 1335.62115
[6] Breiman, L.; Friedman, JH, Predicting multivariate responses in multiple linear regression, J. R. Stat. Soc. B, 59, 1, 3-54 (1997) · Zbl 0897.62068
[7] Brockwell, PJ; Davis, RA, Introduction to Time Series and Forecasting (2002), Berlin: Springer, Berlin · Zbl 0994.62085
[8] Caruana, R., Multitask learning, Mach. Learn., 28, 1, 41-75 (1997)
[9] Cochrane, D.; Orcutt, GH, Application of least squares regression to relationships containing auto-correlated error terms, J. Am. Stat. Assoc., 44, 245, 32-61 (1949) · Zbl 0033.08201
[10] Duong, L., Cohn, T., Bird, S., Cook, P.: Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp. 845-850. Association for Computational Linguistics, Beijing (2015). 10.3115/v1/P15-2139, https://www.aclweb.org/anthology/P15-2139
[11] Gurobi Optimization, LLC: Gurobi optimizer reference manual (2019). http://www.gurobi.com
[12] Hastie, T., Tibshirani, R., Tibshirani, R.J.: Extended comparisons of best subset selection, forward stepwise selection, and the lasso (2017). arXiv Preprint arXiv:1707.08692
[13] Hastie, T.; Tibshirani, R.; Friedman, J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics (2008), New York: Springer, New York
[14] Hazimeh, H., Mazumder, R.: Fast best subset selection: coordinate descent and local combinatorial optimization algorithms (2018). arXiv preprint arXiv:1803.01454
[15] Hocking, RR, A biometrics invited paper: the analysis and selection of variables in linear regression, Biometrics, 32, 1, 1-49 (1976) · Zbl 0328.62042
[16] Hoerl, AE; Kennard, RW, Ridge regression: biased estimation for nonorthogonal problems, Technometrics, 12, 1, 55-67 (1970) · Zbl 0202.17205
[17] Hyndman, RJ; Khandakar, Y., Automatic time series forecasting: the forecast package for R, J. Stat. Softw., 27, 3, 1-22 (2008)
[18] Izenman, AJ, Reduced-rank regression for the multivariate linear model, J. Multivar. Anal., 5, 248-264 (1975) · Zbl 0313.62042
[19] Jordan, MI; Mitchell, TM, Machine learning: trends, perspectives and prospects, Science, 349, 6245, 255-260 (2015) · Zbl 1355.68227
[20] Katal, A., Wazid, M., Goudar, R.H.: Big data: Issues, challenges, tools and good practices. In: Parashar, M., Zomaya, A., Chen, J., Cao, J.N., Bouvry, P., Prasad, S. (eds.) 2013 Sixth International Conference on Contemporary Computing (IC3). Jaypee Institute of Information Technology, IEEE (2013)
[21] Kronqvist, J.; Bernal, DE; Lundell, A.; Grossmann, IE, A review and comparison of solvers for convex MINLP, Optim. Eng., 20, 397-455 (2019)
[22] Lowther, A.P.: Multivariate response predictor selection methods: with applications to telecommunications time series data. PhD thesis, Department of Mathematics and Statistics, Lancaster University, UK (2019). https://eprints.lancs.ac.uk/id/eprint/141405/1/2019lowtherphd.pdf
[23] Mantel, N., Why stepdown procedures in variable selection, Technometrics, 12, 3, 621-625 (1970)
[24] Mazumder, R., Radchenko, P., Dedieu, A.: Subset selection with shrinkage: sparse linear modeling when the SNR is low (2017). arXiv preprint arXiv:1708.03288
[25] Miller, AJ, Subset Selection in Regression. Monographs on Statistics and Applied Probability (2002), Boca Raton: Chapman and Hall CRC, Boca Raton · Zbl 1051.62060
[26] Provost, F.; Fawcett, T., Data science and its relationship to big data and data-driven decision making, Big Data, 1, 1, 52-59 (2013)
[27] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2019). https://www.R-project.org/
[28] Rao, CR; Toutenburg, H., Linear Models: Least Squares and Alternatives (1999), Berlin: Springer, Berlin · Zbl 0943.62062
[29] Rawlings, JO; Pantula, SG; Dickey, DA, Applied Regression Analysis: A Research Tool (1998), Berlin: Springer, Berlin · Zbl 0909.62062
[30] Reinsel, GC; Velu, R., Multivariate Reduced-Rank Regression: Theory and Applications (2013), Berlin: Springer, Berlin
[31] Schwarz, G., Estimating the dimension of a model, Ann. Stat., 6, 2, 461-464 (1978) · Zbl 0379.62005
[32] Similä, T.; Tikka, J., Input selection and shrinkage in multiresponse linear regression, Comput. Stat. Data Anal., 52, 406-422 (2007) · Zbl 1452.62513
[33] Simon, N., Friedman, J., Hastie, T.: A blockwise descent algorithm for group-penalized multiresponse and multinomial regression (2013). arXiv Preprint arXiv:1311.6529v1
[34] Soltysik, RC; Yarnold, PR, Two-group multiODA: a mixed-integer linear programming solution with bounded M, Optim. Data Anal., 1, 30-37 (2010)
[35] Srivastava, MS; Solanky, TKS, Predicting multivariate response in linear regression model, Commun. Stat. Simul. Comput., 32, 2, 389-409 (2003) · Zbl 1075.62586
[36] Stone, M., Cross-validatory choice and assessment of statistical predictions, J. R. Stat. Soc. B, 36, 111-147 (1974) · Zbl 0308.62063
[37] Stroud, JR; Müller, P.; Sansó, B., Dynamic models for spatiotemporal data, J. R. Stat. Soc. B, 63, 4, 673-689 (2001) · Zbl 0986.62074
[38] Tibshirani, R., Regression shrinkage and selection via the LASSO, J. R. Stat. Soc. B, 58, 1, 267-288 (1996) · Zbl 0850.62538
[39] Turlach, BA; Venables, WN; Wright, SJ, Simultaneous variable selection, Technometrics, 47, 3, 349-363 (2005)
[40] Xie, W., Deng, X.: The CCP selector: scalable algorithms for sparse ridge regression from chance-constrained programming (2018). arXiv preprint arXiv:1806.03756
[41] Yuan, M.; Lin, Y., Model selection and estimation in regression with grouped variables, J. R. Stat. Soc. B, 68, 1, 49-67 (2006) · Zbl 1141.62030
[42] Zou, H.; Hastie, T., Regularization and variable selection via the elastic net, J. R. Stat. Soc. B, 67, 301-320 (2005) · Zbl 1069.62054
[43] Zou, H., Hastie, T.: elasticnet: elastic-net for sparse estimation and sparse PCA. R package version 1.1.1 (2018). https://CRAN.R-project.org/package=elasticnet