
Subspace Newton method for sparse group \(\ell_0\) optimization problem. (English) Zbl 07914899

Summary: This paper investigates sparse optimization problems with a sparse group structure, in which element-level and group-level sparsity are taken into account jointly. This model has proved effective in tasks such as feature selection, parameter estimation, and improving model interpretability. Central to the study is the \(\ell_0\) and \(\ell_{2,0}\) norm regularization model, which, compared with surrogate formulations, poses substantial computational challenges. We first analyze the optimality conditions of the sparse group optimization problem via the notion of a \(\gamma\)-stationary point, and we establish its relation to local and global minimizers. We then develop a novel subspace Newton algorithm for the sparse group \(\ell_0\) optimization problem and prove its global convergence as well as a local second-order convergence rate. Experimental results show that the algorithm outperforms several state-of-the-art solvers in both accuracy and computational speed.
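To fix notation, a plausible form of the sparse group \(\ell_0\) model described in the summary is sketched below; the weights \(\lambda_1, \lambda_2\) and the group partition \(\mathcal{G}\) are illustrative assumptions, not the paper's exact formulation:
\[
\min_{x \in \mathbb{R}^n} \; f(x) + \lambda_1 \|x\|_0 + \lambda_2 \|x\|_{2,0},
\qquad
\|x\|_0 := \#\{\, i : x_i \neq 0 \,\},
\quad
\|x\|_{2,0} := \#\{\, g \in \mathcal{G} : \|x_g\|_2 \neq 0 \,\},
\]
where \(f\) is a smooth loss, \(\mathcal{G}\) is a partition of \(\{1, \dots, n\}\) into groups, and \(x_g\) denotes the subvector of \(x\) indexed by group \(g\). The \(\ell_0\) term penalizes the number of nonzero entries and the \(\ell_{2,0}\) term the number of active groups; a subspace Newton method then restricts (regularized) Newton steps to the support identified at a \(\gamma\)-stationary point, which keeps each iteration a low-dimensional smooth problem.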

MSC:

90C26 Nonconvex programming, global optimization
90C46 Optimality conditions and duality in mathematical programming
49M15 Newton-type methods
Full Text: DOI
