×

A reduced-space algorithm for minimizing \(\ell_1\)-regularized convex functions. (English) Zbl 1369.90103

Summary: We present a new method for minimizing the sum of a differentiable convex function and an \(\ell_1\)-norm regularizer. The main features of the new method include: (i) an evolving set of indices corresponding to variables that are predicted to be nonzero at a solution (i.e., the support); (ii) a reduced-space subproblem defined in terms of the predicted support; (iii) conditions that determine how accurately each subproblem must be solved, which allow for Newton, linear conjugate gradient, and coordinate-descent techniques to be employed; (iv) a computationally practical condition that determines when the predicted support should be updated; and (v) a reduced proximal gradient step that ensures sufficient decrease in the objective function when it is decided that variables should be added to the predicted support. We prove a convergence guarantee for our method and demonstrate its efficiency on a large set of model prediction problems.

MSC:

90C06 Large-scale problems in mathematical programming
90C25 Convex programming
90C30 Nonlinear programming
90C55 Methods of successive quadratic programming type
90C90 Applications of mathematical programming
49J52 Nonsmooth analysis
49M37 Numerical methods based on nonlinear programming
62M20 Inference from stochastic processes and prediction
65K05 Numerical mathematical programming methods

Software:

CONV_QP; LBFGS-B; GPDT; TRON

References:

[1] G. Andrew and J. Gao, {\it Scalable training of \(l_1\)-regularized log-linear models}, in Proceedings of the 24th International Conference on Machine Learning, 2007, pp. 33-40.
[2] A. Beck and M. Teboulle, {\it A fast iterative shrinkage-thresholding algorithm for linear inverse problems}, SIAM J. Imaging Sci., 2 (2009), pp. 183-202. · Zbl 1175.94009
[3] R. H. Byrd, G. M. Chin, J. Nocedal, and F. Oztoprak, {\it A family of second-order methods for convex \(ℓ_1\)-regularized optimization}, Math. Program., 159 (2016), pp. 435-467. · Zbl 1350.49046
[4] R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, {\it A limited memory algorithm for bound constrained optimization}, SIAM J. Sci. Comput., 16 (1995), pp. 1190-1208. · Zbl 0836.65080
[5] R. H. Byrd, J. Nocedal, and F. Oztoprak, {\it An inexact successive quadratic approximation method for \(ℓ_1\) regularized optimization}, Math. Program., 157 (2016), pp. 375-396. · Zbl 1342.49037
[6] F. E. Curtis, Z. Han, and D. P. Robinson, {\it A globally convergent primal-dual active-set framework for large-scale convex quadratic optimization}, Comput. Oper. Appl., 60 (2015), pp. 311-341. · Zbl 1309.90072
[7] Z. Dostál, {\it Box constrained quadratic programming with proportioning and projections}, SIAM J. Optim., 7 (1997), pp. 871-887, . · Zbl 0912.65052
[8] Z. Dostál, {\it A proportioning based algorithm with rate of convergence for bound constrained quadratic programming}, Numer. Algorithms, 34 (2003), pp. 293-302, . · Zbl 1037.65061
[9] Z. Dostál and J. Schoberl, {\it Minimizing quadratic functions subject to bound constraints with the rate of convergence and finite termination}, Comput. Oper. Appl., 30 (2005), pp. 23-43, . · Zbl 1071.65085
[10] E. Elhamifar and R. Vidal, {\it Sparse subspace clustering}, in IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009, pp. 2790-2797.
[11] A. Friedlander, J. M. Martínez, and M. Raydon, {\it A new method for large-scale box constrained convex quadratic minimization problems}, Optim. Meth. Softw., 5 (1995), pp. 57-74.
[12] A. Friedlander and J. M. Martínez, {\it On the numerical solution of bound constrained optimization problems}, Revue française d’automatique, d’informatique et de recherche opérationnelle. Rech. Oper., 23 (1989), pp. 319-341. · Zbl 0683.90073
[13] A. Friedlander and J. M. Martinez, {\it On the maximization of a concave quadratic function with box constraints}, SIAM J. Optim., 4 (1994), pp. 177-192. · Zbl 0801.65058
[14] M. D. Gonzalez-Lima, W. W. Hager, and H. Zhang, {\it An affine-scaling interior-point method for continuous knapsack constraints}, SIAM J. Optim., 21 (2011), pp. 361-390. · Zbl 1218.90114
[15] W. W. Hager and H. Zhang, {\it A new active set algorithm for box constrained optimization}, SIAM J. Optim., 17 (2006), pp. 526-557. · Zbl 1165.90570
[16] C.-J. Hsieh, I. S. Dhillon, P. K. Ravikumar, and M. A. Sustik, {\it Sparse inverse covariance matrix estimation using quadratic approximation}, in Advances in Neural Information Processing Systems, 2011, pp. 2330-2338.
[17] T. Joachims, {\it Making large-scale support vector machine learning practical}, Advances in Kernel Methods: Support Vector Machines, 1999.
[18] N. Keskar, J. Nocedal, F. Öztoprak, and A. Wächter, {\it A second-order method for convex 1-regularized optimization with active-set prediction}, Optim. Meth. Softw., 31 (2016), pp. 605-621. · Zbl 1341.49039
[19] M. Ko, J. Zowe, et al., {\it An iterative two-step algorithm for linear complementarity problems}, Numer. Math., 68 (1994), pp. 95-106. · Zbl 0820.65033
[20] J. Lee, Y. Sun, and M. Saunders, {\it Proximal newton-type methods for convex optimization}, in Advances in Neural Information Processing Systems, 2012, pp. 836-844.
[21] C.-J. Lin and J. J. Moré, {\it Newton’s method for large bound-constrained optimization problems}, SIAM J. Opti., 9 (1999), pp. 1100-1127. · Zbl 0957.65064
[22] H. Mohy-ud-Din and D. P. Robinson, {\it A solver for nonconvex bound-constrained quadratic optimization}, SIAM J. Optim., 25 (2015), pp. 2385-2407, . · Zbl 1330.49029
[23] J. J. Moré and G. Toraldo, {\it On the solution of large quadratic programming problems with bound constraints}, SIAM J. Optim., 1 (1991), pp. 93-113. · Zbl 0752.90053
[24] D. P. Robinson, L. Feng, J. M. Nocedal, and J.-S. Pang, {\it Subspace accelerated matrix splitting algorithms for asymmetric and symmetric linear complementarity problems}, SIAM J. Optim., 23 (2013), pp. 1371-1397. · Zbl 1291.49023
[25] K. Scheinberg and X. Tang, {\it Practical inexact proximal quasi-newton method with global complexity analysis}, preprint (2013). · Zbl 1366.90166
[26] T. Serafini, G. Zanghirati, and L. Zanni, {\it Gradient projection methods for quadratic programs and applications in training support vector machines}, Optim. Meth. Softw., 20 (2005), pp. 353-378. · Zbl 1072.90026
[27] T. Serafini and L. Zanni, {\it On the working set selection in gradient projection-based decomposition techniques for support vector machines}, Optim. Meth. Softw., 20 (2005), pp. 583-596. · Zbl 1116.90115
[28] H. M. ud Din and D. P. Robinson, {\it A solver for nonconvex bound-constrained quadratic optimization}, SIAM J. Optim., 25 (2015), pp. 2385-2407, . · Zbl 1330.49029
[29] S. J. Wright, R. D. Nowak, and M. A. Figueiredo, {\it Sparse reconstruction by separable approximation}, IEEE Trans. Signal Process., 57 (2009), pp. 2479-2493. · Zbl 1391.94442
[30] C. Yang, D. P. Robinson, and R. Vidal, {\it Sparse subspace clustering with missing entries}, in International Conference on Machine Learning, 2015, pp. 2463-2472.
[31] C. You, C.-G. Li, D. P. Robinson, and R. Vidal, {\it Oracle based active set algorithm for scalable elastic net subspace clustering}, in IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[32] C. You, D. P. Robinson, and R. Vidal, {\it Sparse subspace clustering by orthogonal matching pursuit}, in IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[33] G.-X. Yuan, C.-H. Ho, and C.-J. Lin, {\it An improved glmnet for l \(1\)-regularized logistic regression}, J. Mach. Learn. Res., 13 (2012), pp. 1999-2030. · Zbl 1432.68404
[34] L. Zanni, {\it An improved gradient projection-based decomposition technique for support vector machines}, Comput. Manag. Sci., 3 (2006), pp. 131-145. · Zbl 1163.90693
[35] L. Zanni, T. Serafini, and G. Zanghirati, {\it Parallel software for training large scale support vector machines on multiprocessor systems}, J. Mach. Learn. Res., 7 (2006), pp. 1467-1492. · Zbl 1222.68341
[36] Q. Zhu, E. Izumchenko, A. M. Aliper, E. Makarev, K. Paz, A. A. Buzdin, A. A. Zhavoronkov, and D. Sidransky, {\it Pathway activation strength is a novel independent prognostic biomarker for cetuximab sensitivity in colorectal cancer patients}, Hum. Gen. Var., 2 (2015).
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.