Generalized conjugate gradient methods for \(\ell_1\) regularized convex quadratic programming with finite convergence. (English) Zbl 1432.90101

Summary: The conjugate gradient (CG) method is an efficient iterative method for solving large-scale strongly convex quadratic programming (QP). In this paper, we propose some generalized CG (GCG) methods for solving the \(\ell_1\)-regularized (possibly not strongly) convex QP that terminate at an optimal solution in a finite number of iterations. At each iteration, our methods first identify a face of an orthant and then either perform an exact line search along the direction of the negative projected minimum-norm subgradient of the objective function or execute a CG subroutine that conducts a sequence of CG iterations until a CG iterate crosses the boundary of this face or an approximate minimizer of the objective function over this face or a subface is found. We determine which type of step should be taken by comparing the magnitude of some components of the minimum-norm subgradient of the objective function to that of its remaining components. Our analysis of the finite convergence of these methods makes use of an error bound result and some key properties of the aforementioned exact line search and the CG subroutine. We also show that the proposed methods are capable of finding an approximate solution of the problem by allowing some inexactness in the execution of the CG subroutine. The overall arithmetic operation cost of our GCG methods for finding an \(\varepsilon \)-optimal solution depends on \(\varepsilon\) as \(O(\log (1/ \varepsilon))\), which is superior to the accelerated proximal gradient method [A. Beck and M. Teboulle, SIAM J. Imaging Sci. 2, No. 1, 183–202 (2009; Zbl 1175.94009); Yu. Nesterov, Math. Program. 140, No. 1 (B), 125–161 (2013; Zbl 1287.90067)], whose cost depends on \(\varepsilon\) as \(O(1/\sqrt{\varepsilon})\). In addition, our GCG methods can be extended straightforwardly to solve box-constrained convex QP with finite convergence. Numerical results demonstrate that our methods are very favorable for solving ill-conditioned problems.
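The face-based strategy described above can be illustrated with a minimal sketch. This is not the authors' exact algorithm, only a hedged reconstruction of its two building blocks for \(f(x) = \tfrac12 x^\top Qx + b^\top x + \lambda\|x\|_1\): a minimum-norm subgradient, an exact line search along its negative on the first smooth piece of \(f\), and a CG subroutine on the current face (support and signs of \(x\) held fixed) that truncates where an iterate would cross the face boundary. The branching rule (comparing subgradient components on and off the support) and all tolerances are illustrative assumptions.

```python
import numpy as np

def min_norm_subgrad(Q, b, lam, x):
    """Minimum-norm subgradient of f(x) = 0.5 x'Qx + b'x + lam*||x||_1."""
    v = Q @ x + b
    g = v + lam * np.sign(x)
    zero = x == 0
    # At zero components the subdifferential is v_i + lam*[-1, 1];
    # its minimum-norm element is the soft-thresholded value.
    g[zero] = np.sign(v[zero]) * np.maximum(np.abs(v[zero]) - lam, 0.0)
    return g

def face_cg(Qs, c, xs, max_cg):
    """CG on the smooth quadratic 0.5 y'Qs y + c'y over a face, truncated
    where an iterate would cross the face boundary (a sign flip in y)."""
    r = Qs @ xs + c
    d = -r
    for _ in range(max_cg):
        Qd = Qs @ d
        dQd = d @ Qd
        if dQd <= 1e-16:
            break
        alpha = (r @ r) / dQd
        toward_zero = xs * d < 0
        a_max = np.min(-xs[toward_zero] / d[toward_zero]) if toward_zero.any() else np.inf
        if alpha >= a_max:                 # iterate would leave the face
            return xs + a_max * d, True
        xs = xs + alpha * d
        r_new = r + alpha * Qd
        if np.linalg.norm(r_new) <= 1e-12:
            break
        d = -r_new + ((r_new @ r_new) / (r @ r)) * d
        r = r_new
    return xs, False

def gcg_l1_qp(Q, b, lam, tol=1e-8, max_iter=500):
    """Illustrative face-based CG sketch for min 0.5 x'Qx + b'x + lam*||x||_1."""
    x = np.zeros(len(b))
    for _ in range(max_iter):
        g = min_norm_subgrad(Q, b, lam, x)
        if np.linalg.norm(g) <= tol:
            break
        supp = x != 0
        off = np.linalg.norm(g[~supp]) if (~supp).any() else 0.0
        on = np.linalg.norm(g[supp]) if supp.any() else 0.0
        if not supp.any() or off >= on:
            # Exact line search along d = -g: f(x + t*d) is a smooth quadratic
            # in t up to the first breakpoint t_max, minimized at t_star.
            d = -g
            t_star = (g @ g) / (d @ (Q @ d))
            hits = x * d < 0               # components heading toward zero
            t_max = np.min(-x[hits] / d[hits]) if hits.any() else np.inf
            x = x + min(t_star, t_max) * d
        else:
            # CG subroutine on the current face (signs of x fixed there).
            S = np.flatnonzero(supp)
            c = b[S] + lam * np.sign(x[S])
            x[S], _ = face_cg(Q[np.ix_(S, S)], c, x[S], len(S) + 2)
        x[np.abs(x) < 1e-12] = 0.0         # snap boundary hits exactly to zero
    return x
```

Since each inner CG pass solves the face-restricted quadratic exactly (in exact arithmetic within \(|S|\) steps), the outer loop only alternates between enlarging the support via the line search and re-optimizing or shrinking it via CG, which is the mechanism behind the finite-termination argument in the paper.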

MSC:

90C20 Quadratic programming
65K05 Numerical mathematical programming methods
65Y20 Complexity and performance of numerical algorithms
62-08 Computational methods for problems pertaining to statistics
90C06 Large-scale problems in mathematical programming
90C25 Convex programming
11-06 Proceedings, conferences, collections, etc. pertaining to number theory

References:

[1] Afonso MV, Bioucas-Dias JM, Figueiredo MAT (2010) Fast image recovery using variable splitting and constrained optimization. IEEE T. Image Proces. 19(9):2345-2356. · Zbl 1371.94018
[2] Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183-202. · Zbl 1175.94009 · doi:10.1137/080716542
[3] Becker SR, Candès EJ, Grant MC (2011) Templates for convex cone problems with applications to sparse signal recovery. Math. Prog. Comput. 3(3):165-218. · Zbl 1257.90042 · doi:10.1007/s12532-011-0029-5
[4] Bioucas-Dias JM, Figueiredo MAT (2007) A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE T. Image Proces. 16(12):2992-3004.
[5] Bruckstein AM, Donoho DL, Elad M (2009) From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Rev. 51(1):34-81. · Zbl 1178.68619 · doi:10.1137/060657704
[6] Byrd RH, Nocedal J, Solntsev S (2015) An algorithm for quadratic ℓ1-regularized optimization with a flexible active-set strategy. Optim. Method Softw. 30(6):1213-1237. · Zbl 1328.90097 · doi:10.1080/10556788.2015.1028062
[7] Candès E, Romberg J (2005) ℓ1-magic: Recovery of sparse signals via convex programming. User’s Guide, Applied and Computational Mathematics (California Institute of Technology, Pasadena, CA).
[8] Chen S, Donoho D, Saunders M (1998) Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1):33-61. · Zbl 0919.94002 · doi:10.1137/S1064827596304010
[9] Deng W, Yin W, Zhang Y (2011) Group sparse optimization by alternating direction method. CAAM Technical Report TR11-06, Rice University, Houston, TX.
[10] Donoho DL (1995) De-noising by soft-thresholding. IEEE Trans. Inform. Theory 41(3):613-627. · Zbl 0820.62002 · doi:10.1109/18.382009
[11] Dostal Z (1997) Box constrained quadratic programming with proportioning and projections. SIAM J. Optim. 7(3):871-887. · Zbl 0912.65052 · doi:10.1137/S1052623494266250
[12] Dostal Z, Schöberl J (2005) Minimizing quadratic functions subject to bound constraints with the rate of convergence and finite termination. Comp. Optim. Appl. 30(1):23-43. · Zbl 1071.65085 · doi:10.1023/B:COAP.0000049888.80264.25
[13] Fountoulakis K, Gondzio J (2016) A second-order method for strongly-convex ℓ1-regularization problems. Math. Prog. 156(1):189-219. · Zbl 1364.90255 · doi:10.1007/s10107-015-0875-4
[14] Goldstein T, Osher S (2009) The split Bregman method for ℓ1-regularized problems. SIAM J. Imaging Sci. 2(2):323-343. · Zbl 1177.65088 · doi:10.1137/080725891
[15] Hale ET, Yin W, Zhang Y (2010) Fixed-point continuation applied to compressed sensing: Implementation and numerical experiments. J. Comput. Math. 28(2):170-194. · Zbl 1224.65153
[16] Hastie T, Tibshirani R, Friedman JH (2001) The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, New York). · Zbl 0973.62007 · doi:10.1007/978-0-387-21606-5
[17] Hayami K (2001) On the behavior of the conjugate residual method for singular systems. NII Technical Report, National Institute of Informatics, Tokyo.
[18] Kaasschieter EF (1988) Preconditioned conjugate gradients for solving singular systems. J. Comput. Appl. Math. 24(1-2):265-275. · Zbl 0659.65031 · doi:10.1016/0377-0427(88)90358-5
[19] Kammerer WJ, Nashed MZ (1972) On the convergence of the conjugate gradient method for singular linear operator equations. SIAM J. Numer. Anal. 9(1):165-181. · Zbl 0243.65026 · doi:10.1137/0709016
[20] Kim S-J, Koh K, Lustig M, Boyd S, Gorinevsky D (2007) An interior-point method for large-scale ℓ1-regularized least squares. IEEE J. Sel. Top. Signal Proces. 1(4):606-617. · doi:10.1109/JSTSP.2007.910971
[21] Li W (1995) Error bounds for piecewise convex quadratic programs and applications. SIAM J. Control Optim. 33(5):1510-1529. · Zbl 0836.90125 · doi:10.1137/S0363012993243022
[22] Milzarek A, Ulbrich M (2014) A semismooth Newton method with multidimensional filter globalization for l1-optimization. SIAM J. Optim. 24(1):298-333. · Zbl 1295.49022 · doi:10.1137/120892167
[23] Nesterov Y (2013) Gradient methods for minimizing composite functions. Math. Program. 140(1):125-161. · Zbl 1287.90067 · doi:10.1007/s10107-012-0629-5
[24] Nocedal J, Wright SJ (2006) Numerical Optimization, 2nd ed. (Springer, New York). · Zbl 1104.65059
[25] O’Leary DP (1980) A generalized conjugate gradient algorithm for solving a class of quadratic programming problems. Linear Alg. Appl. 34:371-399. · Zbl 0464.65039 · doi:10.1016/0024-3795(80)90173-1
[26] Robinson SM (1981) Some continuity properties of polyhedral multifunctions. Math. Program. Stud. 14:206-214. · Zbl 0449.90090 · doi:10.1007/BFb0120929
[27] Schmidt M (2010) Graphical Model Structure Learning with ℓ1-Regularization. PhD thesis, University of British Columbia, Vancouver.
[28] Sra S, Nowozin S, Wright SJ (2011) Optimization for Machine Learning (MIT Press, Cambridge, MA). · doi:10.7551/mitpress/8996.001.0001
[29] Steihaug T (1983) The conjugate gradient method and trust regions in large scale optimization. SIAM J. Numer. Anal. 20(3):626-637. · Zbl 0518.65042 · doi:10.1137/0720042
[30] Toint PL (1981) Towards an efficient sparsity exploiting Newton method for minimization. Duff IS, ed., Sparse Matrices and Their Uses (Academic Press, London), 57-88. · Zbl 0463.65045
[31] Wright SJ, Nowak R, Figueiredo MAT (2009) Sparse reconstruction by separable approximation. IEEE T. Signal Proces. 57(7):2479-2493. · Zbl 1391.94442 · doi:10.1109/TSP.2009.2016892
[32] Xiao L, Zhang T (2013) A proximal-gradient homotopy method for the sparse least-squares problem. SIAM J. Optim. 23(2):1062-1091. · Zbl 1280.65057 · doi:10.1137/120869997
[33] Yang J, Zhang Y (2010) Alternating direction algorithm for ℓ1-problems in compressive sensing. SIAM J. Sci. Comput. 33(1):250-278. · Zbl 1256.65060 · doi:10.1137/090777761
[34] Yun S, Toh K-C (2011) A coordinate gradient descent method for ℓ1-regularized convex minimization. Comput. Optim. Appl. 48(2):273-307. · Zbl 1220.90092 · doi:10.1007/s10589-009-9251-8