Block Majorization Minimization with Extrapolation and Application to $β$-NMF

Authors: Le Thi Khanh Hien, Valentin Leplat, Nicolas Gillis

Abstract: We propose a Block Majorization Minimization method with Extrapolation (BMMe) for solving a class of multi-convex optimization problems. The extrapolation parameters of BMMe are updated using a novel adaptive update rule. By showing that block majorization minimization can be reformulated as a block mirror descent method, with the Bregman divergence adaptively updated at each iteration, we establish subsequential convergence for BMMe. We use this method to design efficient algorithms to tackle nonnegative matrix factorization problems with the $β$-divergences ($β$-NMF) for $β\in [1,2]$. These algorithms, which are multiplicative updates with extrapolation, benefit from our novel results that offer convergence guarantees. We also empirically illustrate the significant acceleration of BMMe for $β$-NMF through extensive experiments.

Submitted 12 January, 2024; originally announced January 2024.

Comments: 23 pages, code available from https://github.com/vleplat/BMMe
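
For reference, the $β$-divergence mentioned in the abstract above is usually defined entrywise as sketched below. This is the standard definition from the NMF literature, not a formula quoted from the paper; the symbols $d_β$, $x$, and $y$ are notation assumed here for illustration, with $x \ge 0$ a data entry and $y > 0$ its approximation.

$$
d_β(x, y) =
\begin{cases}
\dfrac{x}{y} - \log\dfrac{x}{y} - 1, & β = 0 \quad \text{(Itakura-Saito)},\\[2ex]
x \log\dfrac{x}{y} - x + y, & β = 1 \quad \text{(Kullback-Leibler)},\\[2ex]
\dfrac{x^{β} + (β - 1)\, y^{β} - β\, x\, y^{β - 1}}{β (β - 1)}, & β \notin \{0, 1\}.
\end{cases}
$$

Setting $β = 2$ in the last case gives $\tfrac{1}{2}(x - y)^2$, so the range $β \in [1,2]$ considered in the abstract interpolates between the Kullback-Leibler divergence and the least squares error.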

arXiv:2309.08249 [pdf, other]

Deep Nonnegative Matrix Factorization with Beta Divergences

Authors: Valentin Leplat, Le Thi Khanh Hien, Akwum Onwunta, Nicolas Gillis

Abstract: Deep Nonnegative Matrix Factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse datasets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that $β$-divergences offer a more suitable alternative. In this paper, we develop new models and algorithms for deep NMF using some $β$-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.

Submitted 18 March, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 34 pages. We have improved the presentation of the paper, corrected a few typos, and added the MU for beta=1/2. Accepted in Neural Computation

arXiv:2309.00379 [pdf, other]

Anomaly detection with semi-supervised classification based on risk estimators

Authors: Le Thi Khanh Hien, Sukanya Patra, Souhaib Ben Taieb

Abstract: A significant limitation of one-class classification anomaly detection methods is their reliance on the assumption that unlabeled training data only contains normal instances. To overcome this impractical assumption, we propose two novel classification-based anomaly detection methods. Firstly, we introduce a semi-supervised shallow anomaly detection method based on an unbiased risk estimator. Secondly, we present a semi-supervised deep anomaly detection method utilizing a nonnegative (biased) risk estimator. We establish estimation error bounds and excess risk bounds for both risk minimizers. Additionally, we propose techniques to select appropriate regularization parameters that ensure the nonnegativity of the empirical risk in the shallow model under specific loss functions. Our extensive experiments provide strong evidence of the effectiveness of the risk-based anomaly detection methods.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2212.11336 [pdf, other]

An inertial ADMM for a class of nonconvex composite optimization with nonlinear coupling constraints

Authors: Le Thi Khanh Hien, Dimitri Papadimitriou

Abstract: In this paper, we propose an inertial alternating direction method of multipliers for solving a class of non-convex multi-block optimization problems with nonlinear coupling constraints. Distinctive features of our proposed method, when compared with other alternating direction methods of multipliers for solving non-convex problems with nonlinear coupling constraints, include: (i) we apply the inertial technique to the update of primal variables and (ii) we apply a non-standard update rule for the multiplier by scaling the multiplier by a factor before moving along the ascent direction where a relaxation parameter is allowed. Subsequential convergence and global convergence are presented for the proposed algorithm.

Submitted 21 February, 2024; v1 submitted 21 December, 2022; originally announced December 2022.

arXiv:2201.07657 [pdf, other]

Multiblock ADMM for nonsmooth nonconvex optimization with nonlinear coupling constraints

Authors: Le Thi Khanh Hien, Dimitri Papadimitriou

Abstract: This paper proposes a multiblock alternating direction method of multipliers for solving a class of multiblock nonsmooth nonconvex optimization problems with nonlinear coupling constraints. We employ a majorization minimization procedure in the update of each block of the primal variables. Subsequential and global convergence of the generated sequence to a critical point of the augmented Lagrangian are proved. We also establish iteration complexity and provide preliminary numerical results for the proposed algorithm.

Submitted 2 December, 2023; v1 submitted 19 January, 2022; originally announced January 2022.

arXiv:2107.04395 [pdf, other]

doi 10.1137/21M1432661

Block Alternating Bregman Majorization Minimization with Extrapolation

Authors: Le Thi Khanh Hien, Duy Nhat Phan, Nicolas Gillis, Masoud Ahookhosh, Panagiotis Patrinos

Abstract: In this paper, we consider a class of nonsmooth nonconvex optimization problems whose objective is the sum of a block relative smooth function and a proper and lower semicontinuous block separable function. Although the analysis of block proximal gradient (BPG) methods for the class of block $L$-smooth functions has been successfully extended to Bregman BPG methods that deal with the class of block relative smooth functions, accelerated Bregman BPG methods are scarce and challenging to design. Taking our inspiration from Nesterov-type acceleration and the majorization-minimization scheme, we propose a block alternating Bregman Majorization-Minimization framework with Extrapolation (BMME). We prove subsequential convergence of BMME to a first-order stationary point under mild assumptions, and study its global convergence under stronger conditions. We illustrate the effectiveness of BMME on the penalized orthogonal nonnegative matrix factorization problem.

Submitted 9 July, 2021; originally announced July 2021.

Journal ref: SIAM J. on Mathematics of Data Science 4 (1), pp. 1-25, 2022

arXiv:2102.05433 [pdf, other]

doi 10.1007/s10589-022-00394-8

A Framework of Inertial Alternating Direction Method of Multipliers for Non-Convex Non-Smooth Optimization

Authors: Le Thi Khanh Hien, Duy Nhat Phan, Nicolas Gillis

Abstract: In this paper, we propose an algorithmic framework, dubbed inertial alternating direction methods of multipliers (iADMM), for solving a class of nonconvex nonsmooth multiblock composite optimization problems with linear constraints. Our framework employs the general minimization-majorization (MM) principle to update each block of variables so as to not only unify the convergence analysis of previous ADMM methods that use specific surrogate functions in the MM step, but also lead to new efficient ADMM schemes. To the best of our knowledge, in the nonconvex nonsmooth setting, ADMM used in combination with the MM principle to update each block of variables, and ADMM combined with inertial terms for the primal variables have not been studied in the literature. Under standard assumptions, we prove the subsequential convergence and global convergence for the generated sequence of iterates. We illustrate the effectiveness of iADMM on a class of nonconvex low-rank representation problems.

Submitted 24 June, 2022; v1 submitted 10 February, 2021; originally announced February 2021.

Comments: 35 pages, several parts of the paper clarified, additional experiments on a regularized NMF problem

Journal ref: Computational Optimization and Applications 83, pp. 247-285, 2022

arXiv:2010.12133 [pdf, other]

An Inertial Block Majorization Minimization Framework for Nonsmooth Nonconvex Optimization

Authors: Le Thi Khanh Hien, Duy Nhat Phan, Nicolas Gillis

Abstract: In this paper, we introduce TITAN, a novel inerTIal block majorizaTion minimizAtioN framework for non-smooth non-convex optimization problems. To the best of our knowledge, TITAN is the first framework of block-coordinate update methods that relies on the majorization-minimization framework while embedding an inertial force in each step of the block updates. The inertial force is obtained via an extrapolation operator that subsumes heavy-ball and Nesterov-type accelerations for block proximal gradient methods as special cases. By choosing various surrogate functions, such as proximal, Lipschitz gradient, Bregman, quadratic, and composite surrogate functions, and by varying the extrapolation operator, TITAN produces a rich set of inertial block-coordinate update methods. We study sub-sequential convergence as well as global convergence for the generated sequence of TITAN. We illustrate the effectiveness of TITAN on two important machine learning problems, namely sparse non-negative matrix factorization and matrix completion.

Submitted 20 September, 2022; v1 submitted 22 October, 2020; originally announced October 2020.

Comments: 42 pages, we have clarified several aspects of the paper

Journal ref: Journal of Machine Learning Research 24 (18), pp. 1-41, 2023

arXiv:2010.01935 [pdf, other]

doi 10.1007/s10915-021-01504-0

Algorithms for Nonnegative Matrix Factorization with the Kullback-Leibler Divergence

Authors: Le Thi Khanh Hien, Nicolas Gillis

Abstract: Nonnegative matrix factorization (NMF) is a standard linear dimensionality reduction technique for nonnegative data sets. In order to measure the discrepancy between the input data and the low-rank approximation, the Kullback-Leibler (KL) divergence is one of the most widely used objective functions for NMF. It corresponds to the maximum likelihood estimator when the underlying statistics of the observed data sample follow a Poisson distribution, and KL NMF is particularly meaningful for count data sets, such as documents or images. In this paper, we first collect important properties of the KL objective function that are essential to study the convergence of KL NMF algorithms. Second, together with reviewing existing algorithms for solving KL NMF, we propose three new algorithms that guarantee the non-increasingness of the objective function. We also provide a global convergence guarantee for one of our proposed algorithms. Finally, we conduct extensive numerical experiments to provide a comprehensive picture of the performances of the KL NMF algorithms.

Submitted 17 April, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

Comments: 31 pages, Accepted in the Journal of Scientific Computing

Journal ref: Journal of Scientific Computing 87, 93, 2021
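
As a rough illustration of the kind of KL NMF algorithm the abstract above refers to, the following is a minimal sketch of the classical multiplicative updates (commonly attributed to Lee and Seung), which are known to keep the KL objective non-increasing. It is not one of the three new algorithms proposed in the paper. The function names `matmul`, `kl_divergence`, and `kl_nmf_mu`, the random initialization, and the toy data are all assumptions made for this example, and the input is assumed to be strictly positive so that no division by zero occurs.

```python
import math
import random


def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def kl_divergence(X, Y):
    """Generalized KL divergence sum(x*log(x/y) - x + y); assumes x > 0 and y > 0."""
    total = 0.0
    for x_row, y_row in zip(X, Y):
        for x, y in zip(x_row, y_row):
            total += x * math.log(x / y) - x + y
    return total


def kl_nmf_mu(X, r, iters=50, seed=0):
    """Approximate X (m x n, strictly positive) by W H with W (m x r), H (r x n).

    Uses the classical multiplicative updates for the KL objective.
    Returns W, H and the objective value after each iteration.
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    # Hypothetical initialization: strictly positive random factors.
    W = [[rng.uniform(0.1, 1.0) for _ in range(r)] for _ in range(m)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(r)]
    history = [kl_divergence(X, matmul(W, H))]
    for _ in range(iters):
        # H_kj <- H_kj * (sum_i W_ik X_ij / (WH)_ij) / (sum_i W_ik)
        WH = matmul(W, H)
        for k in range(r):
            w_col_sum = sum(W[i][k] for i in range(m))
            for j in range(n):
                ratio = sum(W[i][k] * X[i][j] / WH[i][j] for i in range(m))
                H[k][j] *= ratio / w_col_sum
        # W_ik <- W_ik * (sum_j H_kj X_ij / (WH)_ij) / (sum_j H_kj)
        WH = matmul(W, H)
        for k in range(r):
            h_row_sum = sum(H[k])
            for i in range(m):
                ratio = sum(H[k][j] * X[i][j] / WH[i][j] for j in range(n))
                W[i][k] *= ratio / h_row_sum
        history.append(kl_divergence(X, matmul(W, H)))
    return W, H, history


if __name__ == "__main__":
    # Toy strictly positive data, chosen only for illustration.
    X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.5], [3.0, 1.0, 2.0], [0.5, 2.5, 1.5]]
    W, H, history = kl_nmf_mu(X, r=2, iters=30, seed=1)
    print("initial objective:", history[0])
    print("final objective:  ", history[-1])
```

The sketch favors readability over speed; a practical implementation would use a numerical library and guard against zero entries in the data or the factors.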

arXiv:2003.03963 [pdf, ps, other]

A block inertial Bregman proximal algorithm for nonsmooth nonconvex problems with application to symmetric nonnegative matrix tri-factorization

Authors: Masoud Ahookhosh, Le Thi Khanh Hien, Nicolas Gillis, Panagiotis Patrinos

Abstract: We propose BIBPA, a block inertial Bregman proximal algorithm for minimizing the sum of a block relatively smooth function (that is, relatively smooth with respect to each block) and block separable nonsmooth nonconvex functions. We prove that the sequence generated by BIBPA subsequentially converges to critical points of the objective under standard assumptions, and globally converges when the objective function is additionally assumed to satisfy the Kurdyka-Łojasiewicz (KŁ) property. We also provide the convergence rate when the objective satisfies the Łojasiewicz inequality. We apply BIBPA to the symmetric nonnegative matrix tri-factorization (SymTriNMF) problem, where we propose kernel functions for SymTriNMF and provide closed-form solutions for subproblems of BIBPA.

Submitted 8 May, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

Comments: 18 pages

arXiv:2001.04321 [pdf, other]

doi 10.1002/nla.2373

Accelerating Block Coordinate Descent for Nonnegative Tensor Factorization

Authors: Andersen Man Shun Ang, Jeremy E. Cohen, Nicolas Gillis, Le Thi Khanh Hien

Abstract: This paper is concerned with improving the empirical convergence speed of block-coordinate descent algorithms for approximate nonnegative tensor factorization (NTF). We propose an extrapolation strategy in-between block updates, referred to as heuristic extrapolation with restarts (HER). HER significantly accelerates the empirical convergence speed of most existing block-coordinate algorithms for dense NTF, in particular for challenging computational scenarios, while requiring a negligible additional computational budget.

Submitted 20 November, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 32 pages, 24 figures

Journal ref: Numerical Linear Algebra with Applications, e2373, 2021

arXiv:1908.01402 [pdf, other]

doi 10.1007/s10589-021-00286-3

Multi-block Bregman proximal alternating linearized minimization and its application to orthogonal nonnegative matrix factorization

Authors: Masoud Ahookhosh, Le Thi Khanh Hien, Nicolas Gillis, Panagiotis Patrinos

Abstract: We introduce and analyze BPALM and A-BPALM, two multi-block proximal alternating linearized minimization algorithms using Bregman distances for solving structured nonconvex problems. The objective function is the sum of a multi-block relatively smooth function (i.e., relatively smooth by fixing all the blocks except one) and block separable (nonsmooth) nonconvex functions. It turns out that the sequences generated by our algorithms are subsequentially convergent to critical points of the objective function, while they are globally convergent under the KL inequality assumption. Moreover, the rate of convergence is analyzed for functions satisfying Łojasiewicz's gradient inequality. We apply this framework to orthogonal nonnegative matrix factorization (ONMF), which satisfies all of our assumptions and whose related subproblems are solved in closed form, and some preliminary numerical results are reported.

Submitted 14 October, 2019; v1 submitted 4 August, 2019; originally announced August 2019.

Journal ref: Computational Optimization and Applications 79, pp. 681-715, 2021

arXiv:1903.01818 [pdf, ps, other]

Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization

Authors: Le Thi Khanh Hien, Nicolas Gillis, Panagiotis Patrinos

Abstract: We propose inertial versions of block coordinate descent methods for solving non-convex non-smooth composite optimization problems. Our methods possess three main advantages compared to current state-of-the-art accelerated first-order methods: (1) they allow using two different extrapolation points to evaluate the gradients and to add the inertial force (we will empirically show that it is more efficient than using a single extrapolation point), (2) they allow randomly picking the block of variables to update, and (3) they do not require a restarting step. We prove the subsequential convergence of the generated sequence under mild assumptions, prove the global convergence under some additional assumptions, and provide convergence rates. We deploy the proposed methods to solve non-negative matrix factorization (NMF) and show that they compete favorably with the state-of-the-art NMF algorithms. Additional experiments on non-negative approximate canonical polyadic decomposition, also known as non-negative tensor factorization, are also provided.

Submitted 1 June, 2020; v1 submitted 5 March, 2019; originally announced March 2019.

arXiv:1901.10757 [pdf, other]

Distributionally Robust and Multi-Objective Nonnegative Matrix Factorization

Authors: Nicolas Gillis, Le Thi Khanh Hien, Valentin Leplat, Vincent Y. F. Tan

Abstract: Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for analyzing nonnegative data. A key aspect of NMF is the choice of the objective function that depends on the noise model (or statistics of the noise) assumed on the data. In many applications, the noise model is unknown and difficult to estimate. In this paper, we define a multi-objective NMF (MO-NMF) problem, where several objectives are combined within the same NMF model. We propose to use Lagrange duality to judiciously optimize for a set of weights to be used within the framework of the weighted-sum approach, that is, we minimize a single objective function which is a weighted sum of all the objective functions. We design a simple algorithm based on multiplicative updates to minimize this weighted sum. We show how this can be used to find distributionally robust NMF (DR-NMF) solutions, that is, solutions that minimize the largest error among all objectives, using a dual approach solved via a heuristic inspired by the Frank-Wolfe algorithm. We illustrate the effectiveness of this approach on synthetic, document and audio data sets. The results show that DR-NMF is robust to our incognizance of the noise model of the NMF problem.

Submitted 9 February, 2021; v1 submitted 30 January, 2019; originally announced January 2019.

Comments: Accepted in IEEE Trans. on Pattern Analysis and Machine Intelligence

arXiv:1802.02563 [pdf, ps, other]

A global linear and local superlinear/quadratic inexact non-interior continuation method for variational inequalities

Authors: Le Thi Khanh Hien, Chek Beng Chua

Abstract: We use the concept of barrier-based smoothing approximations introduced in [C. B. Chua and Z. Li, A barrier-based smoothing proximal point algorithm for NCPs over closed convex cones, SIOPT 23(2), 2010] to extend the non-interior continuation method proposed in [B. Chen and N. Xiu, A global linear and local quadratic noninterior continuation method for nonlinear complementarity problems based on Chen-Mangasarian smoothing functions, SIOPT 9(3), 1999] to an inexact non-interior continuation method for variational inequalities over general closed convex sets. Newton equations involved in the method are solved inexactly to deal with high-dimensional problems. The method is proved to have global linear and local superlinear/quadratic convergence under suitable assumptions. We apply the method to non-negative orthants, positive semidefinite cones, polyhedral sets, epigraphs of matrix operator norm cone and epigraphs of matrix nuclear norm cone.

Submitted 5 March, 2020; v1 submitted 7 February, 2018; originally announced February 2018.

arXiv:1711.03669 [pdf, other]

An Inexact Primal-Dual Smoothing Framework for Large-Scale Non-Bilinear Saddle Point Problems

Authors: Le Thi Khanh Hien, Renbo Zhao, William B. Haskell

Abstract: We develop an inexact primal-dual first-order smoothing framework to solve a class of non-bilinear saddle point problems with primal strong convexity. Compared with existing methods, our framework yields a significant improvement over the primal oracle complexity, while it has competitive dual oracle complexity. In addition, we consider the situation where the primal-dual coupling term has a large number of component functions. To efficiently handle this situation, we develop a randomized version of our smoothing framework, which allows the primal and dual sub-problems in each iteration to be inexactly solved by randomized algorithms in expectation. The convergence of this framework is analyzed both in expectation and with high probability. In terms of the primal and dual oracle complexities, this framework significantly improves over its deterministic counterpart. As an important application, we adapt both frameworks for solving convex optimization problems with many functional constraints. To obtain an $\varepsilon$-optimal and $\varepsilon$-feasible solution, both frameworks achieve the best-known oracle complexities.

Submitted 24 July, 2023; v1 submitted 9 November, 2017; originally announced November 2017.

arXiv:1605.06892 [pdf, other]

Accelerated Randomized Mirror Descent Algorithms For Composite Non-strongly Convex Optimization

Authors: Le Thi Khanh Hien, Cuong V. Nguyen, Huan Xu, Canyi Lu, Jiashi Feng

Abstract: We consider the problem of minimizing the sum of an average function of a large number of smooth convex components and a general, possibly non-differentiable, convex function. Although many methods have been proposed to solve this problem with the assumption that the sum is strongly convex, few methods support the non-strongly convex case. Adding a small quadratic regularization is a common device used to tackle non-strongly convex problems; however, it may cause loss of sparsity of solutions or weaken the performance of the algorithms. Avoiding this device, we propose an accelerated randomized mirror descent method for solving this problem without the strongly convex assumption. Our method extends the deterministic accelerated proximal gradient methods of Paul Tseng and can be applied even when proximal points are computed inexactly. We also propose a scheme for solving the problem when the component functions are non-smooth.

Submitted 31 December, 2018; v1 submitted 23 May, 2016; originally announced May 2016.

MSC Class: 65K05; 90C06; 90C30

Showing 1–17 of 17 results for author: Hien, L T K