×

A unifying framework and comparison of algorithms for non-negative matrix factorisation. (English) Zbl 07776777

Summary: Non-negative matrix factorisation (NMF) is an increasingly popular unsupervised learning method. However, parameter estimation in the NMF model is a difficult high-dimensional optimisation problem. We consider algorithms of the alternating least squares type. Solutions to the least squares problem fall in two categories. The first category is iterative algorithms, which include algorithms such as the majorise-minimise (MM) algorithm, coordinate descent, gradient descent and the Févotte-Cemgil expectation-maximisation (FC-EM) algorithm. We introduce a new family of iterative updates based on a generalisation of the FC-EM algorithm. The coordinate descent, gradient descent and FC-EM algorithms are special cases of this new EM family of iterative procedures. Curiously, we show that the MM algorithm is never a member of our general EM algorithm. The second category is based on cone projection. We describe and prove a cone projection algorithm tailored to the non-negative least square problem. We compare the algorithms on a test case and on the problem of identifying mutational signatures in human cancer. We generally find that cone projection is an attractive choice. Furthermore, in the cancer application, we find that a mix-and-match strategy performs better than running each algorithm in isolation.

MSC:

90Cxx Mathematical programming
62-XX Statistics
62Fxx Parametric inference
Full Text: DOI

References:

[1] Alexandrov, L.B., Nik‐Zainal, S., Wedge, D.C., Campbell, P.J.J. & Stratton, M.R.2013. Deciphering signatures of mutational processes operative in human cancer. Cell Reports, 3, 246-259.
[2] Baez‐Ortega, A. & Gori, K.2017. Computational approaches for discovery of mutational signatures in cancer. Briefings in Bioinf., 1, 1-12.
[3] Dempster, A.P., Laird, N.M. & Rubin, D.B.1977. Maximum likelihood from incomplete data via the em algorithm. J. R. Stat. Soc., Ser. B, 39, 1-38. · Zbl 0364.62022
[4] Drton, M.2004. Maximum Likelihood Estimation in Gaussian AMP Chain Graph Models and Gaussian Ancestral Graph Models.PhD. thesis, University of Washington, Department of Statistics.
[5] Févotte, C., Bertin, N. & Durrieu, J.‐L.2009. Nonnegative matrix factorization with the Itakura‐Saito divergence: with applications to music analysis. Neural Comput., 21, 793-830. · Zbl 1156.94306
[6] Févotte, C. & Cemgil, A.T.2009. Nonnegative matrix factorization as probabilistic inference in composite models. In Proc. 17th European Signal Processing Conference (EUSIPCO), pp. 1913-1917: Glasgow, Scotland.
[7] Hunter, D.R. & Lange, K.2004. A tutorial on MM algorithms. The Am. Stat., 58, 30-37.
[8] Lange, K., Chi, E.C. & Zhou, H.2014. A brief survey of modern optimization for statisticians. Int. Stat. Rev., 82, 46-70. · Zbl 1418.90007
[9] Lauritzen, S.1996. Graphical Models.Clarendon Press: Oxford, UK. · Zbl 0907.62001
[10] Lawson, C.L. & Hanson, R.J.1974. Solving Least Squares Problems. Prentice Hall: Englewood Cliffs NJ. · Zbl 0860.65028
[11] Lee, D. & Seung, H.1999. Learning the parts of objects by non‐negative matrix factorization. Nature, 401(6755), 788-791. · Zbl 1369.68285
[12] Lee, D. & Seung, H.2000. Algorithms for non‐negative matrix factorization. In Proceedings of the 13th International Conference on Neural Information Processing Systems, MIT Press Cambridge: Denver, CO.
[13] Meyer, M.C.2013. A simple new algorithm for quadratic programming with applications in statistics. Commun. Stat., Simul. Comput., 42(5), 1126-1139. · Zbl 1347.62013
[14] Mollun, K.M. & vanStokkum, I.H.M.2015. Package nnls. https://cran.r-project.org/web/packages/nnls/nnls.pdf
[15] Nesterov, Y.E.1983. A method of solving a convex programming problem with convergence rate o(1/k^2). Soviet Math. Doklady, 27, 372-376. · Zbl 0535.90071
[16] Nik‐Zainal, S., Alexandrov, L.B., Wedge, D.C. et al. 2012. The Breast Cancer Working Group of International Cancer Genome Consortium. Mutational processes molding the genomes of 21 breast cancers. Cell, 149, 979-993.
[17] Nik‐Zainal, S. & Morganella, S.2017. Mutational signatures in breast cancer: the problem at the dna level. Clin Cancer Res, 23(11), 2617-2629.
[18] O’Donoghue, B. & Candès, E.2015. Adaptive restart for accelerated gradient schemes. Found. Comput. Math., 15, 715-732. · Zbl 1320.90061
[19] Rheinbay, E. & et al. [73 authors] 2017. Discovery and characterization of coding and non‐coding driver mutations in more than 2,500 whole cancer genomes. BioRxiv 237313. https://doi.org/10.1101/237313 · doi:10.1101/237313
[20] Su, W., Boyd, S. & Candès, E.J.2016. A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res., 17, 1-43. · Zbl 1391.90667
[21] Varachan, R. & Roland, C.2008. Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scand. J. Stat., 35, 335-353. · Zbl 1164.65006
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.