
Learning polynomial transformations via generalized tensor decompositions. (English) Zbl 07844702

Saha, Barna (ed.) et al., Proceedings of the 55th annual ACM SIGACT symposium on theory of computing, STOC ’23, Orlando, FL, USA, June 20–23, 2023. New York, NY: Association for Computing Machinery (ACM). 1671-1684 (2023).

MSC:

68Qxx Theory of computing
Full Text: DOI

References:

[1] Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, and Ananda Theertha Suresh. 2014. Near-optimal-sample estimators for spherical Gaussian mixtures. arXiv preprint arXiv:1402.4746.
[2] Dimitris Achlioptas and Frank McSherry. 2005. On spectral learning of mixtures of distributions. In International Conference on Computational Learning Theory. 458-469. · Zbl 1137.68512
[3] Zeyuan Allen-Zhu and Yuanzhi Li. 2021. Forward Super-Resolution: How Can GANs Learn Hierarchical Generative Models for Real-World Distributions. arXiv preprint arXiv:2106.02619.
[4] Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham M Kakade, and Matus Telgarsky. 2014. Tensor decompositions for learning latent variable models. Journal of Machine Learning Research, 15 (2014), 2773-2832. · Zbl 1319.62109
[5] Joseph Anderson, Mikhail Belkin, Navin Goyal, Luis Rademacher, and James Voss. 2014. The more, the merrier: the blessing of dimensionality for learning large Gaussian mixtures. In Conference on Learning Theory. 1135-1164.
[6] Benny Applebaum. 2016. Cryptographic hardness of random local functions. Computational complexity, 25, 3 (2016), 667-722. · Zbl 1360.94293
[7] Benny Applebaum, Yuval Ishai, and Eyal Kushilevitz. 2006. Cryptography in NC^0. SIAM J. Comput., 36, 4 (2006), 845-888. · Zbl 1126.94014
[8] Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, and Yi Zhang. 2017. Generalization and equilibrium in generative adversarial nets (GANs). In International Conference on Machine Learning. 224-232.
[9] Sanjeev Arora, Rong Ge, Tengyu Ma, and Ankur Moitra. 2015. Simple, efficient, and neural algorithms for sparse coding. In Conference on Learning Theory. 113-149.
[10] Sanjeev Arora, Rong Ge, Tengyu Ma, and Andrej Risteski. 2017. Provable learning of noisy-or networks. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing. 1057-1066. · Zbl 1369.68274
[11] Sanjeev Arora, Rong Ge, Ankur Moitra, and Sushant Sachdeva. 2012. Provable ICA with unknown Gaussian noise, with implications for Gaussian mixtures and autoencoders. Advances in Neural Information Processing Systems, 25 (2012). · Zbl 1333.68224
[12] Sanjeev Arora and Ravi Kannan. 2005. Learning mixtures of separated nonspherical Gaussians. The Annals of Applied Probability, 15, 1A (2005), 69-92. · Zbl 1059.62062
[13] Sanjeev Arora, Andrej Risteski, and Yi Zhang. 2018. Do GANs learn the distribution? some theory and empirics. In International Conference on Learning Representations.
[14] Yu Bai, Tengyu Ma, and Andrej Risteski. 2018. Approximability of Discriminators Implies Diversity in GANs. In International Conference on Learning Representations.
[15] Jonas Ballani, Lars Grasedyck, and Melanie Kluge. 2013. Black box approximation of tensors in hierarchical Tucker format. Linear Algebra and its Applications, 438, 2 (2013), 639-657. · Zbl 1260.65037
[16] Boaz Barak, Jonathan A Kelner, and David Steurer. 2015. Dictionary learning and tensor decomposition via the sum-of-squares method. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing. 143-151. · Zbl 1321.68396
[17] Mikhail Belkin and Kaushik Sinha. 2015. Polynomial learning of distribution families. SIAM J. Comput., 44, 4 (2015), 889-911. · Zbl 1335.68100
[18] Aditya Bhaskara, Moses Charikar, Ankur Moitra, and Aravindan Vijayaraghavan. 2014. Smoothed analysis of tensor decompositions. In Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing. 594-603.
[19] Aditya Bhaskara, Aidao Chen, Aidan Perreault, and Aravindan Vijayaraghavan. 2019. Smoothed analysis in unsupervised learning via decoupling. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS). 582-610. · Zbl 1487.68190
[20] Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S Meel, and NV Vinodchandran. 2020. Efficient distance approximation for structured high-dimensional distributions via learning. Advances in Neural Information Processing Systems, 33 (2020), 14699-14711.
[21] Arnab Bhattacharyya, Sutanu Gayen, Eric Price, and NV Vinodchandran. 2021. Near-optimal learning of tree-structured distributions by Chow-Liu. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing. 147-160. · Zbl 07707672
[22] Enric Boix-Adsera, Guy Bresler, and Frederic Koehler. 2021. Chow-Liu++: Optimal prediction-centric learning of tree Ising models. arXiv preprint arXiv:2106.03969. · Zbl 1528.68276
[23] Guy Bresler. 2015. Efficiently learning Ising models on arbitrary graphs. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing. 771-782. · Zbl 1321.68397
[24] Guy Bresler and Mina Karzand. 2020. Learning a tree-structured Ising model in order to make predictions. The Annals of Statistics, 48, 2 (2020), 713-737. · Zbl 1469.62334
[25] Guy Bresler, Elchanan Mossel, and Allan Sly. 2008. Reconstruction of Markov random fields from samples: Some observations and algorithms. In Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques. Springer, 343-356. · Zbl 1159.68636
[26] Johannes Brustle, Yang Cai, and Constantinos Daskalakis. 2020. Multi-item mechanisms without item-independence: Learnability via robustness. In Proceedings of the 21st ACM Conference on Economics and Computation. 715-761.
[27] Minshuo Chen, Wenjing Liao, Hongyuan Zha, and Tuo Zhao. 2020. Statistical guarantees of generative adversarial networks for distribution estimation. arXiv preprint arXiv:2002.03938.
[28] Sitan Chen, Aravind Gollakota, Adam R. Klivans, and Raghu Meka. 2022. Hardness of noise-free learning for two-hidden-layer neural networks. In NeurIPS.
[29] Sitan Chen, Jerry Li, and Yuanzhi Li. 2022. Learning (very) simple generative models is hard. In NeurIPS.
[30] Sitan Chen, Jerry Li, Yuanzhi Li, and Raghu Meka. 2022. Minimax Optimality (Probably) Doesn’t Imply Distribution Learning for GANs. arXiv preprint arXiv:2201.07206.
[31] Ziang Chen, Yingzhou Li, and Jianfeng Lu. 2020. Tensor ring decomposition: optimization landscape and one-loop convergence of alternating least squares. SIAM J. Matrix Anal. Appl., 41, 3 (2020), 1416-1442. · Zbl 1458.65050
[32] C. K. Chow and C. N. Liu. 1968. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14, 3 (1968), 462-467. · Zbl 0165.22305
[33] Pierre Comon. 1994. Independent component analysis, a new concept? Signal Processing, 36, 3 (1994), 287-314. · Zbl 0791.62004
[34] Pierre Comon and Christian Jutten. 2010. Handbook of Blind Source Separation: Independent component analysis and applications. Academic press.
[35] Sanjoy Dasgupta. 1997. The sample complexity of learning fixed-structure Bayesian networks. Machine Learning, 29, 2 (1997), 165-180. · Zbl 0892.68078
[36] Sanjoy Dasgupta. 1999. Learning mixtures of Gaussians. In 40th Annual Symposium on Foundations of Computer Science. 634-644.
[37] Sanjoy Dasgupta and Leonard J Schulman. 2007. A probabilistic analysis of EM for mixtures of separated, spherical Gaussians. Journal of Machine Learning Research, 8 (2007), 203-226. · Zbl 1222.62142
[38] Constantinos Daskalakis, Themis Gouleakis, Christos Tzamos, and Manolis Zampetakis. 2018. Efficient statistics, in high dimensions, from truncated samples. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS). 639-649.
[39] Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, and Haoyang Zeng. 2017. Training GANs with optimism. arXiv preprint arXiv:1711.00141.
[40] Constantinos Daskalakis and Qinxuan Pan. 2021. Sample-optimal and efficient learning of tree Ising models. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing. 133-146. · Zbl 07765159
[41] Lieven De Lathauwer, Bart De Moor, and Joos Vandewalle. 2000. On the best rank-1 and rank-(r_1, r_2, ..., r_n) approximation of higher-order tensors. SIAM Journal on Matrix Analysis and Applications, 21, 4 (2000), 1324-1342. · Zbl 0958.15026
[42] Luc Devroye, Abbas Mehrabian, and Tommy Reddad. 2020. The minimax learning rates of normal and Ising undirected graphical models. Electronic Journal of Statistics, 14, 1 (2020), 2338-2361. · Zbl 1445.62069
[43] Ilias Diakonikolas and Daniel M Kane. 2020. Small covers for near-zero sets of polynomials and learning latent variable models. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS). 184-195.
[44] Ilias Diakonikolas, Daniel M Kane, and Alistair Stewart. 2018. List-decodable robust mean estimation and learning mixtures of spherical Gaussians. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. 1047-1060. · Zbl 1428.68272
[45] Ilias Diakonikolas, Daniel M Kane, Alistair Stewart, and Yuxin Sun. 2021. Outlier-robust learning of Ising models under Dobrushin's condition. In Conference on Learning Theory. 1645-1682.
[46] William Fedus, Mihaela Rosca, Balaji Lakshminarayanan, Andrew M Dai, Shakir Mohamed, and Ian Goodfellow. 2017. Many paths to equilibrium: GANs do not need to decrease a divergence at every step. arXiv preprint arXiv:1710.08446.
[47] Soheil Feizi, Farzan Farnia, Tony Ginart, and David Tse. 2017. Understanding GANs: the LQG setting. arXiv preprint arXiv:1710.10793.
[48] Jon Feldman, Ryan O’Donnell, and Rocco A Servedio. 2008. Learning mixtures of product distributions over discrete domains. SIAM J. Comput., 37, 5 (2008), 1536-1564. · Zbl 1178.68296
[49] Alan Frieze, Mark Jerrum, and Ravi Kannan. 1996. Learning linear transformations. In Proceedings of 37th Conference on Foundations of Computer Science. 359-368.
[50] Rong Ge, Qingqing Huang, and Sham M Kakade. 2015. Learning mixtures of Gaussians in high dimensions. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing. 761-770. · Zbl 1321.68403
[51] Rong Ge and Tengyu Ma. 2015. Decomposing overcomplete 3rd order tensors using sum-of-squares algorithms. arXiv preprint arXiv:1504.05287. · Zbl 1386.15048
[52] Gauthier Gidel, Reyhane Askari Hemmat, Mohammad Pezeshki, Rémi Le Priol, Gabriel Huang, Simon Lacoste-Julien, and Ioannis Mitliagkas. 2019. Negative momentum for improved game dynamics. In The 22nd International Conference on Artificial Intelligence and Statistics. 1802-1811.
[53] Surbhi Goel. 2020. Learning Ising and Potts models with latent variables. In International Conference on Artificial Intelligence and Statistics. 3557-3566.
[54] Oded Goldreich. 2011. Candidate one-way functions based on expander graphs. In Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation. Springer, 76-87. · Zbl 1306.94056
[55] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. Advances in neural information processing systems, 27 (2014).
[56] Lars Grasedyck. 2010. Hierarchical singular value decomposition of tensors. SIAM Journal on Matrix Analysis and Applications, 31, 4 (2010), 2029-2054. · Zbl 1210.65090
[57] F Alberto Grünbaum. 1975. Cubic forms in Gaussian variables. Illinois Journal of Mathematics, 19, 3 (1975), 405-411. · Zbl 0439.60003
[58] Jie Gui, Zhenan Sun, Yonggang Wen, Dacheng Tao, and Jieping Ye. 2021. A review on generative adversarial networks: Algorithms, theory, and applications. IEEE Transactions on Knowledge and Data Engineering.
[59] Moritz Hardt and Eric Price. 2015. Tight bounds for learning a mixture of two Gaussians. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing. 753-760. · Zbl 1321.68405
[60] Richard A. Harshman. 1970. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multi-mode factor analysis. UCLA Working Papers in Phonetics, 16 (1970), 1-84.
[61] Klaus-U Höffgen. 1993. Learning and robust learning of product distributions. In Proceedings of the Sixth Annual Conference on Computational Learning Theory. 77-83.
[62] Samuel Hopkins. 2018. Statistical inference and the sum of squares method. Ph. D. Dissertation. Cornell University.
[63] Samuel B Hopkins and Jerry Li. 2018. Mixture models, robustness, and sum of squares proofs. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. 1021-1034. · Zbl 1428.68240
[64] Samuel B Hopkins, Tselil Schramm, and Jonathan Shi. 2019. A robust spectral algorithm for overcomplete tensor decomposition. In Conference on Learning Theory. 1683-1722.
[65] Samuel B Hopkins, Tselil Schramm, Jonathan Shi, and David Steurer. 2016. Fast spectral algorithms from sum-of-squares proofs: tensor decomposition and planted sparse vectors. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing. 178-191. · Zbl 1377.68199
[66] Daniel Hsu and Sham M Kakade. 2013. Learning mixtures of spherical Gaussians: moment methods and spectral decompositions. In Proceedings of the 4th Conference on Innovations in Theoretical Computer Science. 11-20. · Zbl 1362.68246
[67] Aapo Hyvärinen, Juha Karhunen, and Erkki Oja. 2002. Independent component analysis. Studies in informatics and control, 11, 2 (2002), 205-207.
[68] Aapo Hyvärinen and Hiroshi Morioka. 2016. Unsupervised feature extraction by time-contrastive learning and nonlinear ICA. Advances in Neural Information Processing Systems, 29 (2016).
[69] Aapo Hyvärinen and Erkki Oja. 2000. Independent component analysis: algorithms and applications. Neural Networks, 13, 4-5 (2000), 411-430. · Zbl 0893.94034
[70] Aapo Hyvärinen and Petteri Pajunen. 1999. Nonlinear independent component analysis: Existence and uniqueness results. Neural Networks, 12, 3 (1999), 429-439.
[71] Aapo Hyvärinen, Hiroaki Sasaki, and Richard Turner. 2019. Nonlinear ICA using auxiliary variables and generalized contrastive learning. In The 22nd International Conference on Artificial Intelligence and Statistics. 859-868.
[72] Vishesh Jain, Frederic Koehler, and Andrej Risteski. 2019. Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing. 1226-1236. · Zbl 1434.68338
[73] Samy Jelassi, Arthur Mensch, Gauthier Gidel, and Yuanzhi Li. 2022. Adam is no better than normalized SGD: Dissecting how adaptivity improves GAN performance. https://openreview.net/forum?id=D9SuLzhgK9
[74] Ilyes Khemakhem, Diederik Kingma, Ricardo Monti, and Aapo Hyvärinen. 2020. Variational autoencoders and nonlinear ICA: A unifying framework. In International Conference on Artificial Intelligence and Statistics. 2207-2217.
[75] Yuehaw Khoo, Jianfeng Lu, and Lexing Ying. 2021. Efficient Construction of Tensor Ring Representations from Sampling. Multiscale Modeling & Simulation, 19, 3 (2021), 1261-1284. · Zbl 1468.65014
[76] Diederik P Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114. · Zbl 1431.68002
[77] Adam Klivans and Raghu Meka. 2017. Learning graphical models using multiplicative weights. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). 343-354.
[78] Pravesh K Kothari, Jacob Steinhardt, and David Steurer. 2018. Robust moment estimation and improved clustering via sum of squares. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. 1035-1046. · Zbl 1434.62125
[79] Amit Kumar and Ravindran Kannan. 2010. Clustering with spectral norm and the k-means algorithm. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science. 299-308.
[80] Joseph M Landsberg and Giorgio Ottaviani. 2013. Equations for secant varieties of Veronese and other varieties. Annali di Matematica Pura ed Applicata, 192, 4 (2013), 569-606. · Zbl 1274.14058
[81] Qi Lei, Jason Lee, Alex Dimakis, and Constantinos Daskalakis. 2020. SGD learns one-layer networks in WGANs. In International Conference on Machine Learning. 5799-5808.
[82] Sue E Leurgans, Robert T Ross, and Rebecca B Abel. 1993. A decomposition for three-way arrays. SIAM J. Matrix Anal. Appl., 14, 4 (1993), 1064-1083. · Zbl 0788.65145
[83] Yuanzhi Li and Zehao Dou. 2020. Making Method of Moments Great Again? How can GANs learn distributions. arXiv preprint arXiv:2003.04033.
[84] Yuanzhi Li and Yingyu Liang. 2017. Provable alternating gradient descent for non-negative matrix factorization with strong correlations. In International Conference on Machine Learning. 2062-2070.
[85] Tengyuan Liang. 2018. How well generative adversarial networks learn distributions. arXiv preprint arXiv:1811.03179. · Zbl 07626743
[86] Tengyu Ma, Jonathan Shi, and David Steurer. 2016. Polynomial-time tensor decompositions with sum-of-squares. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS). 438-446.
[87] Dustin G Mixon, Soledad Villar, and Rachel Ward. 2017. Clustering subgaussian mixtures by semidefinite programming. Information and Inference: A Journal of the IMA, 6, 4 (2017), 389-415. · Zbl 1381.62189
[88] Ankur Moitra and Gregory Valiant. 2010. Settling the polynomial learnability of mixtures of Gaussians. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science. 93-102. · Zbl 1293.68229
[89] Alexander Novikov, Anton Rodomanov, Anton Osokin, and Dmitry Vetrov. 2014. Putting MRFs on a tensor train. In International Conference on Machine Learning. 811-819.
[90] Ivan Oseledets and Eugene Tyrtyshnikov. 2010. TT-cross approximation for multidimensional arrays. Linear Algebra Appl., 432, 1 (2010), 70-88. · Zbl 1183.65040
[91] Ivan V Oseledets. 2011. Tensor-train decomposition. SIAM Journal on Scientific Computing, 33, 5 (2011), 2295-2317. · Zbl 1232.15018
[92] Stanley Osher, Zuoqiang Shi, and Wei Zhu. 2017. Low dimensional manifold model for image processing. SIAM Journal on Imaging Sciences, 10, 4 (2017), 1669-1690. · Zbl 1401.49042
[93] Oded Regev and Aravindan Vijayaraghavan. 2017. On learning mixtures of well-separated Gaussians. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). 85-96.
[94] Andrej Risteski. 2016. How to calculate partition functions using convex programming hierarchies: provable bounds for variational methods. In Conference on Learning Theory. 1402-1416.
[95] Nicolas Schreuder, Victor-Emmanuel Brunel, and Arnak Dalalyan. 2021. Statistical guarantees for generative models without domination. In Algorithmic Learning Theory. 1051-1071.
[96] Shashank Singh, Ananya Uppal, Boyue Li, Chun-Liang Li, Manzil Zaheer, and Barnabás Póczos. 2018. Nonparametric Density Estimation under Adversarial Losses. In NeurIPS.
[97] Jan Stanczuk, Christian Etmann, Lisa Maria Kreusser, and Carola-Bibiane Schönlieb. 2021. Wasserstein GANs work because they fail (to approximate the Wasserstein distance). arXiv preprint arXiv:2103.01678.
[98] Ju Sun, Qing Qu, and John Wright. 2016. Complete dictionary recovery over the sphere I: Overview and the geometric picture. IEEE Transactions on Information Theory, 63, 2 (2016), 853-884. · Zbl 1364.94164
[99] Ananya Uppal, Shashank Singh, and Barnabas Poczos. 2019. Nonparametric Density Estimation & Convergence Rates for GANs under Besov IPM Losses. Advances in Neural Information Processing Systems, 32 (2019), 9089-9100.
[100] Santosh Vempala and Grant Wang. 2004. A spectral algorithm for learning mixture models. J. Comput. System Sci., 68, 4 (2004), 841-860. · Zbl 1074.68028
[101] Frank Verstraete, Diego Porras, and J Ignacio Cirac. 2004. Density matrix renormalization group and periodic boundary conditions: A quantum information perspective. Physical review letters, 93, 22 (2004), 227205.
[102] Rui Wu, R Srikant, and Jian Ni. 2013. Learning loosely connected Markov random fields. Stochastic Systems, 3, 2 (2013), 362-404. · Zbl 1352.62133
[103] Shanshan Wu, Alexandros G Dimakis, and Sujay Sanghavi. 2019. Learning distributions generated by one-layer ReLU networks. Advances in neural information processing systems, 32 (2019).
[104] Shanshan Wu, Sujay Sanghavi, and Alexandros G Dimakis. 2019. Sparse logistic regression learns all discrete pairwise graphical models. Advances in Neural Information Processing Systems, 32 (2019).
[105] Ke Ye and Lek-Heng Lim. 2018. Tensor network ranks. arXiv preprint arXiv:1801.02662. · Zbl 1454.65030
[106] Anru Zhang and Dong Xia. 2018. Tensor SVD: Statistical and computational limits. IEEE Transactions on Information Theory, 64, 11 (2018), 7311-7338. · Zbl 1432.62176
[107] Qibin Zhao, Guoxu Zhou, Shengli Xie, Liqing Zhang, and Andrzej Cichocki. 2016. Tensor ring decomposition. arXiv preprint arXiv:1606.05535.
[108] Yuchen Zhou, Anru R Zhang, Lili Zheng, and Yazhen Wang. 2022. Optimal high-order tensor SVD via tensor-train orthogonal iteration. IEEE Transactions on Information Theory. · Zbl 1505.62486
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.