Statistical embedding: beyond principal components. (English) Zbl 07792874

Summary: There has been intense recent activity in the embedding of very high-dimensional and nonlinear data structures, much of it in the data science and machine learning literature. We survey this activity in four parts. In the first part, we cover nonlinear methods such as principal curves, multidimensional scaling, local linear methods, ISOMAP, graph-based methods and diffusion mapping, kernel-based methods and random projections. The second part is concerned with topological embedding methods, in particular mapping topological properties into persistence diagrams and the Mapper algorithm. Another type of data set showing tremendous growth is very high-dimensional network data. The task considered in part three is how to embed such data in a vector space of moderate dimension to make the data amenable to traditional techniques such as clustering and classification. Arguably, this is the part where the contrast between algorithmic machine learning methods and statistical modeling, represented by the so-called stochastic block model, is at its greatest. In the paper, we discuss the pros and cons of the two approaches. The final part of the survey deals with embedding in \(\mathbb{R}^2\), that is, visualization. Three methods are presented: \(t\)-SNE, UMAP and LargeVis, based on methods in parts one, two and three, respectively. The methods are illustrated and compared on two simulated data sets: one consisting of a triplet of noisy Ranunculoid curves, and one consisting of networks of increasing complexity generated with stochastic block models and with two types of nodes.

MSC:

62-XX Statistics
