Bigeometric organization of deep nets. (English) Zbl 1390.68520

Summary: In this paper, we build an organization of high-dimensional datasets that cannot be cleanly embedded into a low-dimensional representation due to missing entries and a subset of the features being irrelevant to modeling functions of interest. Our algorithm begins by defining coarse neighborhoods of the points and defining an expected empirical function value on these neighborhoods. We then generate new non-linear features with deep net representations tuned to model the approximate function, and re-organize the geometry of the points with respect to the new representation. Finally, the points are locally z-scored to create an intrinsic geometric organization which is independent of the parameters of the deep net, a geometry designed to ensure smoothness with respect to the empirical function. We examine this approach on data from the Center for Medicare and Medicaid Services Hospital Quality Initiative, and generate an intrinsic low-dimensional organization of the hospitals that is smooth with respect to an expert-driven function of quality.
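
The local z-scoring step mentioned in the summary can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `local_zscore`, the use of Euclidean k-nearest neighbours, and the parameters `k` and `eps` are all assumptions made for illustration. Each point is standardized against the mean and standard deviation of its own neighbourhood, so a uniform rescaling or shift of the learned features leaves the result essentially unchanged.

```python
import numpy as np

def local_zscore(features, k=10, eps=1e-8):
    """Z-score each row against its k nearest neighbours.

    Hypothetical helper: neighbourhoods are Euclidean k-NN sets that
    include the point itself; the paper may define them differently.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    k = min(k, n)
    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    # Indices of the k closest rows for every point.
    neighbours = np.argsort(sq_dists, axis=1)[:, :k]
    local = features[neighbours]  # shape (n, k, d)
    mean = local.mean(axis=1)
    std = local.std(axis=1)
    # eps guards against division by zero in constant neighbourhoods.
    return (features - mean) / (std + eps)
```

Calling `local_zscore(X, k=10)` on an `(n, d)` array of learned features returns an array of the same shape. The dense pairwise-distance computation is chosen for brevity and would need a tree-based or approximate neighbour search for large datasets.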

MSC:

68T05 Learning and adaptive systems in artificial intelligence
90B70 Theory of organizations, manpower planning in operations research

References:

[1] Belkin, Mikhail; Niyogi, Partha, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput., 1373-1396 (2003) · Zbl 1085.68119
[2] Bengio, Yoshua, Deep learning of representations: looking forward, (Statistical Language and Speech Processing (2013)) · Zbl 1192.68503
[3] Bengio, Yoshua; Paiement, Jean Francois; Vincent, Pascal, Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering, (Advances in Neural Information Processing Systems (2003), MIT Press), 177-184
[4] Bradley, E. H.; Herrin, J.; Elbel, B., Hospital quality for acute myocardial infarction: correlation among process measures and relationship with short-term mortality, JAMA, 72-78 (2006)
[5] Cho, Y.; Saul, L. K., Kernel methods for deep learning, (Advances in Neural Information Processing Systems (2009))
[6] Coifman, Ronald R.; Gavish, Matan, Harmonic analysis of digital data bases, (Wavelets and Multiscale Analysis (2011)) · Zbl 1250.68096
[7] Coifman, Ronald R.; Lafon, Stéphane, Diffusion maps, Appl. Comput. Harmon. Anal., 21, 1, 5-30 (2006) · Zbl 1095.68094
[8] Center for Medicare and Medicaid Services (CMS), Hospital compare, CMS website: www.medicare.gov/hospitalcompare/
[9] Krumholz, H. M.; Lin, Z.; Keenan, P. S., Relationship between hospital readmission and mortality rates for patients hospitalized with acute myocardial infarction, heart failure, or pneumonia, JAMA, 587-593 (2013)
[10] Mahalanobis, Prasanta Chandra, On the generalized distance in statistics, (Proceedings of the National Institute of Sciences (1936)) · Zbl 0015.03302
[11] Montavon, Grégoire; Braun, Mikio; Müller, Klaus-Robert, Kernel analysis of deep networks, J. Mach. Learn. Res., 2563-2581 (2011) · Zbl 1280.68186
[12] Rojas, R., Neural Networks: A Systematic Introduction (1996), Springer Science and Business Media · Zbl 0861.68072
[13] Leeb, Will E.; Coifman, Ronald R., Earth mover’s distance and equivalent metrics for spaces with hierarchical partition trees (2013), Yale CS Technical Report
[14] Saul, L.; Roweis, S., Think globally, fit locally: unsupervised learning of nonlinear manifolds, J. Mach. Learn. Res., 4, 12, 119-155 (2003) · Zbl 1093.68089
[15] Singer, Amit; Coifman, Ronald R., Non-linear independent component analysis with diffusion maps, Appl. Comput. Harmon. Anal., 25, 2, 226-239 (2008) · Zbl 1144.62044
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.