DEFT: distilling entangled factors by preventing information diffusion. (English) Zbl 1530.68212

Summary: Disentanglement is a highly desirable property of a representation owing to its similarity to human understanding and reasoning. Many works build disentanglement upon information bottlenecks (IB); despite their elegant mathematical foundations, IB-based methods usually exhibit lower performance. To gain insight into this problem, we develop an annealing test that computes the information freezing point (IFP), the transition state at which information is frozen into the latent variables. We exploit this clue as an inductive bias for separating entangled factors according to differences in their IFP distributions. We find that existing approaches suffer from an information diffusion problem: newly gained information diffuses across all latent variables. Based on this insight, we propose a novel disentanglement framework, termed distilling entangled factors (DEFT), which addresses the information diffusion problem by scaling the backward information. DEFT applies a multistage training strategy, with multigroup encoders trained at different learning rates under piecewise pressure, to disentangle the factors stage by stage. We evaluate DEFT on three variants of dSprites and on SmallNORB, where it achieves low-variance, high disentanglement scores. Furthermore, an experiment with correlated factors demonstrates the failure of total-correlation-based (TC-based) approaches, while DEFT also exhibits competitive performance in the unsupervised setting.
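As a rough illustration of the staged mechanism described in the summary, the following PyTorch sketch trains several encoder groups and scales the gradients flowing back into every group that is not being distilled in the current stage. It is a minimal sketch under stated assumptions, not the authors' implementation: the names (GradScale, EncoderGroup), the plain reconstruction loss, the 0.01 scaling factor, and the stage schedule are illustrative choices standing in for the paper's per-group learning rates and piecewise pressure.

```python
# Hypothetical sketch of DEFT-style staged training (not the authors' code).
import torch
import torch.nn as nn


class GradScale(torch.autograd.Function):
    """Identity in the forward pass; multiplies the incoming gradient by
    `scale` in the backward pass ("scaling the backward information")."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        return x

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output * ctx.scale, None


class EncoderGroup(nn.Module):
    """One group of latent dimensions with its own small encoder."""

    def __init__(self, x_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * z_dim))  # mean, log-variance

    def forward(self, x):
        mu, logvar = self.net(x).chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize


x_dim, z_dim, n_groups = 32, 4, 3
groups = nn.ModuleList(EncoderGroup(x_dim, z_dim) for _ in range(n_groups))
decoder = nn.Sequential(nn.Linear(n_groups * z_dim, 64), nn.ReLU(),
                        nn.Linear(64, x_dim))
params = list(groups.parameters()) + list(decoder.parameters())

for stage in range(n_groups):          # one stage per encoder group
    opt = torch.optim.Adam(params, lr=1e-3)
    for step in range(100):            # toy loop on random data
        x = torch.randn(16, x_dim)
        zs = []
        for g, enc in enumerate(groups):
            z = enc(x)
            # The group distilled in the current stage gets full gradients;
            # all other groups get heavily scaled-down ones, so new
            # information is distilled into the current group instead of
            # diffusing across all latent variables.
            zs.append(GradScale.apply(z, 1.0 if g == stage else 0.01))
        loss = ((decoder(torch.cat(zs, dim=-1)) - x) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Scaling gradients in the backward pass is one simple way to give already-trained groups a much smaller effective learning rate without freezing them entirely, which matches the diffusion-prevention idea in the summary.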

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62B10 Statistical aspects of information-theoretic topics
