
DivGAN: a diversity-enforcing generative adversarial network for mode collapse reduction. (English) Zbl 07698050

Summary: Generative Adversarial Networks (GANs) are among the most effective generative models and have achieved breakthroughs in many computer vision tasks. However, the generic GAN suffers from the mode collapse problem, in which the generator covers only a narrow subset of the modes of the data distribution. To alleviate this problem, the present work proposes a new GAN framework called diversified GAN (DivGAN), which can be incorporated into any existing GAN. It includes a new network, called DivNet, that encourages the GAN to produce diverse data. One advantage of the proposed network is that it does not alter the architecture of the host GAN and can therefore be incorporated easily. Extensive experiments on synthetic and real datasets show that the proposed framework significantly reduces mode collapse and performs better than recent state-of-the-art GANs.
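The summary does not spell out DivNet's objective, but a common way to quantify mode collapse of the kind DivGAN targets is a latent-conditioned diversity ratio: distant latent codes should map to distant samples. The sketch below is a hypothetical surrogate loss illustrating that idea (the names `diversity_loss`, `z1`, `z2` and the exact formula are assumptions, not the paper's method): a collapsed generator, which maps different codes to nearly identical outputs, incurs a much larger penalty than a diverse one.

```python
import math

def l2(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def diversity_loss(z1, z2, g1, g2, eps=1e-8):
    # Ratio of latent-code distance to generated-sample distance.
    # If the generator collapses (g1 ~ g2 despite z1 != z2), the
    # denominator shrinks and the loss blows up. Hypothetical
    # surrogate only; DivNet's actual objective is not given here.
    return l2(z1, z2) / (l2(g1, g2) + eps)

# Two distinct latent codes.
z1, z2 = [0.0, 1.0], [1.0, 0.0]

# A collapsed generator returns (almost) the same sample for both codes;
# a diverse generator keeps the samples apart.
collapsed = diversity_loss(z1, z2, [0.5, 0.5], [0.5, 0.5])
diverse = diversity_loss(z1, z2, [0.9, 0.1], [0.1, 0.9])
assert collapsed > diverse
```

Minimizing such a ratio (equivalently, maximizing output spread per unit of latent spread) is one standard design choice for diversity regularizers; DivGAN's DivNet pursues the same goal with a dedicated auxiliary network rather than a fixed penalty.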

MSC:

68Txx Artificial intelligence
