Document Zbl 1461.94016

Lin, Daoyu; Wang, Yang; Xu, Guangluan; Li, Jun; Fu, Kun

Transform a simple sketch to a Chinese painting by a multiscale deep neural network. (English) Zbl 1461.94016

Algorithms (Basel) 11, No. 1, Paper No. 4, 18 p. (2017).

Summary: Inspired by the power of deep learning, convolutional neural networks can now produce impressive images at the pixel level. However, a significant limitation of previous approaches is that they focus on simple datasets such as faces and bedrooms. In this paper, we propose a multiscale deep neural network to transform sketches into Chinese paintings. To synthesize more realistic imagery, we train the generative network with both an L1 loss and an adversarial loss. Additionally, users can control the synthesis process because the generative network is feed-forward. By adding an edge detector, the network can also be used for neural style transfer. Furthermore, additional experiments on image colorization and image super-resolution demonstrate the generality of the proposed approach.
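
The summary states that the generative network is trained with both an L1 loss and an adversarial loss. The following is a minimal sketch of how such a combined generator objective is commonly formed, not the authors' implementation. The function names, the non-saturating form of the adversarial term, the flattened pixel lists, and the weighting factor `l1_weight` are all illustrative assumptions that the record does not specify.

```python
import math


def l1_loss(generated, target):
    """Mean absolute difference between two equally sized pixel lists."""
    if len(generated) != len(target):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(generated)


def adversarial_loss(disc_prob_fake, eps=1e-12):
    """Generator-side adversarial term, -log D(G(x)).

    disc_prob_fake is the discriminator's probability that the generated
    image is real. The non-saturating form used here is an assumption.
    """
    return -math.log(max(disc_prob_fake, eps))


def generator_loss(generated, target, disc_prob_fake, l1_weight=1.0):
    """Adversarial term plus a weighted L1 reconstruction term.

    l1_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    return adversarial_loss(disc_prob_fake) + l1_weight * l1_loss(generated, target)


# Toy example: four "pixels" of a target painting and of a generated image.
painting = [0.2, 0.5, 0.9, 0.4]
generated = [0.1, 0.5, 0.7, 0.6]
loss = generator_loss(generated, painting, disc_prob_fake=0.5, l1_weight=10.0)
print(f"combined generator loss: {loss:.4f}")
```

In this kind of objective the L1 term pulls the output toward the target painting pixel by pixel, while the adversarial term rewards outputs that the discriminator judges realistic. The balance between the two is set by the weighting factor.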

MSC:

94A08	Image processing (compression, reconstruction, etc.) in information and communication theory
68T10	Pattern recognition, speech recognition

Keywords:

deep neural network; sketch; arts synthesis; style transfer

Software:

Sketch2Photo; U-Net; BSDS; Photosketcher; pix2pix; CycleGAN; AlexNet; Attribute2Image; InfoGAN; ImageNet; TensorFlow

