research-article
Open access

Neural Photo-Finishing

Published: 30 November 2022

Abstract

Image processing pipelines are ubiquitous and we rely on them either directly, by filtering or adjusting an image post-capture, or indirectly, as image signal processing (ISP) pipelines on broadly deployed camera systems. Used by artists, photographers, system engineers, and for downstream vision tasks, traditional image processing pipelines feature complex algorithmic branches developed over decades. Recently, image-to-image networks have made great strides in image processing, style transfer, and semantic understanding. The differentiable nature of these networks allows them to fit a large corpus of data; however, they do not allow for intuitive, fine-grained controls that photographers find in modern photo-finishing tools.
This work closes that gap and presents an approach to making complex photo-finishing pipelines differentiable, allowing legacy algorithms to be trained akin to neural networks using first-order optimization methods. By concatenating tailored network proxy models of individual processing steps (e.g. white-balance, tone-mapping, color tuning), we can model a non-differentiable reference image finishing pipeline more faithfully than existing proxy image-to-image network models. We validate the method for several diverse applications, including photo and video style transfer, slider regression for commercial camera ISPs, photography-driven neural demosaicking, and adversarial photo-editing.
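As a rough illustration of the idea of chaining differentiable stand-ins for individual processing steps and regressing their sliders with a first-order method, here is a minimal sketch. It is not the paper's method: the two stages (`white_balance`, `tone_map`) are hypothetical hand-written closed-form functions rather than learned network proxies, the pipeline acts on scalar pixel values, and the slider names `gain` and `gamma`, the learning rate, and the step count are all assumptions chosen for the example.

```python
import math

# Hypothetical stand-ins for two pipeline steps. Each stage returns its
# output plus the local derivatives w.r.t. its input and its own slider.

def white_balance(x, gain):
    """Scale a pixel value by a gain slider."""
    y = gain * x
    return y, gain, x  # y, dy/dx, dy/dgain

def tone_map(x, gamma):
    """Apply a gamma curve; assumes x > 0."""
    y = x ** gamma
    return y, gamma * x ** (gamma - 1.0), y * math.log(x)  # y, dy/dx, dy/dgamma

def pipeline(x, gain, gamma):
    """Chain the two stages; slider derivatives follow from the chain rule."""
    u, _, du_dgain = white_balance(x, gain)
    y, dy_du, dy_dgamma = tone_map(u, gamma)
    return y, dy_du * du_dgain, dy_dgamma  # y, dy/dgain, dy/dgamma

def loss_and_grads(pixels, targets, gain, gamma):
    """Mean squared error against the reference output, with slider gradients."""
    n = len(pixels)
    loss = g_gain = g_gamma = 0.0
    for x, t in zip(pixels, targets):
        y, dy_dgain, dy_dgamma = pipeline(x, gain, gamma)
        r = y - t
        loss += r * r / n
        g_gain += 2.0 * r * dy_dgain / n
        g_gamma += 2.0 * r * dy_dgamma / n
    return loss, g_gain, g_gamma

def fit_sliders(pixels, targets, gain=1.0, gamma=1.0, lr=0.5, steps=500):
    """Plain gradient descent on both sliders (lr and steps are assumed values)."""
    for _ in range(steps):
        _, g_gain, g_gamma = loss_and_grads(pixels, targets, gain, gamma)
        gain -= lr * g_gain
        gamma -= lr * g_gamma
    return gain, gamma

# Reference output produced with "unknown" slider values gain=1.5, gamma=0.6.
pixels = [0.1 * k for k in range(1, 10)]
targets = [pipeline(x, 1.5, 0.6)[0] for x in pixels]

gain_hat, gamma_hat = fit_sliders(pixels, targets)
final_loss, _, _ = loss_and_grads(pixels, targets, gain_hat, gamma_hat)
print(f"gain={gain_hat:.4f} gamma={gamma_hat:.4f} loss={final_loss:.2e}")
```

In this toy setting the sliders are recovered from input/output pairs alone. In the approach the abstract describes, each hand-written stage would instead be a trained network proxy of a non-differentiable reference step, so that gradients can flow through the whole pipeline.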

Published In

ACM Transactions on Graphics, Volume 41, Issue 6
December 2022
1428 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/3550454
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. image processing
  2. photo-finishing
  3. raw processing

Qualifiers

  • Research-article

