Showing 1–25 of 25 results for author: Patashnik, O

  1. arXiv:2409.15273  [pdf, other]

    cs.CV

    MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

    Authors: Yehonathan Litman, Or Patashnik, Kangle Deng, Aviral Agrawal, Rushikesh Zawar, Fernando De la Torre, Shubham Tulsiani

    Abstract: Recent works in inverse rendering have shown promise in using multi-view images of an object to recover shape, albedo, and materials. However, the recovered components often fail to render accurately under new lighting conditions due to the intrinsic challenge of disentangling albedo and material properties from input images. To address this challenge, we introduce MaterialFusion, an enhanced conv…

    Submitted 23 September, 2024; originally announced September 2024.

    Comments: Project Page: https://yehonathanlitman.github.io/material_fusion

  2. arXiv:2408.00735  [pdf, other]

    cs.CV cs.GR

    TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

    Authors: Gilad Deutch, Rinon Gal, Daniel Garibi, Or Patashnik, Daniel Cohen-Or

    Abstract: Diffusion models have opened the path to a wide range of text-based image editing frameworks. However, these typically build on the multi-step nature of the diffusion backwards process, and adapting them to distilled, fast-sampling methods has proven surprisingly challenging. Here, we focus on a popular line of text-based editing frameworks - the ``edit-friendly'' DDPM-noise inversion approach. We…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: Project page: https://turboedit-paper.github.io/

  3. arXiv:2404.03620  [pdf, other]

    cs.CV cs.GR

    LCM-Lookahead for Encoder-based Text-to-Image Personalization

    Authors: Rinon Gal, Or Lichter, Elad Richardson, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or

    Abstract: Recent advancements in diffusion models have introduced fast sampling methods that can effectively produce high-quality images in just one or a few denoising steps. Interestingly, when these are distilled from existing diffusion models, they often maintain alignment with the original model, retaining similar outputs for similar prompts and seeds. These properties present opportunities to leverage…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Project page at https://lcm-lookahead.github.io/

  4. arXiv:2403.16990  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

    Authors: Omer Dahary, Or Patashnik, Kfir Aberman, Daniel Cohen-Or

    Abstract: Text-to-image diffusion models have an unprecedented ability to generate diverse and high-quality images. However, they often struggle to faithfully capture the intended semantics of complex input prompts that include multiple subjects. Recently, numerous layout-to-image extensions have been introduced to improve user control, aiming to localize subjects represented by specific tokens. Yet, these…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Project page: https://omer11a.github.io/bounded-attention/

  5. arXiv:2403.14602  [pdf, other]

    cs.CV cs.GR cs.LG eess.IV

    ReNoise: Real Image Inversion Through Iterative Noising

    Authors: Daniel Garibi, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Daniel Cohen-Or

    Abstract: Recent advancements in text-guided diffusion models have unlocked powerful image manipulation capabilities. However, applying these methods to real images necessitates the inversion of the images into the domain of the pretrained diffusion model. Achieving faithful inversion remains a challenge, particularly for more recent models trained to generate images with a small number of denoising steps.…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: project page at: https://garibida.github.io/ReNoise-Inversion/

  6. arXiv:2402.14792  [pdf, other]

    cs.CV cs.GR cs.LG

    Consolidating Attention Features for Multi-view Image Editing

    Authors: Or Patashnik, Rinon Gal, Daniel Cohen-Or, Jun-Yan Zhu, Fernando De la Torre

    Abstract: Large-scale text-to-image models enable a wide range of image editing techniques, using text prompts or even spatial controls. However, applying these editing methods to multi-view images depicting a single scene leads to 3D-inconsistent results. In this work, we focus on spatial control-based geometric manipulations and introduce a method to consolidate the editing process across various views. W…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Project Page at https://qnerf-consolidation.github.io/qnerf-consolidation/

  7. arXiv:2311.17083  [pdf, other]

    cs.CV

    CLiC: Concept Learning in Context

    Authors: Mehdi Safaee, Aryan Mikaeili, Or Patashnik, Daniel Cohen-Or, Ali Mahdavi-Amiri

    Abstract: This paper addresses the challenge of learning a local visual pattern of an object from one image, and generating images depicting objects with that pattern. Learning a localized concept and placing it on an object in a target image is a nontrivial task, as the objects may have different orientations and shapes. Our approach builds upon recent advancements in visual concept learning. It involves a…

    Submitted 27 November, 2023; originally announced November 2023.

  8. arXiv:2311.03335  [pdf, other]

    cs.CV cs.GR

    Cross-Image Attention for Zero-Shot Appearance Transfer

    Authors: Yuval Alaluf, Daniel Garibi, Or Patashnik, Hadar Averbuch-Elor, Daniel Cohen-Or

    Abstract: Recent advancements in text-to-image generative models have demonstrated a remarkable ability to capture a deep semantic understanding of images. In this work, we leverage this semantic knowledge to transfer the visual appearance between objects that share similar semantics but may differ significantly in shape. To achieve this, we build upon the self-attention layers of these generative models an…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Project page: https://garibida.github.io/cross-image-attention

  9. arXiv:2310.17590  [pdf, other]

    cs.CV

    Noise-Free Score Distillation

    Authors: Oren Katzir, Or Patashnik, Daniel Cohen-Or, Dani Lischinski

    Abstract: Score Distillation Sampling (SDS) has emerged as the de facto approach for text-to-content generation in non-image domains. In this paper, we reexamine the SDS process and introduce a straightforward interpretation that demystifies the necessity for large Classifier-Free Guidance (CFG) scales, rooted in the distillation of an undesired noise term. Building upon our interpretation, we propose a nov…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Project page at https://orenkatzir.github.io/nfsd/

  10. arXiv:2303.11306  [pdf, other]

    cs.CV cs.GR cs.LG

    Localizing Object-level Shape Variations with Text-to-Image Diffusion Models

    Authors: Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or

    Abstract: Text-to-image models give rise to workflows which often begin with an exploration step, where users sift through a large collection of generated images. The global nature of the text-to-image generation process prevents users from narrowing their exploration to a particular object in the image. In this paper, we present a technique to generate a collection of images that depicts variations in the…

    Submitted 12 August, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: ICCV 2023. Project page at https://orpatashnik.github.io/local-prompt-mixing/

  11. arXiv:2211.07600  [pdf, other]

    cs.CV cs.GR

    Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures

    Authors: Gal Metzer, Elad Richardson, Or Patashnik, Raja Giryes, Daniel Cohen-Or

    Abstract: Text-guided image generation has progressed rapidly in recent years, inspiring major breakthroughs in text-guided shape generation. Recently, it has been shown that using score distillation, one can successfully text-guide a NeRF model to generate a 3D object. We adapt the score distillation to the publicly available, and computationally efficient, Latent Diffusion Models, which apply the entire d…

    Submitted 14 November, 2022; originally announced November 2022.

  12. arXiv:2208.01618  [pdf, other]

    cs.CV cs.CL cs.GR cs.LG

    An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

    Authors: Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or

    Abstract: Text-to-image models offer unprecedented freedom to guide creation through natural language. Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and novel scenes. In other words, we ask: how can we use language-guided models to turn our cat into a painting, or imagine a new product based on our f…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: Project page: https://textual-inversion.github.io

  13. arXiv:2202.14020  [pdf, other]

    cs.CV cs.GR cs.LG

    State-of-the-Art in the Architecture, Methods and Applications of StyleGAN

    Authors: Amit H. Bermano, Rinon Gal, Yuval Alaluf, Ron Mokady, Yotam Nitzan, Omer Tov, Or Patashnik, Daniel Cohen-Or

    Abstract: Generative Adversarial Networks (GANs) have established themselves as a prevalent approach to image synthesis. Of these, StyleGAN offers a fascinating case study, owing to its remarkable visual quality and an ability to support a large array of downstream tasks. This state-of-the-art report covers the StyleGAN architecture, and the ways it has been employed since its conception, while also analyzi…

    Submitted 28 February, 2022; originally announced February 2022.

  14. arXiv:2202.02713  [pdf, other]

    cs.CV

    FEAT: Face Editing with Attention

    Authors: Xianxu Hou, Linlin Shen, Or Patashnik, Daniel Cohen-Or, Hui Huang

    Abstract: Employing the latent space of pretrained generators has recently been shown to be an effective means for GAN-based face manipulation. The success of this approach heavily relies on the innate disentanglement of the latent space axes of the generator. However, face manipulation often intends to affect local regions only, while common generators do not tend to have the necessary spatial disentanglem…

    Submitted 6 February, 2022; originally announced February 2022.

  15. arXiv:2201.13433  [pdf, other]

    cs.CV

    Third Time's the Charm? Image and Video Editing with StyleGAN3

    Authors: Yuval Alaluf, Or Patashnik, Zongze Wu, Asif Zamir, Eli Shechtman, Dani Lischinski, Daniel Cohen-Or

    Abstract: StyleGAN is arguably one of the most intriguing and well-studied generative models, demonstrating impressive performance in image generation, inversion, and manipulation. In this work, we explore the recent StyleGAN3 architecture, compare it to its predecessor, and investigate its unique advantages, as well as drawbacks. In particular, we demonstrate that while StyleGAN3 can be trained on unaligne…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: Project page available at https://yuval-alaluf.github.io/stylegan3-editing/

  16. arXiv:2108.00946  [pdf, other]

    cs.CV cs.CL cs.GR cs.LG

    StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators

    Authors: Rinon Gal, Or Patashnik, Haggai Maron, Gal Chechik, Daniel Cohen-Or

    Abstract: Can a generative model be trained to produce images from a specific domain, guided by a text prompt only, without seeing any image? In other words: can an image generator be trained "blindly"? Leveraging the semantic power of large scale Contrastive-Language-Image-Pre-training (CLIP) models, we present a text-driven method that allows shifting a generative model to new domains, without having to c…

    Submitted 16 December, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Project page: https://stylegan-nada.github.io/

  17. arXiv:2107.07437  [pdf, other]

    cs.CV

    StyleFusion: A Generative Model for Disentangling Spatial Segments

    Authors: Omer Kafri, Or Patashnik, Yuval Alaluf, Daniel Cohen-Or

    Abstract: We present StyleFusion, a new mapping architecture for StyleGAN, which takes as input a number of latent codes and fuses them into a single style code. Inserting the resulting style code into a pre-trained StyleGAN generator results in a single harmonized image in which each semantic region is controlled by one of the input latent codes. Effectively, StyleFusion yields a disentangled representatio…

    Submitted 15 July, 2021; originally announced July 2021.

    Comments: Code is available at: https://github.com/OmerKafri/StyleFusion

  18. arXiv:2104.02699  [pdf, other]

    cs.CV

    ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement

    Authors: Yuval Alaluf, Or Patashnik, Daniel Cohen-Or

    Abstract: Recently, the power of unconditional image synthesis has significantly advanced through the use of Generative Adversarial Networks (GANs). The task of inverting an image into its corresponding latent code of the trained GAN is of utmost importance as it allows for the manipulation of real images, leveraging the rich semantics learned by the network. Recognizing the limitations of current inversion…

    Submitted 24 August, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: Accepted to ICCV 2021; Project page available at https://yuval-alaluf.github.io/restyle-encoder/

  19. arXiv:2103.17249  [pdf, other]

    cs.CV cs.CL cs.GR cs.LG

    StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery

    Authors: Or Patashnik, Zongze Wu, Eli Shechtman, Daniel Cohen-Or, Dani Lischinski

    Abstract: Inspired by the ability of StyleGAN to generate highly realistic images in a variety of domains, much recent work has focused on understanding how to use the latent spaces of StyleGAN to manipulate generated and real images. However, discovering semantically meaningful latent manipulations typically involves painstaking human examination of the many degrees of freedom, or an annotated collection o…

    Submitted 31 March, 2021; originally announced March 2021.

    Comments: 18 pages, 24 figures, code and video may be found here: https://github.com/orpatashnik/StyleCLIP

  20. arXiv:2102.02766  [pdf, other]

    cs.CV

    Designing an Encoder for StyleGAN Image Manipulation

    Authors: Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or

    Abstract: Recently, there has been a surge of diverse methods for performing image editing by employing pre-trained unconditional generators. Applying these methods on real images, however, remains a challenge, as it necessarily requires the inversion of the images into their latent space. To successfully invert a real image, one needs to find a latent code that reconstructs the input image accurately, and…

    Submitted 4 February, 2021; originally announced February 2021.

  21. arXiv:2102.02754  [pdf, other]

    cs.CV

    Only a Matter of Style: Age Transformation Using a Style-Based Regression Model

    Authors: Yuval Alaluf, Or Patashnik, Daniel Cohen-Or

    Abstract: The task of age transformation illustrates the change of an individual's appearance over time. Accurately modeling this complex transformation over an input facial image is extremely challenging as it requires making convincing, possibly large changes to facial features and head shape, while still preserving the input identity. In this work, we present an image-to-image translation method that lea…

    Submitted 18 May, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: Accepted to SIGGRAPH 2021, project page available at https://yuval-alaluf.github.io/SAM/

  22. arXiv:2010.02036  [pdf, other]

    cs.CV

    BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer

    Authors: Or Patashnik, Dov Danon, Hao Zhang, Daniel Cohen-Or

    Abstract: State-of-the-art image-to-image translation methods tend to struggle in an imbalanced domain setting, where one image domain lacks richness and diversity. We introduce a new unsupervised translation network, BalaGAN, specifically designed to tackle the domain imbalance problem. We leverage the latent modalities of the richer domain to turn the image-to-image translation problem, between two imbala…

    Submitted 5 June, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2021

  23. arXiv:2008.00951  [pdf, other]

    cs.CV

    Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

    Authors: Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or

    Abstract: We present a generic image-to-image translation framework, pixel2style2pixel (pSp). Our pSp framework is based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator, forming the extended W+ latent space. We first show that our encoder can directly embed real images into W+, with no additional optimization. Next, we propose u…

    Submitted 21 April, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: Accepted to CVPR 2021, project page available at https://eladrich.github.io/pixel2style2pixel/

  24. arXiv:math/9511224  [pdf, ps, other]

    math.CO

    Asymptotically optimal covering designs

    Authors: Daniel Gordon, Greg Kuperberg, Oren Patashnik, Joel Spencer

    Abstract: A (v,k,t) covering design, or covering, is a family of k-subsets, called blocks, chosen from a v-set, such that each t-subset is contained in at least one of the blocks. The number of blocks is the covering's size, and the minimum size of such a covering is denoted by C(v,k,t). It is easy to see that a covering must contain at least (v choose t)/(k choose t) blocks, and in 1985 Rödl [European J…

    Submitted 9 November, 1995; originally announced November 1995.

    Report number: Kuperberg migration 5/2002

    Journal ref: J. Combin. Theory Ser. A 75 (1996), no. 2, 270--280

  25. arXiv:math/9502238  [pdf, ps, other]

    math.CO

    New constructions for covering designs

    Authors: Daniel Gordon, Greg Kuperberg, Oren Patashnik

    Abstract: A $(v,k,t)$ {\em covering design}, or {\em covering}, is a family of $k$-subsets, called blocks, chosen from a $v$-set, such that each $t$-subset is contained in at least one of the blocks. The number of blocks is the covering's {\em size}, and the minimum size of such a covering is denoted by $C(v,k,t)$. This paper gives three new methods for constructing good coverings: a greedy algorithm simi…

    Submitted 15 February, 1995; originally announced February 1995.

    Report number: Kuperberg migration 11/2004

    Journal ref: J. Combin. Des. 4 (1995), no. 4, 269-284