
SDEdit: Image Synthesis and Editing with Stochastic Differential Equations

SDEdit is an image synthesis and editing framework based on stochastic differential equations (SDEs). SDEdit allows stroke-based image synthesis, stroke-based image editing, and image compositing without task-specific optimization.
Created on August 13|Last edited on October 25


1. Introduction

Before we dig into this topic, I suggest that readers check out my previous report on Score Based Generative Models. This report assumes a basic understanding of score-based and denoising generative modeling techniques; please refer to that report for the underlying theory.

Feel free to have a look at the original paper by the Ermon group at Stanford.
For this report, we will focus only on the stroke-based image editing task, as it is relevant to our image editing series. Please also check out my previous report on image editing using PTI.


2. Working of Stroke Based Image Editing (🤓 to 😎)

A sketch of the transitions observed during stochastic differential editing.
Working of stroke-based image editing. Animation by Sayantan Das
From score-based generative modeling theory, we know that the (Stein) score function $\nabla_x \log p_t(x(t))$ is approximated by a time-dependent score network $s_\theta(x, t)$, where $p_t$ is the marginal density of $x(t)$ under this stochastic process and $x$ is the image input.
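To make the role of the score network concrete, here is how it enters the sampling dynamics. This is a sketch in the notation of the general score-based SDE framework; the drift $f$, diffusion coefficient $g$, and the Wiener processes $w$ and $\bar{w}$ are symbols assumed here for illustration and are not defined elsewhere in this report.

```latex
\begin{aligned}
\text{forward (noising) SDE:}\quad
  & dx = f(x, t)\,dt + g(t)\,dw \\
\text{reverse-time SDE:}\quad
  & dx = \left[ f(x, t) - g(t)^2\, \nabla_x \log p_t(x) \right] dt + g(t)\,d\bar{w} \\
\text{with the learned score:}\quad
  & dx \approx \left[ f(x, t) - g(t)^2\, s_\theta(x, t) \right] dt + g(t)\,d\bar{w}
\end{aligned}
```

The only unknown in the reverse-time SDE is the score $\nabla_x \log p_t(x)$, which is why replacing it with $s_\theta(x, t)$ is enough to turn noise (or a noised stroke image) back into an image.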

3. Why is a score-based approach essential for this task?

  • Unlike GAN inversion, SDEdit does not solve an optimization problem, so it does not depend on a task-specific loss function.
  • It is simply a better way to do the Manifold Jump (more about this in the Conclusion).
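The first point above can be illustrated with a toy sketch: the edit is one noising step followed by a fixed reverse-SDE sampling loop, with no loss function or optimizer anywhere. Everything here is an assumption made for illustration: the data are a 2-D Gaussian so that the score is known in closed form (`toy_score` stands in for a trained $s_\theta$), the noise schedule `sigma` is a geometric variance-exploding schedule, and `sdedit`, `guide`, and `t0` are hypothetical names, not the authors' code.

```python
import numpy as np

def sigma(t, sigma_min=0.01, sigma_max=10.0):
    """Geometric variance-exploding noise schedule for t in [0, 1] (assumed)."""
    return sigma_min * (sigma_max / sigma_min) ** t

def toy_score(x, sig, mu, data_std):
    """Exact score of N(mu, data_std^2 I) blurred by N(0, sig^2 I).
    Stands in for a trained score network s_theta(x, t)."""
    return -(x - mu) / (data_std**2 + sig**2)

def sdedit(guide, t0, score_fn, n_steps=200, rng=None):
    """Noise the guide up to time t0, then integrate the reverse SDE to t = 0."""
    rng = np.random.default_rng() if rng is None else rng
    sigmas = sigma(np.linspace(t0, 0.0, n_steps + 1))
    # Step 1: perturb the guide (e.g. a stroke painting) with Gaussian noise.
    x = guide + sigmas[0] * rng.standard_normal(guide.shape)
    # Step 2: Euler-Maruyama steps of the reverse-time VE SDE.
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        dv = s_cur**2 - s_next**2  # decrease in noise variance for this step
        noise = rng.standard_normal(x.shape)
        x = x + dv * score_fn(x, s_cur) + np.sqrt(dv) * noise
    return x

# Toy "data manifold": a Gaussian blob at the origin.
mu, data_std = np.zeros(2), 0.5
score_fn = lambda x, sig: toy_score(x, sig, mu, data_std)

# 512 copies of an off-manifold guide point, standing in for a stroke image.
guide = np.tile([3.0, 3.0], (512, 1))
rng = np.random.default_rng(0)

faithful = sdedit(guide, t0=0.3, score_fn=score_fn, rng=rng)   # little noise
realistic = sdedit(guide, t0=0.9, score_fn=score_fn, rng=rng)  # lots of noise

dist_faithful = np.linalg.norm(faithful - mu, axis=1).mean()
dist_realistic = np.linalg.norm(realistic - mu, axis=1).mean()
print(f"mean distance to data mean, t0=0.3: {dist_faithful:.2f}")
print(f"mean distance to data mean, t0=0.9: {dist_realistic:.2f}")
```

In this sketch the starting time `t0` is the only knob: a small `t0` keeps the output close to the guide, while a large `t0` pulls it toward the data distribution. With a real model, `toy_score` would be replaced by the trained network and `guide` by the stroke-edited image.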

4. Experiments

Initial Image and a predesigned mask image applied as a stroke action over the image.
Demonstration of stroke-based image editing on CelebA-HQ. The image of Professor Stefano Ermon is taken from the original authors' code.

👇 Panel demonstrating score-based image synthesis on the LSUN Bedroom and Church datasets.



Extracted from the original arXiv preprint: a performance comparison for the editing task across several baselines, including GAN-based methods, the most popular choice in the community. In this comparison, SDEdit shows better editing quality and feature consistency. More about the algorithm is in the paper and its appendix.

5. Conclusion

Manifold Jump? 👣

The purpose of putting this in the conclusion section is to create a summary of the current directions in image generation research.


The future of Score Based Generative Modeling is promising:

Extracted from "Image Super-Resolution via Iterative Refinement":https://iterative-refinement.github.io/
Diffusion Models producing StyleGAN2 levels of image generation:
Courtesy: "Cascaded Diffusion Models for High Fidelity Image Generation":https://cascaded-diffusion.github.io/
Thank you for reading this report. Please post your questions in the comment section below, or DM me on Twitter: https://twitter.com/sayantandas_