
Composer: Diffusion-based Image Synthesis with Composable Conditions


Composer is a conditional diffusion model that allows for greater control over image synthesis. The central idea is to decompose an image into a set of representations, attributes such as its caption, color statistics, sketch, and so on, and train the model to generate images from any combination of them.
The authors also describe Composer as a general framework for a variety of generative tasks.
The decomposition includes the following representations (a rough extraction sketch follows the list):
  • Caption: the text from image-text training pairs
  • Semantics and style: CLIP image embeddings
  • Color: color statistics of the image as a CIELab histogram
  • Sketch: an edge map produced by an edge detection model
  • Instances: instances detected in the image with YOLOv5
  • Depth map: the output of a depth estimation model
  • Intensity: a grayscale version of the image, which helps the model learn color intensity separately from color
  • Masking: random masks, which enable inpainting
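Most of these signals come from off-the-shelf tools, so they are cheap to compute per image. Below is a minimal sketch of extracting a few of them for a single image; note the assumptions: Canny edges stand in for the learned edge detector used in the paper, and the CLIP checkpoint name is just one plausible choice, not necessarily the one the authors used.

```python
# Sketch: extracting a few of Composer's conditioning signals for one image.
# Assumptions: Canny edges approximate the paper's edge-detection model, and
# "openai/clip-vit-large-patch14" is only an illustrative CLIP checkpoint.
import cv2
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

image = Image.open("example.jpg").convert("RGB")
rgb = np.array(image)

# Intensity: a plain grayscale version of the image.
intensity = np.array(image.convert("L"))

# Sketch: an edge map (Canny here, a learned edge detector in the paper).
sketch = cv2.Canny(cv2.cvtColor(rgb, cv2.COLOR_RGB2GRAY), 100, 200)

# Color: coarse color statistics as a CIELab histogram.
lab = cv2.cvtColor(rgb, cv2.COLOR_RGB2LAB)
color_hist = cv2.calcHist([lab], [0, 1, 2], None, [8, 8, 8],
                          [0, 256, 0, 256, 0, 256]).flatten()
color_hist /= color_hist.sum()  # normalize to a distribution

# Semantics and style: a CLIP image embedding.
clip = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
with torch.no_grad():
    clip_embed = clip.get_image_features(**processor(images=image, return_tensors="pt"))

conditions = {
    "intensity": intensity,
    "sketch": sketch,
    "color": color_hist,
    "clip_embedding": clip_embed,
}
```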
The model is capable of a wide variety of tasks (a hypothetical compositional-sampling sketch follows the list):
  • Creating variations of an image by varying a particular representation
  • Interpolating between two images to create a blend
  • Directly reconfiguring a single aspect of an image
  • Masking out a region to restrict where the model can edit the image
  • Colorizing an image based on a color palette
  • Style transfer!
  • Pose transfer
  • Virtual try-on 😂: masking out a person's clothing in one image and replacing it with a garment from another image
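There is no standard API for this kind of compositional sampling, but conceptually each of these tasks boils down to mixing and matching condition sets from different source images. The sketch below is purely hypothetical: `ComposerPipeline`-style objects, `extract_conditions`, and `generate` are invented names used for illustration, not a real interface.

```python
# Hypothetical interface for composable conditioning. The pipeline object,
# extract_conditions, and generate are invented names for illustration only.

def style_transfer(pipeline, content_img, style_img):
    """Keep structure from one image, take semantics/style from another."""
    content = pipeline.extract_conditions(content_img)  # e.g. sketch, depth map
    style = pipeline.extract_conditions(style_img)      # e.g. CLIP embedding, color
    return pipeline.generate(conditions={
        "sketch": content["sketch"],
        "depth": content["depth"],
        "clip_embedding": style["clip_embedding"],
        "color": style["color"],
    })

def image_variation(pipeline, img, keep=("sketch", "color")):
    """Resample an image while holding only a subset of its conditions fixed."""
    conds = pipeline.extract_conditions(img)
    return pipeline.generate(conditions={k: conds[k] for k in keep})
```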
Image synthesis and generative AI are extremely popular right now. I remember when StyleGAN{1, 2, 3} came out! The images were unreal. Now, a few years down the line, with the rise of diffusion models, we have models that not only generate images but can be controlled to this degree!
It might not be long until these models get an upgrade and start generating videos (like Imagen Video, but with the degree of user control that Composer has, if not more!).

