Stable Diffusion and the Samplers Mystery
This report explores Stability AI's Stable Diffusion model, focusing on the different sampler methods available for image generation and comparing their results.
With all the excitement surrounding the release of Stable Diffusion 2, we at Weights & Biases decided to join the fun and experiment with the model.
In this article, I'll use DreamStudio to run Stable Diffusion and dive into how to use different samplers.

Prompt: Detective man with a magnifying glass animated
Table Of Contents
- What is Stable Diffusion?
- Stable Diffusion DreamStudio
- What are samplers, and how to use them?
- Samplers comparison
- CLIP Guidance Comparison
- Conclusion
- Related Reading
What is Stable Diffusion?
Stable Diffusion is a text-to-image model created by Stability AI that has quickly grown in popularity.
To generate an image, you need to start with an idea of what you want to see (the 'dream') and develop a prompt describing the image. In most cases, the more detail you provide, the higher the chance the image will look like what you expected; for example, instead of a grey cat, you could use a grey cat sitting on a chair in the kitchen, animated.
Before we get into the creation and customization of our images, let's go through how to use DreamStudio to accomplish this task. Follow along with the DreamStudio instructions, or use our Colab to play around, log your images straight to Weights & Biases, and generate a table like this!
Stable Diffusion DreamStudio
DreamStudio is an online tool that allows users to run Stable Diffusion with an easy-to-use interface.

The prompt field is capped at 499 characters, giving plenty of space to let your imagination go; however, the model is pretty good at creating images from much shorter prompts.
Improving and experimenting with the prompt is one way of influencing the model. Another option is to adjust the parameters on the right-hand side.
Here is the list of available parameters as of October 20, 2022 (a code sketch mapping them to open-source tooling follows the list):
- Width - the width of the generated image.
- Height - the height of the generated image.
- Cfg Scale - adjusts how strongly the generated image adheres to your prompt; higher values keep the image closer to your prompt.
- Steps - how many steps to spend generating (diffusing) your image.
- Number of Images - to generate multiple images from one prompt.
- Sampler - the diffusion sampling method.
- Model - currently, there are two models available, v1.4 and v1.5. v1.5 is the default choice.
- Seed - The seed used to generate your image.
- Initial image - you can provide the initial image for Stable Diffusion to use.
- CLIP Guidance - CLIP-guided diffusion is supposed to improve your results; however, it can only be used with one sampler, k_dpm_2_ancestral.
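If you prefer to script generations rather than use the DreamStudio UI, most of these parameters have direct counterparts in open-source tooling. Below is a minimal sketch using the Hugging Face diffusers library (not DreamStudio's own API); the checkpoint ID and the mapping of Cfg Scale to guidance_scale and Steps to num_inference_steps are my assumptions about rough equivalents, not an official correspondence.

```python
# Rough programmatic equivalents of the DreamStudio parameters above,
# sketched with the Hugging Face diffusers library (assumed installed:
# pip install diffusers transformers torch). This is not the DreamStudio API.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # "Model": SD v1.5 (assumed checkpoint)
    torch_dtype=torch.float16,
).to("cuda")

generator = torch.Generator("cuda").manual_seed(120)  # "Seed"

image = pipe(
    prompt="a fluffy grey cat sitting on a chair in the kitchen, animated",
    width=512,                 # "Width"
    height=512,                # "Height"
    guidance_scale=12,         # roughly "Cfg Scale"
    num_inference_steps=50,    # "Steps"
    num_images_per_prompt=1,   # "Number of Images"
    generator=generator,
).images[0]

image.save("cat.png")
```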
My initial thought was that the higher the settings, the more impressive the results. But the art of Stable Diffusion settings is a bit more complicated. Moreover, the higher the settings, the more credits each image costs. Machine learning engineers will certainly recognize this conundrum: improvements come with costs, and bigger improvements, as one would expect, come with bigger costs.
As a new user, you will automatically be given 200 credits for image generation.
The default settings for image generation cost one credit per image. You can go as low as 0.2 credits/image or as high as 28 credits/image.
Here are example images created from the prompt a fluffy grey cat sitting on the ground, the grand canyon in the background, costing 0.2 credits/image and 11 credits/image respectively.

0.2 credits: Width 512; Height 512; Cfg Scale 12; Steps 10; Sampler k_lms; Model v1.5; Seed 120

11 credits: Width 768; Height 768; Cfg scale 12; Steps 120; Sampler k_lms; Model v1.5; Seed random
Visibly, the first picture is a bit less impressive (the cat is not even entirely grey). In this case, it is easy to see which image was generated using more credits; however, as multiple guides point out, not all samplers need high settings to produce good outputs. A popular strategy is to start with lower settings while experimenting to find a good prompt, then increase them once you have a better idea of what you are trying to produce.
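In code, that strategy amounts to fixing the seed while you iterate on the prompt at a low step count, then rerunning with the same seed at higher settings once the composition looks right. A small sketch, reusing the hypothetical `pipe` object from the earlier snippet:

```python
# Iterate cheaply, then re-render the keeper at higher quality.
# Assumes the `pipe` pipeline object from the sketch above.
import torch

prompt = "a fluffy grey cat sitting on the ground, the grand canyon in the background"
seed = 120

# Cheap draft pass: few steps, just to judge prompt and composition.
draft = pipe(
    prompt,
    num_inference_steps=10,
    guidance_scale=12,
    generator=torch.Generator("cuda").manual_seed(seed),
).images[0]

# Once the prompt looks right, re-render with the same seed and more steps.
final = pipe(
    prompt,
    num_inference_steps=120,
    guidance_scale=12,
    generator=torch.Generator("cuda").manual_seed(seed),
).images[0]
```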
If you want to play with the generator without creating an account (or losing your credits), you can use a Google Colab attached to this article by Morgan McGuire.
Alternatively, if you'd like to take the same route as me, here are some videos worth watching to get started with DreamStudio.
Now, if all you're looking for is to create some fun images, you can drop off here. But, if you're interested in understanding what's going on under the hood and how to improve on it, let's dig into the fun stuff.
What are samplers, and how to use them?
Stable Diffusion has only been publicly available for a short time. With continued updates to the models and available options, discussion around all the features is still very active. Samplers are not a popular focus for most users, so relatively little information about them is available. Here is an overview of what is currently out there about samplers in Stable Diffusion.
The most in-depth review of samplers so far was offered by @iScienceLuvr in the following Twitter thread:
The thread also offers a visual comparison of samplers vs. the number of steps (you can see more examples in the section below), and it mainly focuses on what the samplers are and where they come from. While digging around in Twitter and Discord threads, you can find mentions of a theory that samplers might have 'specialties'; for example, some samplers could be better at animated generations and others at realistic or face images. However, this has not been proven or fully explained yet.
For a less technical explanation of what samplers are and how to use them effectively, I recommend this Beginner's Guide created by u/pxan on Reddit, which includes an essential samplers guide among many other valuable tips.
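If you are running Stable Diffusion yourself rather than through DreamStudio, the sampler is something you choose explicitly. In the Hugging Face diffusers library the equivalent concept is the scheduler, and swapping it is a one-line change. The sketch below assumes the `pipe` object from the earlier snippet; the named scheduler classes roughly correspond to DreamStudio's k_euler_ancestral and k_lms options, not an exact one-to-one mapping.

```python
# Swapping the sampler ("scheduler" in diffusers terms) on an existing pipeline.
# Assumes the `pipe` pipeline from the earlier sketch; the mapping of scheduler
# classes to DreamStudio sampler names is approximate.
from diffusers import EulerAncestralDiscreteScheduler, LMSDiscreteScheduler

# Roughly DreamStudio's k_euler_ancestral
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
img_euler_a = pipe("astronaut walking through a portal to another world",
                   num_inference_steps=30).images[0]

# Roughly DreamStudio's k_lms
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
img_lms = pipe("astronaut walking through a portal to another world",
               num_inference_steps=30).images[0]
```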
Samplers comparison
Since Stable Diffusion was released, artists and non-artists alike have been testing the different options and how they affect images generated from the same prompt. We can see some impressive comparisons focusing on samplers, most of them posted publicly on Discord servers and Reddit. Handi#2783 published two comparisons on the official Stable Diffusion Discord channel.
The first one is a straightforward sampler comparison. The second also shows how the number of steps affects different samplers, so we can see how samplers behave when paired with other parameter changes. The last example included in this report is a comparison of samplers and CFG scale for a few different prompts, posted by u/CaptainH3RB in this Reddit thread.
Below you can see my comparison of images generated with prompts astronaut walking through a portal to another world and green apple in a wicker basket logged to the Weights & Biases Tables.
To create your own comparison across different parameters, head to this Colab, play around with the config, and start logging your images to W&B Tables!
💡 Settings used for image generation: Width and Height 512x512, Cfg Scale 15, Steps 65, Seed 250 (apple) / 314160 (astronaut), Model v1.5
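For reference, here is roughly what the logging side of that Colab looks like: a loop over samplers that logs each generated image to a wandb.Table. This is a sketch rather than the exact Colab code; the scheduler mapping and the `pipe` object are assumptions carried over from the earlier snippets.

```python
# Sketch of logging a sampler comparison to a W&B Table (not the exact Colab code).
# Assumes the diffusers `pipe` pipeline from the earlier snippets.
import torch
import wandb
from diffusers import (EulerAncestralDiscreteScheduler,
                       EulerDiscreteScheduler, LMSDiscreteScheduler)

samplers = {
    "k_euler": EulerDiscreteScheduler,
    "k_euler_ancestral": EulerAncestralDiscreteScheduler,
    "k_lms": LMSDiscreteScheduler,
}
prompt = "green apple in a wicker basket"

run = wandb.init(project="stable-diffusion-samplers")
table = wandb.Table(columns=["prompt", "sampler", "steps", "seed", "image"])

for name, scheduler_cls in samplers.items():
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    image = pipe(
        prompt,
        num_inference_steps=65,
        guidance_scale=15,
        generator=torch.Generator("cuda").manual_seed(250),
    ).images[0]
    table.add_data(prompt, name, 65, 250, wandb.Image(image))

run.log({"sampler_comparison": table})
run.finish()
```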
CLIP Guidance Comparison
On October 15th, 2022, Stability AI announced a new update to DreamStudio on their official Discord channel. To enhance the quality and coherency of generated images, users now have another toggle available, called CLIP Guidance.
After turning the toggle on, we can still play with the rest of the options except for the sampler, which defaults to k_dpm_2_ancestral. Another difference is the minimum number of steps we can select: normally it is 10, but with the CLIP Guidance toggle on, the minimum is 35 steps.
Many users reported great improvements in their generated images, so I decided to check whether I could see a difference between images generated for three different prompts.
Prompt: Picasso cat
Settings: Width and Height 512x512, Cfg Scale 15, Steps 65, Seed 2481483669, Model v1.5
Prompt: old rusted washing machine
Prompt: dancing skeleton in the rain
Settings: Width and Height 512x512, Cfg Scale 9, Steps 40, Seed 985324, Model v1.5
Conclusion
I very much enjoyed exploring DreamStudio and learning other users' strategies for producing good outputs. The addition of CLIP-guided generation improved my results compared to images generated with the same parameters but without CLIP Guidance.
That said, this is not a universal experience, as some users reported little improvement. Next, I would like to focus on another exciting feature Stability AI introduced: image editing, which lets users upload an image, erase part of it or expand the canvas, and use DreamStudio to generate alternative fills.
Since the release of DreamStudio, users have been generating millions of images and analyzing different prompts and options. Thanks to their efforts, we now have a better understanding of the various options, along with simple guides to them.
Currently, this knowledge is scattered across Reddit, Discord, and other corners of the internet. While users work hard to decipher DreamStudio's wonders, the Stability AI team keeps producing more and more impressive features and models; seeing the fast-paced improvements, we can only dream of fully exploring their possibilities.
Related Reading
Emad Mostaque — Stable Diffusion, Stability AI, and What’s Next
Emad shares the story and mission behind Stability AI, a startup and network of decentralized developer communities building open AI tools.
Making My Kid a Jedi Master With Stable Diffusion and Dreambooth
In this article, we'll explore how to teach and fine-tune Stable Diffusion to transform my son into his favorite Star Wars character using Dreambooth.
Improving Generative Images with Instructions: Prompt-to-Prompt Image Editing with Cross Attention Control
A primer on text-driven image editing for large-scale text-based image synthesis models like Stable Diffusion & Imagen