
Running Stable Diffusion on an Apple M1 Mac With HuggingFace Diffusers

In this article, we look at running Stable Diffusion on an M1 Mac with HuggingFace diffusers, highlighting the advantages — and the things to watch out for.
Created on August 31|Last edited on July 3
The HuggingFace diffusers library provides a low-effort entry point to generating your own images, and it now works on M1 Macs as well as GPUs!
In this article, we explore the advantages of running Stable Diffusion on an M1 Mac with HuggingFace diffusers and show how you can take full advantage.
Let's get started.

Stable Diffusion Text to Image on a Mac M1

The steps below worked for me on a 2020 Mac M1 with 16GB of memory and 8 cores. Each inference step takes about 4.2s on my machine, so a single 512x512 image with 50 steps takes roughly 3.5 minutes to generate. A brief example:
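That 3.5-minute figure is just the per-step latency multiplied by the number of inference steps. A minimal sketch of the arithmetic (the 4.2s figure is from my machine and will vary with your hardware):

```python
def estimated_minutes(num_inference_steps: int, seconds_per_step: float = 4.2) -> float:
    """Rough wall-clock estimate for generating one image, in minutes."""
    return num_inference_steps * seconds_per_step / 60

# 50 steps at ~4.2 s/step is about 3.5 minutes per 512x512 image.
print(estimated_minutes(50))  # 3.5
```

Halving `num_inference_steps` halves generation time, which is handy when iterating on prompts.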

[Interactive table of generations: columns for prompt, image_0 through image_7, guidance_scale, and num_inference_steps.]
Some Stable Diffusion generations created using the colab here

My System and Python Versions

Relevant versions & systems I'm using here:
  • macOS 12.4
  • Python 3.8.8
  • torch 1.13.0.dev20220830
  • diffusers 0.2.4
  • transformers 4.21.2

Setup Steps

1. macOS Version

Make sure your Mac is running macOS 12.3 or above. If not, I recommend running a system update.
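If you'd rather check this from Python than from System Preferences, the standard-library `platform.mac_ver()` call returns the macOS version string (and an empty string on other operating systems). A small sketch:

```python
import platform

def mac_supports_mps(min_version=(12, 3)):
    """Return True/False on macOS, or None when not running on a Mac."""
    release = platform.mac_ver()[0]  # e.g. "12.4"; "" on non-macOS
    if not release:
        return None
    parts = [int(p) for p in release.split(".")[:2]]
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) >= min_version

print(mac_supports_mps())
```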

2. Install PyTorch nightly

As of today (August 31st 2022), the latest PyTorch nightly release addresses the aten::index.Tensor error when running this on a Mac M1:
NotImplementedError: The operator 'aten::index.Tensor' is not currently implemented for the MPS device
Installing PyTorch nightly using conda:
conda install pytorch torchvision torchaudio -c pytorch-nightly
Installing PyTorch nightly using pip:
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
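After installing the nightly build, it's worth confirming that PyTorch can actually see the MPS backend before going further. A quick check (guarded so it also reports cleanly on machines where torch isn't installed):

```python
def mps_status() -> str:
    """Report whether the PyTorch MPS backend is usable on this machine."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    mps = getattr(torch.backends, "mps", None)
    if mps is None:
        return "this torch build predates MPS support"
    if not mps.is_available():
        return "mps not available (need macOS 12.3+ on Apple silicon)"
    return "mps available"

print(mps_status())
```

If you see "mps available", the rest of the steps should go smoothly.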

3. Patch BasicTransformerBlock From Diffusers Using Fastcore

Another fix has to be made to the attention calculation to support M1 Macs. For this, we will use the patch function from the wonderful fastcore library (it really is amazingly powerful; I highly recommend checking it out).
Patching BasicTransformerBlock will fix this error that you would otherwise get:
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead
Nobody wants that.
from diffusers.models.attention import BasicTransformerBlock
from fastcore.basics import patch

@patch
def forward(self: BasicTransformerBlock, x, context=None):
    # x = self.attn1(self.norm1(x)) + x
    x = self.attn1(self.norm1(x.contiguous())) + x  # <--- added x.contiguous()
    x = self.attn2(self.norm2(x), context=context) + x
    x = self.ff(self.norm3(x)) + x
    return x
Thanks to @fragmede on GitHub, who identified the x.contiguous fix here.
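If you're curious what @patch is doing: it reads the type annotation on self and assigns the decorated function onto that class, so every instance, including ones constructed before the patch, picks up the replacement method. A toy pure-Python sketch of the same idea, without fastcore:

```python
class Greeter:
    def greet(self):
        return "hello"

g = Greeter()  # instance created before the patch

# Equivalent of @patch: define a replacement and assign it onto the class.
def greet(self):
    return "hello, patched"

Greeter.greet = greet

print(g.greet())  # existing instances see the new method
```

This is why patching BasicTransformerBlock before building the pipeline is enough: the pipeline's already-instantiated blocks use the patched forward.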

4. StableDiffusionPipeline Setup

When setting up your StableDiffusionPipeline there are a few things to watch for:
  • do not use revision="fp16"
  • do not use torch_dtype=torch.float16
  • set pipeline device to "mps"
from diffusers import StableDiffusionPipeline

DEVICE = "mps"

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token=True,
).to(DEVICE)

5. Run Inference!

Hopefully, you are now all set to generate images on your M1 Mac!
image = pipe(prompt="a happy dog working on a macbook air, synthwave")["sample"][0]
image.save("diffusion.png")  # save the generated PIL image to disk

Other Notes

This is what works for me. However, these codebases are changing so fast that there is a chance that new bugs will appear or that some of these fixes will become redundant!
Also, note that I didn't have to set the PYTORCH_ENABLE_MPS_FALLBACK environment variable to make this work on my M1 Mac.

Comments

Sebastian Topalian: Hey Morgan, thank you for the article! I had success after struggling for some time: I created a new environment with Python 3.8.8, diffusers 0.2.4, and transformers 4.21.2, then installed PyTorch nightly as described and followed the remaining steps. Before, I was running torch 1.12.1, diffusers 0.4.2, transformers 4.23.1, and Python 3.10.4. Thanks!

Dennis Faucher: You should add one more line at the end: image.save("diffusion.png")

Dennis Faucher: Having an issue with device='mps'. The StableDiffusionPipeline command gives "RuntimeError: Expected one of cpu, cuda, xpu, mkldnn, opengl, opencl, ideep, hip, ve, ort, mlc, xla, lazy, vulkan, meta, hpu device type at start of device string: mps". Fixed it with:
$ pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh  (install Rust)
$ pip install transformers
$ pip install diffusers