
Is the New M2 Pro Mac Mini a Deep Learning Workstation?

In this article, we explore whether the recent addition of the M2 Pro chipset to the Apple Mac Mini family works as a replacement for your power-hungry workstation.
Created on February 23 | Last edited on July 18
Apple recently added the M2 Pro chipset to the Mac Mini family. But can it replace your power-hungry workstation? In this article, we'll find out.

Unboxing

As usual for Apple, the Mac Mini comes nicely packaged in a cardboard box that protects the Apple box. Inside, you'll find only a power cable, nothing more. This is a BYODKM machine (bring your own display, keyboard, and mouse 🤣).

As I already have a setup with a Thunderbolt dock, it was as simple as plugging the Mac Mini in, and I was good to go. It would be fantastic if power could also come via the dock (🤓), but nothing's perfect, I suppose.

Setup

After setting up the usual Apple stuff (like the Apple ID, username, and password, and waiting almost 30 minutes for the OS update), I was ready to install the libraries to test this baby.

I would typically install more things on a new machine, but as I will return this one, I won't bother to install all my configurations and tools.
Next, you'll need to install the developer utilities from Apple. To do so, open a terminal and try to call git. You will be prompted to install developer tools.
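In practice, that looks like this (the first command triggers the installer prompt on a fresh machine; the second invokes the same installer explicitly):

```shell
# Any call to a developer tool triggers the install prompt on a fresh Mac:
$ git --version

# Or launch the Command Line Tools installer explicitly:
$ xcode-select --install
```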



Python Setup

The easiest way for me to grab Python and an environment manager is Anaconda; I like the minimal distributions available from Miniforge. You can grab it directly from the website or by running this in a terminal:
$ curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
$ sh Miniforge3-MacOSX-arm64.sh
Follow the on-screen instructions and, when prompted to initialise the terminal, say yes.

DL Benchmarks on the M2Pro Mac Mini

We ran two training scripts:
  • Train a ResNet50 on 128x128 images for one epoch.
  • Train BERT for one epoch.
We have both TensorFlow and PyTorch implementations that are somewhat equivalent.
We initially ran deep learning benchmarks when the M1 and M1 Pro were released; the updated graphs with the M2 Pro chipset are here. As we made extensive comparisons with the Nvidia GPU stack back then, here we will limit the comparisons to the original M1 Pro. You can find the code for the benchmarks here.


TensorFlow

There was an issue with the latest tensorflow-metal and Adam optimiser compatibility; the solution was to fall back to tensorflow.keras.optimizers.legacy.Adam.
💡
TensorFlow tends to run faster than PyTorch, with less lag between epochs. You can install TensorFlow by running:
$ conda create --name=tf "python<3.11"

# activate the environment
$ conda activate tf
$ conda install -c apple tensorflow-deps
$ pip install tensorflow-macos tensorflow-metal

# install benchmark dependencies
$ pip install wandb transformers datasets tqdm scikit-learn
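The legacy-Adam workaround mentioned above looks like this in practice. This is a minimal sketch: the one-layer model is a throwaway placeholder, and since newer TF/Keras versions may have removed the legacy namespace, it falls back to the default Adam:

```python
import tensorflow as tf

# tensorflow-metal (at the time of writing) was incompatible with the new
# default Adam; the legacy implementation works on the Metal plugin.
try:
    optimizer = tf.keras.optimizers.legacy.Adam(learning_rate=1e-3)
except AttributeError:
    # Newer Keras versions may have dropped the legacy namespace
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

# Throwaway model just to show the optimizer being wired in
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=optimizer, loss="mse")
```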

All told, we're seeing a roughly 15% improvement over the M1 Pro. So far, so good.

PyTorch

Since PyTorch 1.12, the Metal (MPS) backend has been available in the stable release. To install PyTorch, you can do:
$ conda create --name="pt" "python<3.11"

# activate the environment
$ conda activate pt
$ conda install pytorch torchvision -c pytorch

# install dependencies of this training script 😎
$ pip install wandb tqdm transformers datasets
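Once installed, a quick sanity check confirms the Metal backend is usable before launching a training run (assumes PyTorch >= 1.12, installed as above):

```python
import torch

# Select the Metal (MPS) device when present, else fall back to CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# A tiny computation verifies tensors actually run on the chosen device
x = torch.ones(2, 2, device=device)
print((x + x).sum().item())  # 8.0 on either device
```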

On PyTorch, we see a more marked performance improvement, somewhere in the neighborhood of 18% over the M1 Pro.

Conclusion

The new Mac Mini equipped with the M2 Pro processor is a silent little powerhouse. It's a fantastic Python data science workstation, equipped with a fast SSD and a great operating system, but... it's not fast enough to be used as a training rig. You can prototype your next PyTorch/TensorFlow model, but you are not training the next LLM or diffusion model on this hardware.
We are still miles away from desktop NVIDIA GPUs, and the same analysis from the M1 Pro holds today. It's nice to see Apple improving GPU performance over the previous generation, but we will probably have to wait before replacing our NVIDIA GPUs.
Don't get me wrong: the performance per watt is good, but we are still far behind what you get on any current Nvidia desktop GPU. Check this report to see how they compare against NVIDIA.
I would have loved it if Apple had priced this machine more aggressively, at around $1,000 (instead of $1,299). Still, this is an improvement in performance over the M1, so if you're in the market for a workstation, definitely prioritize the newer models. They are indeed better than their predecessors.
Comments

Samyak: Hi, is this Mac Mini compared with an Nvidia A100? I'm looking to compare a basic 8 GB GPU with the Mac Mini.

Awsaf: Nice work. What is the GPU memory for the M2 Pro? Online it shows it has 96 GB of unified memory; does that mean GPU memory?

Theo Adrai: I think the author should change the way results are reported (this would better align with the article's conclusion, by the way). Right now, it's quite misleading: the A100 card has <1% utilization, likely because the benchmark evaluates performance on an 8-year-old task (i.e. training ResNet50 for classification - LOL). And on a more relevant task like BERT training, the performance of the A100 is not reported (likely because it completely annihilates the M2 chip).

Auto Meta: my guy, that is not a desktop Nvidia GPU you ran that against; an A100 is 15 grand. That's a $500 workstation keeping up with it.

Maynard Handley: What, in this context, does "GPU Utilization %" mean? What specifically is it measuring, and how is the measurement achieved? For example, if it's at 60%, is that telling us that the other 40% of the time we are waiting on the CPU? Or that we are waiting on DRAM? Or that only 60% of the FP capacity of the GPU is being used? Or...?