There are two kinds of people building deep learning models: those who do everything in Docker, and those who should be doing everything in Docker.
Here at Weights & Biases, we are working towards zero-overhead reproducibility by making it easy to use Docker.
Run this simple command from inside your ML project:
wandb docker
This sets up a machine learning Docker image with standard packages installed, mounts your code, and drops you into a shell inside the container. You can train your models exactly as you did before, with the added benefits of Docker. During training, wandb saves the digest, a permanent record of the state of your Docker image, so you can always recover the exact environment your code ran in.
At any time in the future, you can run the following to return to the exact state your code and Docker image were in during your training run:
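For example, a typical session might look like this (the project path and script name are illustrative):
cd ~/projects/my-model   # your ML project
wandb docker             # build the image, mount your code, and open a shell inside the container
python train.py          # train as usual; wandb records the image digest for this run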
wandb restore <username>/<project>:<run_id>
Weights & Biases pre-fills your bash history with the original command.
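For example, with a hypothetical username, project, and run ID:
wandb restore jdoe/mnist-classifier:2wv3nqkb   # substitute your own run's values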
If you already have a Docker image or want to use one of the popular pre-built images, pass the image name. For example:
wandb docker floydhub/dl-docker:cpu
This loads FloydHub's deep learning Docker image for CPUs.
To analyze results or launch runs with Jupyter, run the following:
wandb docker --jupyter
This installs Jupyter and starts JupyterLab on port 8888.
We automatically track the digest so that the environment can be replicated in the future. You can also pass the digest manually by setting the WANDB_DOCKER environment variable. For workflows that launch Docker images manually, we provide a helper to look it up:
wandb docker image_name --digest
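For instance, assuming the helper prints the digest to stdout, a manual launch could capture and forward it like this (the image and script names are illustrative):
export WANDB_DOCKER=$(wandb docker my-image:latest --digest)   # look up the digest
docker run -e WANDB_DOCKER -e WANDB_API_KEY my-image:latest python train.py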
For more advanced users who already have Docker in their workflows, we provide a new command:
wandb-docker-run
Much like nvidia-docker, this command is a thin wrapper that injects the WANDB_DOCKER and WANDB_API_KEY environment variables into your existing docker run calls. For users running their payloads in Kubernetes, our latest client populates the digest automatically when the Kubernetes control plane API is exposed to the pod.
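Because it is a drop-in wrapper, adopting it is just a matter of swapping out docker run in an existing call (the flags and names below are illustrative):
# before: docker run --runtime=nvidia -v $(pwd):/app my-image:latest python train.py
wandb-docker-run --runtime=nvidia -v $(pwd):/app my-image:latest python train.py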
You can learn more about our Docker support in our documentation.