
Find the best parameters to fine-tune an LLM using Sweeps on Launch

Created on August 17|Last edited on August 17

Set up CUDA, Docker and NVIDIA Container Toolkit

sudo apt-get update
sudo apt-get upgrade
# sudo apt install nvidia-driver

If you already have CUDA, Docker and nvidia-container-toolkit installed on your machine then you can skip these steps.
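To decide which of the steps below you can skip, a quick check along these lines may help. It is only a sketch: it assumes each component exposes its usual command-line tool (nvidia-smi for the driver, nvcc for CUDA, docker, and nvidia-ctk for the container toolkit), and the helper name check_tool is made up for this example. Note that nvcc can be installed but still reported as missing if its directory is not on your PATH.

```shell
# Report whether a command is reachable on PATH.
# check_tool is a hypothetical helper, not part of any of these packages.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Tool names are assumptions about how each prerequisite is normally exposed.
for tool in nvidia-smi nvcc docker nvidia-ctk; do
  check_tool "$tool"
done
```

Anything reported as missing corresponds to one of the install sections that follow.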

How to install NVIDIA CUDA on a Debian Linux virtual machine

The commands below install CUDA 11.8.0. They follow the instructions on NVIDIA's CUDA download page.
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run

sudo sh cuda_11.8.0_520.61.05_linux.run
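The runfile installer does not change your shell environment, so nvcc may not be found right after the install. A minimal sketch of the follow-up step, assuming the toolkit was installed to the default prefix /usr/local/cuda-11.8 (set CUDA_HOME yourself if you chose a different location):

```shell
# Assumption: default install prefix of the CUDA 11.8 runfile installer.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda-11.8}"

# Put the CUDA binaries and libraries in front of the existing search paths.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Print the compiler version if nvcc is now reachable.
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc not found under $CUDA_HOME/bin"
fi
```

To make this permanent, the two export lines can be appended to your shell profile, for example ~/.bashrc.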


How to install Docker on your Linux machine

As of writing, the instructions on the Docker install page worked for installing Docker on a fresh Linux virtual machine:
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# verify
sudo docker run hello-world
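The verification above uses sudo, while the docker build command later in this guide is written without it. Running docker without sudo normally requires your user to be in the docker group. The check below is a sketch under that assumption; in_group is a hypothetical helper name, and the suggested usermod command is only printed, not run.

```shell
# Return success if the current user belongs to the named group.
# in_group is a made-up helper for this example.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx "$1"
}

if in_group docker; then
  echo "current user is already in the docker group"
else
  # Printed as a hint only; log out and back in after running it.
  echo "to run docker without sudo: sudo usermod -aG docker $(id -un)"
fi
```

If you prefer not to change group membership, prefixing the later docker commands with sudo works as well.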


How to install Nvidia Container Toolkit on your linux machine

Following NVIDIA's instructions for installing nvidia-container-toolkit worked for me:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# verify, should show nvidia-smi output:
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
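If the verification container fails to start, it can help to confirm that the nvidia-ctk runtime configure step actually registered the runtime with Docker. The sketch below assumes the configuration was written to Docker's default daemon config file, /etc/docker/daemon.json; has_nvidia_runtime is a hypothetical helper that simply looks for an "nvidia" entry in that file.

```shell
# Return success if the given Docker daemon config mentions an "nvidia" runtime.
# has_nvidia_runtime is a made-up helper; $1 is the path to the config file.
has_nvidia_runtime() {
  [ -f "$1" ] && grep -q '"nvidia"' "$1"
}

# Assumption: Docker reads its daemon config from the default location.
if has_nvidia_runtime /etc/docker/daemon.json; then
  echo "nvidia runtime is configured for Docker"
else
  echo "nvidia runtime not found; re-run: sudo nvidia-ctk runtime configure --runtime=docker"
fi
```

Remember that Docker only picks up changes to this file after sudo systemctl restart docker.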

Build or Pull Your Docker Image

How to build your Docker Image

If you don't already have a Docker Image to pull then you can build one from a Dockerfile using this command:
docker build --no-cache -t <IMAGE NAME> -f Dockerfile .
  • --no-cache : do not reuse cached layers from previous builds, so every layer of the image is built again from scratch
  • -t <IMAGE NAME> : select a name for the image and optionally a tag (after : ), like pytorch-2.01_cuda-11.8:linux
  • -f : specify the name of the Dockerfile. If your file is just called Dockerfile then you don't need this
  • Lastly, specify the build context: the set of files and directories available to Docker during the build. In this case it is . , the current directory, where the Dockerfile resides.
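Putting the flags together, a filled-in version of the command might look like the sketch below. The image name and tag are just the example values from the list above, and the variable names are made up for illustration. The build only runs if docker is installed and a Dockerfile exists in the current directory; otherwise the command is printed.

```shell
# Example values taken from the flag description above; replace with your own.
IMAGE_NAME="pytorch-2.01_cuda-11.8"
IMAGE_TAG="linux"
IMAGE_REF="${IMAGE_NAME}:${IMAGE_TAG}"

if command -v docker >/dev/null 2>&1 && [ -f Dockerfile ]; then
  # Rebuild from scratch, using the current directory as the build context.
  docker build --no-cache -t "$IMAGE_REF" -f Dockerfile .
  # List local images under this name to confirm the tag exists.
  docker image ls "$IMAGE_NAME"
else
  echo "would run: docker build --no-cache -t $IMAGE_REF -f Dockerfile ."
fi
```

Dropping --no-cache on later builds lets Docker reuse unchanged layers, which is usually much faster while you iterate on the Dockerfile.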