Wandb integration with NIM (internal or partner only)
What is NIM?
"NVIDIA NIM is designed to bridge the gap between the complex world of AI development and the operational needs of enterprise environments, enabling 10-100X more enterprise application developers to contribute to AI transformations of their companies. " (NVIDIA NIM Offers Optimized Inference Microservices for Deploying AI Models at Scale)

Why is NIM needed?
Pain points in LLM-based app development include:
- Prediction interfaces and output formats differ from model to model, which makes it difficult to compare many models quickly.
- Optimizing GPU usage during inference requires specialized knowledge.
A standardized interface and easy, optimized deployment on any device are therefore important for LLM-based app development.
How does NIM work?

What is the difference between NIM and TensorRT-LLM / Triton Inference Server?

WandB integration with NIM
Now you can kick off a NIM deployment with W&B Launch by picking a model from the W&B Model Registry.

💡 Only A100-class GPUs are compatible.
Demo

Get started
GitHub: https://github.com/wandb/launch-jobs/tree/main/jobs/deploy_to_nvidia_nemo_inference_microservice
Step 1: Clone the repository
```bash
git clone https://github.com/wandb/launch-jobs.git
```
Step 2: Create a launch queue in your W&B entity
Step 3: Install libraries
```bash
pip install -r requirements.txt
```
Step 4: Create a job
```bash
wandb job create \
  -n "deploy-to-nvidia-nemo-inference-microservice" \
  -e wandb-japan \
  -p nimtest \
  -E jobs/deploy_to_nvidia_nemo_inference_microservice/job.py \
  git https://github.com/wandb/launch-jobs
```
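To confirm the job was registered, you can list the jobs in the project; a minimal check using the same entity and project as above:
```bash
# List launch jobs registered in the wandb-japan/nimtest project
wandb job list -e wandb-japan -p nimtest
```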
Step 5: Create an agent & run the agent
```yaml
net: host
gpus: all
volume:
  - /mnt/batch/tasks/shared/LS_root/mounts/clusters/nim/code/launch-jobs/jobs/deploy_to_nvidia_nemo_inference_microservice/artifacts:/launch/artifacts/
  - /mnt/batch/tasks/shared/LS_root/mounts/clusters/nim/code/launch-jobs/jobs/deploy_to_nvidia_nemo_inference_microservice/model-store:/model-store/
runtime: nvidia
env-file: /home/nvidia/launch-jobs/jobs/deploy_to_nvidia_nemo_inference_microservice/.env
```
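The agent that picks up jobs from the queue is started with the wandb CLI on the GPU host. A minimal sketch, assuming the queue created in Step 2 is named nim-queue (the queue name is an assumption, not from the original doc):
```bash
# Start a launch agent that polls the nim-queue queue in the wandb-japan entity
wandb launch-agent -e wandb-japan -q nim-queue
```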
Step 6: Kick off a deployment from the queue
Submit the job with a run config that points at the model artifact to deploy:
```json
{
  "args": [],
  "entry_point": [],
  "run_config": {
    "artifact": "vanpelt/support-llama/merged:v0",
    "artifact_model_type": "llama"
  }
}
```
- Model registry (wandb only):