How to integrate W&B Registry with Friendli Dedicated Endpoints

Learn how to deploy generative models with W&B and Friendli
Created on June 3|Last edited on June 4
This post was authored by Minju Kim, Ahnjae Shin, and Byung-Gon Chun from the team at FriendliAI
In this post, we’ll look at how to deploy more performant generative models with Weights & Biases and Friendli, using Friendli’s Dedicated Endpoints alongside a suite of Weights & Biases features.
Friendli Dedicated Endpoints by FriendliAI offers a turnkey, fully managed solution for deploying generative AI models. Combining powerful optimization techniques with automated infrastructure management, it ensures smooth and reliable operation at scale. 
As a W&B user, you likely rely on the W&B Registry to manage the lifecycle of your models—from tracking experiment artifacts to promoting the best performing models for production use. 
By integrating Friendli Dedicated Endpoints directly into this workflow, you can: 
Deploy your model from W&B Registry: Seamlessly deploy your model artifacts from W&B Registry to Friendli Dedicated Endpoints with minimal effort. Tagging a model artifact version with an alias automatically triggers the creation of an endpoint in Friendli Dedicated Endpoints using webhook automation. It eliminates the need for custom scripts or manual configurations.
Ensure deployment consistency: Prevent duplicate or conflicting operations, and gain confidence in the consistency of your deployment. Friendli Dedicated Endpoints include support for idempotencyKey to ensure the reliability of automated workflows. Each deployment trigger via webhook automation is tracked with a unique idempotencyKey, ensuring that operations such as endpoint creation or updates are processed exactly once. 
Track deployment versions during rollout: Friendli Dedicated Endpoints support versioning so you can track every update to your model deployments. Reassigning an alias to a new artifact version automatically rolls out the update. This allows you to audit changes, revert to previous versions if needed, and ensure a smooth transition between model updates.
To integrate with Friendli Dedicated Endpoints, you should use W&B webhook automation. To do this, you will need both a Friendli Suite account with access to Friendli Dedicated Endpoints and a personal access token generated through Friendli Suite.
Let's walk through the process.
QuickStart
Create a secret
First, navigate to your team’s page on W&B and click "Team settings." After clicking "New secret," fill in the secret with a personal access token generated through Friendli Suite. Check out our docs for more details on how W&B secrets work.
Configure a webhook
After clicking "New webhook," fill in the "URL" field with the Friendli Suite REST API URL and the "Access token" field with the secret created above, which holds your Friendli Suite personal access token.
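Before relying on the automation, it can help to reproduce the webhook call by hand. The sketch below only builds the request; it assumes the access token is sent as a bearer token in the Authorization header and the payload as a JSON body. The URL and token are placeholders, not real Friendli values, and `build_test_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_test_request(url: str, access_token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request shaped like a webhook call."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the token travels as a bearer token.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_test_request(
    "https://example.invalid/webhook",  # placeholder, not the real REST API URL
    "PLACEHOLDER_TOKEN",                # placeholder for your personal access token
    {"wandbArtifactVersionName": "my-team/my-project/my-model:v0"},
)
print(request.get_method(), request.full_url)
# To actually send it, pass the request to urllib.request.urlopen(request).
```

Keeping the construction separate from the sending step makes it easy to inspect the headers and body before anything reaches the network.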
Create a webhook automation
By setting up webhook automation, you can automate the process of deploying models to Friendli Dedicated Endpoints.
Go to your W&B Registry and click "View details" on the model you want to deploy, then click "Create automation." For the "Event type," select "An artifact alias is added." For the "Alias regex," define the value you will later use as the alias when deploying the endpoint. We recommend choosing a short and concise value. For "Action type," select Webhooks, then select the webhook configured with Friendli Dedicated Endpoints. For "Payload," we recommend starting from the example below; you can also follow the detailed instructions here. Here's an example configuration:
{
   "wandbArtifactVersionName": "${artifact_version_string}",
   "name": "Generated from WandB ${project_name}/${artifact_collection_name}"
}
...where: 
wandbArtifactVersionName is the specific model artifact version from W&B, and
name (optional) is the name of your endpoint (if not provided, a name will be automatically generated).
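To get a feel for what the payload looks like once the template variables are filled in, the substitution can be imitated locally. This is only a sketch: it uses Python's `string.Template`, which happens to share the `${name}` placeholder syntax, and the team, project, and collection values are invented for illustration. W&B performs the real expansion on its side.

```python
import json
from string import Template

# Payload template as entered in the automation's "Payload" box.
PAYLOAD_TEMPLATE = """{
   "wandbArtifactVersionName": "${artifact_version_string}",
   "name": "Generated from WandB ${project_name}/${artifact_collection_name}"
}"""


def render_payload(template: str, variables: dict) -> dict:
    """Substitute ${...} placeholders, then parse the result as JSON."""
    rendered = Template(template).substitute(variables)
    return json.loads(rendered)


# Hypothetical values; the real ones come from the triggering artifact event.
example_variables = {
    "artifact_version_string": "my-team/my-project/my-model:v0",
    "project_name": "my-project",
    "artifact_collection_name": "my-model",
}

payload = render_payload(PAYLOAD_TEMPLATE, example_variables)
print(json.dumps(payload, indent=3))
```

Parsing the rendered text with `json.loads` also doubles as a quick check that the template is valid JSON before it is pasted into the automation.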
Deploy a model artifact
Now you can deploy your model artifact to Friendli Dedicated Endpoints by adding or updating an alias to a model artifact version.
Once you've added the alias, you can see the endpoint created in Friendli Dedicated Endpoints.
Just add or update the alias, and the integration handles the rest. That said, there are a couple of things to keep in mind:
When adding an alias to a model artifact version for the first time, an endpoint will be created in either an existing or a new project within your default team of Friendli Suite. 
When an alias is moved to another model artifact version, the created endpoint will be reassigned to a new version in the previously assigned project.
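Whether adding an alias triggers a deployment at all depends on the "Alias regex" set in the automation. A quick local check can catch a pattern that is too broad or too narrow. The sketch below assumes the whole alias must match the pattern (`re.fullmatch`); how W&B anchors the regex is not stated here, so treat that as an assumption. The pattern `friendli-.*` is a hypothetical example.

```python
import re


def alias_triggers_automation(alias_regex: str, alias: str) -> bool:
    """Return True if the alias matches the automation's regex.

    Assumption: the entire alias must match, hence re.fullmatch.
    """
    return re.fullmatch(alias_regex, alias) is not None


ALIAS_REGEX = r"friendli-.*"  # hypothetical value for the "Alias regex" field

for alias in ["friendli-prod", "staging", "latest"]:
    print(alias, alias_triggers_automation(ALIAS_REGEX, alias))
```

A literal alias with no wildcards is the simplest choice, since it cannot match anything unintended.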
Roll out a model artifact
To roll out an endpoint to a new model artifact version, assign the same alias used by the previous version to the new version you would like to deploy. Once the alias is assigned, the endpoint in Friendli Dedicated Endpoints updates to use the new model artifact version.
Note that to roll out an endpoint, an idempotencyKey field is required. This time, let’s create a webhook with additional configurable fields in the payload. Here's an example: 
{
   "wandbArtifactVersionName": "${artifact_version_string}",
   "name": "Generated from WandB ${project_name}/${artifact_collection_name}",
   "projectId": "project-id",
   "idempotencyKey": "${alias}",
   "accelerator": {
      "type": "NVIDIA H100",
      "count": 1
   },
   "autoscalingPolicy": {
      "minReplica": 0,
      "maxReplica": 2,
      "cooldownPeriod": 300
   }
}
Details:
projectId: (optional) Specifies the project in which the endpoint will be created. If not provided, a new project will be created. Replace the placeholder in the example with your Friendli Dedicated Endpoints project ID.
idempotencyKey: (optional) Used by Friendli Dedicated Endpoints to track which webhook automation triggered an endpoint rollout. Any unique value works, but we recommend the value shown in the example (${alias}).
accelerator: (optional) Specifies the hardware for the endpoint. You can define the type and the number of accelerators to be used simultaneously.
autoscalingPolicy: (optional) Specifies the autoscaling settings for the endpoint.
Note that idempotencyKey is an optional field, but is required to roll out an endpoint between different model artifact versions. View more details about each field here.
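The role of idempotencyKey can be pictured with a small in-memory model. This is a toy written for illustration, not Friendli's implementation: the class name, the return strings, and the rule that a payload without a key always creates a fresh endpoint are all assumptions. It shows one plausible reading of the behavior described above: a first-time key creates an endpoint, a repeated key with the same artifact version is processed only once, and a repeated key with a different artifact version becomes a rollout.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    # Artifact versions served by this endpoint, oldest first.
    versions: list = field(default_factory=list)


class ToyEndpointService:
    """In-memory stand-in for the receiving side (hypothetical)."""

    def __init__(self):
        self.endpoints_by_key = {}
        self.unkeyed_endpoints = []

    def handle_webhook(self, payload: dict) -> str:
        version = payload["wandbArtifactVersionName"]
        key = payload.get("idempotencyKey")
        if key is None:
            # Assumption: without a key there is nothing to match against.
            self.unkeyed_endpoints.append(Endpoint(versions=[version]))
            return "created"
        endpoint = self.endpoints_by_key.get(key)
        if endpoint is None:
            self.endpoints_by_key[key] = Endpoint(versions=[version])
            return "created"
        if endpoint.versions[-1] == version:
            return "ignored"  # duplicate delivery, processed only once
        endpoint.versions.append(version)
        return "rolled_out"


service = ToyEndpointService()
deliveries = [
    {"wandbArtifactVersionName": "my-model:v0", "idempotencyKey": "prod"},
    {"wandbArtifactVersionName": "my-model:v0", "idempotencyKey": "prod"},
    {"wandbArtifactVersionName": "my-model:v1", "idempotencyKey": "prod"},
]
results = [service.handle_webhook(p) for p in deliveries]
print(results)  # ['created', 'ignored', 'rolled_out']
```

Using the alias as the key means every event for that alias maps to the same endpoint, which is what makes a rollout possible.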
To gain more control over GPU resources for an endpoint, configure the accelerator field by specifying the desired type and count. This is particularly useful for serving large models that require model or data parallelism. Here's an example payload with multiple accelerators:
{
   "wandbArtifactVersionName": "${artifact_version_string}",
   "name": "Generated from WandB ${project_name}/${artifact_collection_name}",
   "accelerator": {
      "type": "NVIDIA A100 80GB",
      "count": 4
   }
}
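Hand-edited payloads are easy to get subtly wrong, so one option is to generate them. The helper below is a sketch, not part of either product: `build_payload` is a hypothetical name, and the sanity checks (at least one accelerator, minReplica no greater than maxReplica) are assumptions about what counts as a sensible configuration. The field names are taken from the examples in this post.

```python
import json
from typing import Optional


def build_payload(accelerator_type: str, accelerator_count: int,
                  autoscaling_policy: Optional[dict] = None) -> str:
    """Return a payload template as strictly valid JSON text."""
    if accelerator_count < 1:
        raise ValueError("accelerator count must be at least 1")
    payload = {
        # The ${...} placeholders are left for W&B to fill in.
        "wandbArtifactVersionName": "${artifact_version_string}",
        "name": "Generated from WandB ${project_name}/${artifact_collection_name}",
        "accelerator": {"type": accelerator_type, "count": accelerator_count},
    }
    if autoscaling_policy is not None:
        if autoscaling_policy["minReplica"] > autoscaling_policy["maxReplica"]:
            raise ValueError("minReplica must not exceed maxReplica")
        payload["autoscalingPolicy"] = autoscaling_policy
    return json.dumps(payload, indent=3)


print(build_payload("NVIDIA A100 80GB", 4))
```

Because the text comes from `json.dumps`, it cannot contain a stray trailing comma or an unbalanced brace, and the output can be pasted straight into the "Payload" box.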
Track the history of deployment versions
Use the Friendli Dedicated Endpoints versioning feature to track the history of your model deployments and maintain a clear record of every update. By adding an alias to a model artifact version, you can deploy models and roll out updates across versions efficiently, without needing to create a new endpoint from scratch. Note that when an alias is reassigned to a different version, the existing endpoint will automatically roll out to the new version.
In the diagram above, v0 represents the first deployed version of the model, from when the endpoint was created, while v1 is a newer model artifact version that the alias was reassigned to, triggering a rollout that updates the endpoint accordingly. You can view more details about the versioning feature here.
Conclusion
This post is meant as a quick start for hooking your W&B model training pipelines to Friendli for deployment. You can check out their docs and our docs for any questions. If you have feedback or run into issues with the Friendli Dedicated Endpoints integration, please contact us at support@friendli.ai.
Tags: Articles, Framework / Integration, Agents, Registry