Pricing that works for everyone

The best tools for AI developers

Serverless RL

Serverless RL runs the rollout and training phases on separate infrastructure, eliminating GPU idle time caused by lingering rollouts. You pay only for active usage, not idle time.

Model                       Input Tokens    Output Tokens
Qwen 2.5 14B (tentative)    1               1

Frequently asked questions

How are GPU-hours for training calculated?

GPU-hours are calculated by aggregating the total time used to train your models during the last billing cycle. Training a single step requires GPU time for three actions: downloading the most recent LoRA to train from, adjusting the LoRA weights using GRPO, and saving the updated weights. Since the downloading and saving processes only take a few seconds each, the bulk of a training step is dedicated to actually training your model.
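As a rough illustration of the aggregation described above, the sketch below sums the three phases of each training step and converts the total to GPU-hours. The class name, field names, and the gpu_count multiplier are assumptions made for this example; the actual billing system is not described here and may account for time differently.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TrainingStep:
    """Hypothetical record of one training step (names are illustrative only)."""
    download_s: float   # seconds spent downloading the most recent LoRA
    train_s: float      # seconds spent adjusting the LoRA weights with GRPO
    save_s: float       # seconds spent saving the updated weights
    gpu_count: int = 1  # assumption: time is multiplied by the number of GPUs used

    def gpu_seconds(self) -> float:
        # Total GPU time for this step across all three actions.
        return (self.download_s + self.train_s + self.save_s) * self.gpu_count


def gpu_hours(steps: Iterable[TrainingStep]) -> float:
    """Aggregate GPU time over all steps in a billing cycle, in hours."""
    return sum(step.gpu_seconds() for step in steps) / 3600.0


# Two steps where download and save take a few seconds each,
# so nearly all of the time is spent on training itself.
steps = [
    TrainingStep(download_s=3, train_s=594, save_s=3),
    TrainingStep(download_s=3, train_s=1194, save_s=3),
]
print(gpu_hours(steps))  # 0.5
```

In this example the download and save phases account for 12 of the 1,800 seconds, which matches the point that the bulk of a step is spent training.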

Is there a minimum charge for training?

No, jobs are billed for the GPU time they use, with no minimum training duration.

What if the job fails? Do failed jobs get billed partially, fully, or not at all?

GPU time for failed jobs will not be charged to the user’s account.
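The two billing rules above (no minimum duration, and no charge for failed jobs) can be sketched as a simple filter over job records. The Job class and its fields are hypothetical and exist only for this illustration; they are not part of any documented API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Job:
    """Hypothetical job record (illustrative, not a real API object)."""
    gpu_seconds: float  # GPU time the job consumed
    succeeded: bool     # False if the job failed


def billable_gpu_hours(jobs: Iterable[Job]) -> float:
    """Sum GPU time for successful jobs only, with no minimum per job."""
    # Failed jobs are skipped entirely; short jobs are not rounded up.
    return sum(job.gpu_seconds for job in jobs if job.succeeded) / 3600.0


jobs = [
    Job(gpu_seconds=90, succeeded=True),     # short job, billed for 90 seconds only
    Job(gpu_seconds=7200, succeeded=False),  # failed job, not billed
    Job(gpu_seconds=3510, succeeded=True),
]
print(billable_gpu_hours(jobs))  # 1.0
```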

How will I know how many tokens I’ve used each month?

Log in to your account to view your billing dashboard. The dashboard shows how many tokens you’ve used during the current month and in past months. A token is the numeric unit that text is split into before a model processes it.