Qwen 2.5 14B Instruct on W&B Inference

Price per 1M tokens

$0.06 (input)
$0.24 (output)
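At these rates, the cost of a request can be estimated from the token counts the API reports. A minimal sketch (the `call_cost` helper is illustrative, not part of any SDK; the default rates are the per-1M-token prices listed above):

```python
def call_cost(prompt_tokens: int, completion_tokens: int,
              input_rate: float = 0.06, output_rate: float = 0.24) -> float:
    """Estimate the USD cost of one call at per-1M-token rates."""
    return prompt_tokens / 1e6 * input_rate + completion_tokens / 1e6 * output_rate

# e.g. 10k prompt tokens plus 2k completion tokens:
print(f"${call_cost(10_000, 2_000):.4f}")  # → $0.0011
```

The same arithmetic can be applied to `response.usage.prompt_tokens` and `response.usage.completion_tokens` from a real call to track spend per request.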

Parameters

14.7B (active)
14.7B (total)

Context window

32,768 tokens

Release date

September 2024

Qwen 2.5 14B Instruct inference details

Qwen2.5-14B is a dense 14.7B-parameter causal language model from the Qwen2.5 series, released as part of a family of base and instruction-tuned models ranging from 0.5B to 72B parameters. It delivers significant improvements over Qwen2, including stronger coding and mathematics performance (drawing on domain-specialized expert models during training), better instruction following, improved generation and understanding of structured data and outputs such as JSON, and greater robustness to diverse system prompts for role-play and chatbot settings. The model offers native multilingual support for 29+ languages, including Chinese, English, French, Spanish, Portuguese, German, Japanese, Korean, and Arabic. Qwen2.5-14B combines pre-training with post-training (instruction tuning) for versatile instruction-following applications.
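In practice, a model's "JSON output" still arrives as a chat string, sometimes wrapped in a Markdown code fence, so a small parsing step is useful. A sketch of one way to handle this (the `extract_json` helper is hypothetical, not a library function):

```python
import json
import re

def extract_json(reply: str):
    """Parse a JSON object from a model reply, tolerating a ```json code fence."""
    match = re.search(r"```(?:json)?\s*(.*?)```", reply, re.DOTALL)
    text = match.group(1) if match else reply
    return json.loads(text)

# A reply wrapped in a fence, as chat models often produce:
print(extract_json('```json\n{"lang": "fr", "text": "Bonjour"}\n```'))
```

With a prompt that asks for JSON only, the same helper can be applied directly to `response.choices[0].message.content`.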
 
Created by: Alibaba
License: apache-2.0
🤗 model card: Qwen2.5-14B-Instruct
 
 
 
import openai
import weave

# Weave autopatches OpenAI to log LLM calls to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url='https://api.inference.wandb.ai/v1',

    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-api-key>",

    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-14B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
)

print(response.choices[0].message.content)

Qwen 2.5 14B Instruct resources