
GPT-5 Python quickstart using the OpenAI API

Getting set up and running GPT-5 on your machine in Python using the OpenAI API.
The moment GPT-5 launched, we knew we had to dive in and play with some of the new API capabilities. GPT-5 introduces reasoning effort controls, native structured outputs, built-in image generation, and seamless multimodal workflows that could really change how we build AI applications.
In this tutorial, you'll learn about GPT-5's most powerful features in under 15 minutes (seriously). We'll explore the new Responses API, create guaranteed JSON outputs with Pydantic, generate images that perfectly complement your text, and compare different reasoning effort levels, all while logging everything with W&B Weave for complete visibility into your AI workflow. It's also available on GitHub here.
By the end, you should have some fresh ideas about building or upgrading your production-ready applications. We'll talk about some of the improvements in GPT-5 up front but if you're eager to get straight to the code, just click the button below.
Jump to the tutorial




If you're just getting started and don't yet have your machine set up to run Python, I've created a quick tutorial here that will have you up-and-running in just a few minutes.
💡
Once you're set up, you're going to need an OpenAI API key. So let's start there:

Getting started with the GPT-5 API

Before you can use GPT-5 via the API, you'll need a valid API key. Just head to OpenAI’s product page and click "Start building" to begin the signup flow.

After signing up, verify your email and phone number to unlock the dashboard. That's where you can manage your projects and keys.

Creating your OpenAI API key

In the dashboard, click the current project name (e.g., Default project) in the top-left.

Once you have a project, go to settings:

And then to your API keys:

Inside your project, navigate to API keys and click "Create key." You'll be asked to give it a name.

Next, just decide what permissions you'd like it to have. You can read more about these here.
Don't save your key anywhere that isn't highly secure (a notepad doc on your desktop is not). Once you've used it momentarily, close the notepad doc without saving.
💡

OpenAI pricing for GPT-5

You get a five-dollar credit upon signing up, which will get you surprisingly far if you're just experimenting! As of this writing (August 7, 2025), here's the pricing for GPT-5:


W&B Weave

With W&B Weave, every call to your decorated functions is captured automatically—inputs, outputs, execution metadata, and environment details are all recorded without extra code. Just add the @weave.op() decorator to your generation function, and Weave will handle the rest. After execution, visit the Weave dashboard to explore interactive traces: view prompts alongside model responses, examine timing and resource usage, compare runs side by side, and annotate or share results with your team, all in one unified interface.
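As a minimal sketch of that pattern (the function name and body here are placeholders; the real model calls come later in the tutorial):

import weave

weave.init('gpt-5-tutorial')  # start logging to a Weave project

@weave.op()  # every call to this function is now traced
def generate(topic: str) -> str:
    # Placeholder body - below, this is where the OpenAI call goes
    return f"Some generated text about {topic}"

generate("observability")  # inputs and output land in the Weave dashboard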

Tutorial: Getting started with GPT-5

You can follow along in Jupyter Notebook, Colab, or any Python REPL. If you need a quick Python setup, see this tutorial. Otherwise, let’s jump into the code.
One of the great things about Jupyter notebooks is that, like Colab, you can add your comments and context in markdown cells:


Step 1: The OpenAI GPT-5 API key

Begin by exporting your OpenAI API key as an environment variable:
%env OPENAI_API_KEY=KEY
Replace KEY with your actual key. When you run this in Jupyter, it will echo the key back, confirming it's set.
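If you're working in a plain .py script rather than a notebook, the %env magic won't work. A minimal alternative sketch is to export OPENAI_API_KEY in your shell profile, or set it in Python before creating the client (avoid hard-coding real keys in files you save):

import os

# Set the key for this process only - prefer a shell export or a .env file
# for anything beyond quick experiments
os.environ["OPENAI_API_KEY"] = "KEY"  # replace KEY, as above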


Step 2: Installing OpenAI and W&B Weave

This installs both the official OpenAI client and Weights & Biases Weave.
This is also a good time to sign up for Weights & Biases, so you won't have to interrupt your workflow a few minutes from now.
!pip install openai weave
Run the cell and wait for it to finish; the [*] indicator will turn into a number.
💡

At this point we've installed the library, but we still need to import it before we can use it.
If you're new to Python: installing a library downloads its code, while importing it makes that code available in your current session.

I read a great explanation once. Installing is like uploading an image to a server. Importing is like embedding that image on a page once it's available.
💡
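In code terms, the distinction looks like this:

# Install: fetch the library's code into your environment (once per environment)
#   pip install openai
# Import: load it into the current Python session (in every script or notebook)
from openai import OpenAI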

Step 3: Import libraries and pass the OpenAI API key

Next, load your packages and set up both Weave and the OpenAI client:
import os
import weave
from openai import OpenAI

# Initialize Weave
weave.init('gpt-5-tutorial')

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
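Before moving on, it can be worth a quick sanity check that the key actually made it into the environment. A small sketch that prints only a short prefix, never the full key:

key = os.getenv("OPENAI_API_KEY")
if key:
    print(f"API key loaded: {key[:7]}...")  # masked prefix only
else:
    print("OPENAI_API_KEY is not set - rerun the %env cell above")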

Step 4: Crafting your first GPT-5 call

Prompt design remains critical. Start by defining an assistant role and the user’s request:
gpt_assistant_prompt = "You are a " + input("Who should I be, as I answer your prompt?")
gpt_user_prompt = input("What do you want me to do?")


Step 5: Text generation with the GPT-5 Responses API

GPT-5 introduces several powerful features:
  • Responses API: A more flexible way to interact with models
  • Reasoning control: Adjust computational effort for complex problems
  • Enhanced role system: Developer, user, and assistant message roles
  • Structured outputs: Generate valid JSON conforming to schemas
  • Image generation: Built-in image creation capabilities
The new Responses API offers more control and better structured outputs than earlier GPT models.
Additionally, the functions below use the @weave.op() decorator so Weave will log everything.
@weave.op()
def generate_with_custom_prompts(assistant_prompt: str, user_prompt: str, reasoning_effort: str = "low") -> dict:
    """Generate text using your custom prompts with GPT-5's Responses API"""
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": reasoning_effort},
        input=[
            {
                "role": "developer",
                "content": assistant_prompt
            },
            {
                "role": "user",
                "content": user_prompt
            }
        ]
    )
    return {
        "output_text": response.output_text,
        "assistant_role": assistant_prompt,
        "user_request": user_prompt,
        "reasoning_effort": reasoning_effort
    }

# Use your custom prompts
basic_result = generate_with_custom_prompts(gpt_assistant_prompt, gpt_user_prompt, "medium")
print("Generated response:", basic_result["output_text"])
When run, these appear as:

You'll notice we passed some additional parameters with the request. These are passed to Weave for logging and can easily be compared with other prompts and configurations.
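If you also want a rough cost check without leaving the notebook, the Responses API returns token counts on the response object. A small sketch - the usage attribute names here are based on the current openai SDK, so treat them as an assumption to verify against your installed version:

resp = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "low"},
    input=[{"role": "user", "content": "Say hello in five words."}],
)
# Token counts for estimating cost against the pricing table above
print(f"input: {resp.usage.input_tokens}, output: {resp.usage.output_tokens}, total: {resp.usage.total_tokens}")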



Step 6: Viewing GPT-5 inputs and outputs in W&B Weave

After running the final cell, your notebook output will include links to each Trace. Clicking a link opens the Weave dashboard, where you can inspect inputs, outputs, code versions, execution time, and peer comments—all in rich visual form.

You'll see something like the following:


You can also click the tabs in the top right to get additional details including:
The code that generated that Trace:

Feedback on the Trace from your peers or others.
And a summary of the environment that was running:

Step 7: GPT-5 reasoning control - Low, Medium, and High effort testing

Now we'll explore one of GPT-5's most powerful features: reasoning effort control. This allows you to adjust how much computational power the model dedicates to thinking through your request. Think of it like asking someone to give you a quick answer versus asking them to really think deeply about the problem.
The three effort levels work differently:
  • Low effort: Fast and efficient, good for straightforward tasks
  • Medium effort: Balanced approach that works well for most use cases
  • High effort: Deep thinking for complex problems that require careful reasoning
Let's see how our custom assistant performs at each level:
@weave.op()
def generate_with_reasoning_levels(assistant_prompt: str, user_prompt: str) -> dict:
    """Compare your prompts across different reasoning effort levels"""
    results = {}
    efforts = ["low", "medium", "high"]
    for effort in efforts:
        print(f"Generating with {effort} reasoning effort...")
        response = client.responses.create(
            model="gpt-5",
            reasoning={"effort": effort},
            input=[
                {
                    "role": "developer",
                    "content": assistant_prompt
                },
                {
                    "role": "user",
                    "content": user_prompt
                }
            ]
        )
        results[effort] = {
            "output": response.output_text,
            "effort_level": effort
        }
    return results

# Compare reasoning levels with your prompts
reasoning_comparison = generate_with_reasoning_levels(gpt_assistant_prompt, gpt_user_prompt)

print("\n🧠 Your Prompts with Different Reasoning Levels:")
for effort, result in reasoning_comparison.items():
    print(f"\n--- {effort.upper()} EFFORT ---")
    print(result["output"])
    print("-" * 50)
We now get our results for all three reasoning levels sent to Weave, where we can keep them logged for easy comparison:

You'll notice that higher effort levels typically produce more thorough, nuanced responses, but they take longer and cost more. This is particularly useful when your assistant role requires deep expertise or complex problem-solving.
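If you'd rather measure that tradeoff than eyeball it, a small sketch like this wraps each effort level in a wall-clock timer (timings vary run to run, and output length is only a rough proxy for cost):

import time

for effort in ["low", "medium", "high"]:
    start = time.perf_counter()
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": effort},
        input=[{"role": "user", "content": gpt_user_prompt}],
    )
    elapsed = time.perf_counter() - start
    print(f"{effort:>6}: {elapsed:5.1f}s, {len(response.output_text)} chars")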

Step 8: Structured JSON output with your custom assistant

One of GPT-5's breakthrough features is Structured Outputs - the ability to guarantee that the model's response will be valid JSON that conforms to a specific schema. This is incredibly powerful for applications that need reliable, parseable data.
The OpenAI Responses API supports structured outputs using the text_format parameter with Pydantic models or JSON schemas. Let's explore both approaches:

Method 1: Using Pydantic models

Uses OpenAI's native structured outputs with Python type definitions for guaranteed schema compliance.
First, let's define our Pydantic model and function separately to avoid any parsing issues:
# Import required libraries
from pydantic import BaseModel
from typing import List

# Define the Pydantic model
class AssistantResponse(BaseModel):
    response: str
    key_points: List[str]
    confidence_level: str
    follow_up_questions: List[str]

# Define the function - simplified for Weave compatibility
def generate_structured_with_pydantic_simple(assistant_prompt, user_prompt):
    """Generate structured output using Pydantic models - Weave-compatible version"""
    response = client.responses.parse(
        model="gpt-5",
        input=[
            {"role": "developer", "content": assistant_prompt},
            {"role": "user", "content": user_prompt}
        ],
        text_format=AssistantResponse
    )
    result = {
        "structured_data": response.output_parsed.model_dump(),
        "raw_output": response.output_text,
        "assistant_role": assistant_prompt,
        "validation_status": "valid"
    }
    return result

# Alternative: Manual logging approach to avoid Weave parsing issues
def log_and_call_pydantic(assistant_prompt, user_prompt):
    """Manually log the call to avoid Weave parsing errors"""
    print("📝 Calling Pydantic structured output...")
    print(f"   Assistant: {assistant_prompt[:50]}...")
    print(f"   User: {user_prompt[:50]}...")
    result = generate_structured_with_pydantic_simple(assistant_prompt, user_prompt)
    print(f"✅ Success! Generated {len(result['structured_data'])} structured fields")
    return result

# Test with Pydantic approach
print("Testing Pydantic structured output...")
try:
    pydantic_result = log_and_call_pydantic(gpt_assistant_prompt, gpt_user_prompt)
    print("✅ Pydantic Structured Output:")
    print(pydantic_result["structured_data"])
    print(f"Validation Status: {pydantic_result['validation_status']}")
except Exception as e:
    print(f"❌ Pydantic method failed: {e}")
    print("Falling back to JSON schema method...")
Which will print:

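Since the schema is a Pydantic model, you can also rebuild a typed object from the logged dict and work with attributes instead of string keys; a quick sketch:

# Rebuild the typed object from the dumped dict
parsed = AssistantResponse(**pydantic_result["structured_data"])
print(parsed.confidence_level)
for question in parsed.follow_up_questions:
    print("•", question)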

Method 2: Using JSON schema

Uses OpenAI's native structured outputs with explicit JSON schema definitions for precise control.
import json

def generate_structured_with_json_schema(assistant_prompt, user_prompt):
    """Generate structured output using JSON schema"""
    schema = {
        "type": "object",
        "properties": {
            "response": {
                "type": "string",
                "description": "Main response from the assistant"
            },
            "key_points": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Key points or takeaways"
            },
            "confidence_level": {
                "type": "string",
                "enum": ["low", "medium", "high"],
                "description": "Assistant's confidence in the response"
            },
            "follow_up_questions": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Suggested follow-up questions"
            }
        },
        "additionalProperties": False,
        "required": ["response", "key_points", "confidence_level", "follow_up_questions"]
    }
    response = client.responses.create(
        model="gpt-5",
        input=[
            {
                "role": "developer",
                "content": assistant_prompt
            },
            {
                "role": "user",
                "content": user_prompt
            }
        ],
        text={
            "format": {
                "type": "json_schema",
                "name": "assistant_response",  # the json_schema format requires a name
                "strict": True,
                "schema": schema
            }
        }
    )
    parsed_data = json.loads(response.output_text)
    return {
        "structured_data": parsed_data,
        "raw_output": response.output_text,
        "assistant_role": assistant_prompt,
        "schema_used": schema,
        "validation_status": "valid"
    }

# Test with JSON schema approach
print("Testing JSON schema structured output...")
try:
    schema_result = generate_structured_with_json_schema(gpt_assistant_prompt, gpt_user_prompt)
    print("✅ JSON Schema Structured Output:")
    print(schema_result["structured_data"])
    print(f"Validation Status: {schema_result['validation_status']}")
except Exception as e:
    print(f"❌ JSON Schema method failed: {e}")
    print("This may be due to API differences - the Pydantic method above is more reliable.")
Which yields:

Or in Weave:


Method 3: Fallback with prompt engineering

Uses careful prompting and validation when native structured outputs aren't available or working.
@weave.op()
def generate_structured_fallback(assistant_prompt: str, user_prompt: str) -> dict:
    """Fallback method using careful prompt engineering"""
    json_instructions = '''
You must respond with valid JSON in exactly this format:
{
    "response": "your main response here",
    "key_points": ["point 1", "point 2", "point 3"],
    "confidence_level": "low|medium|high",
    "follow_up_questions": ["question 1", "question 2"]
}
Output ONLY the JSON, no additional text.
'''
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "medium"},
        input=[
            {
                "role": "developer",
                "content": f"{assistant_prompt}\n\n{json_instructions}"
            },
            {
                "role": "user",
                "content": user_prompt
            }
        ]
    )
    try:
        parsed_data = json.loads(response.output_text)
        return {
            "structured_data": parsed_data,
            "raw_output": response.output_text,
            "validation_status": "valid"
        }
    except json.JSONDecodeError as e:
        return {
            "structured_data": None,
            "raw_output": response.output_text,
            "validation_status": f"invalid JSON: {str(e)}",
            "error": str(e)
        }

# Try methods in order of preference
print("🔧 Testing Structured Output Methods:")

# Try Pydantic first (most reliable)
try:
    result = generate_structured_with_pydantic_simple(gpt_assistant_prompt, gpt_user_prompt)
    print("✅ Success with Pydantic!")
    structured_result = result
except Exception:
    print("⚠️ Pydantic failed, trying JSON schema...")
    # Try JSON schema
    try:
        result = generate_structured_with_json_schema(gpt_assistant_prompt, gpt_user_prompt)
        print("✅ Success with JSON schema!")
        structured_result = result
    except Exception:
        print("⚠️ JSON schema failed, using prompt engineering fallback...")
        # Fallback to prompt engineering
        structured_result = generate_structured_fallback(gpt_assistant_prompt, gpt_user_prompt)
        print("📝 Using prompt engineering approach")

print("\nFinal structured result:")
if structured_result.get("structured_data"):
    print(structured_result["structured_data"])
    print(f"Status: {structured_result['validation_status']}")
else:
    print("Failed to generate valid structured output")
    print(f"Raw output: {structured_result['raw_output']}")
Which gives us:

And in Weave:


Why Use Structured Outputs?

Methods 1 & 2 (Native Structured Outputs) vs Method 3 (Prompt Engineering):
Native Structured Outputs give you:
  • Guaranteed results: The AI must follow your exact schema - no parsing errors or missing fields
  • No validation needed: Skip the hassle of checking if the JSON is valid or complete
  • Clear error handling: Safety refusals are clearly marked, not hidden in malformed responses
Simple API Usage:
  • Method 1: client.responses.parse(text_format=YourPydanticModel)
  • Method 2: client.responses.create(text={"format": {"type": "json_schema", ...}})
Key Rules:
  • All fields must be required (use a union with null for optional fields; see the sketch below)
  • Always include "additionalProperties": false
  • Schemas can't exceed 5,000 properties or 10 nesting levels
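For that first rule, "optional" fields still appear in required; you make them nullable instead. In Pydantic, a sketch of an optional field looks like this (the class name is hypothetical):

from typing import List, Optional
from pydantic import BaseModel

class AssistantResponseV2(BaseModel):
    response: str
    key_points: List[str]
    confidence_level: str
    follow_up_questions: Optional[List[str]] = None  # emitted as a union with null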
Bottom line: Native structured outputs eliminate the guesswork and guarantee your assistant returns exactly the data structure your application needs, every time.

Step 9: Image generation with GPT-5

This is maybe my favorite part (who doesn't like generating images?). As with some previous models, GPT-5 includes built-in image generation capabilities, which means you can create both text and visual content using the same model. Let's take the content we've already generated in previous steps and create an accompanying image for it.
We'll use the structured output from Step 8 and have your assistant create a visual representation:
import base64
from PIL import Image
import io

def generate_image_with_gpt5(prompt: str) -> dict:
    """Generate an image using GPT-5's built-in image generation via tools"""
    try:
        response = client.responses.create(
            model="gpt-5",
            input=prompt,
            tools=[{"type": "image_generation"}],
        )
        # Extract the image data from the response
        image_data = [
            output.result
            for output in response.output
            if output.type == "image_generation_call"
        ]
        if image_data:
            image_base64 = image_data[0]
            # Convert base64 to PIL Image for Weave display
            image_bytes = base64.b64decode(image_base64)
            image = Image.open(io.BytesIO(image_bytes))
            return {
                "image": image,  # PIL Image object for Weave
                "image_base64": image_base64,
                "prompt_used": prompt,
                "status": "success"
            }
        else:
            return {"error": "No image data found in response"}
    except Exception as e:
        return {"error": str(e)}

@weave.op()
def create_image_for_existing_content(assistant_prompt: str, existing_content: dict) -> dict:
    """Generate an image based on previously created content"""
    # Extract the main response from our structured content
    main_response = existing_content.get("structured_data", {}).get("response", "")
    key_points = existing_content.get("structured_data", {}).get("key_points", [])
    if not main_response:
        return {"error": "No existing content found to create image for"}
    # Step 1: Have your assistant create an image prompt based on the existing content
    image_prompt_response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "low"},
        input=[
            {
                "role": "developer",
                "content": f"{assistant_prompt} Now act as a visual artist and create a detailed image description that would perfectly complement the content you previously created. Focus on visual elements, style, mood, and composition. Keep it under 300 characters for optimal image generation."
            },
            {
                "role": "user",
                "content": f"Create an image description for this content you generated:\n\nMain response: {main_response}\n\nKey points: {', '.join(key_points)}\n\nDescribe an image that would enhance and complement this content."
            }
        ]
    )
    image_prompt = image_prompt_response.output_text
    print(f"🎨 Your assistant's image prompt: {image_prompt}")
    # Step 2: Generate the image using GPT-5's tool-based image generation
    image_result = generate_image_with_gpt5(f"Generate an image: {image_prompt}")
    return {
        "original_content": existing_content,
        "image_prompt": image_prompt,
        "image_result": image_result,
        "assistant_role": assistant_prompt,
        "generated_image": image_result.get("image") if "error" not in image_result else None  # PIL Image for Weave
    }

# Use the structured content from Step 8 to create an accompanying image
print("🖼️ Creating an image for your previously generated content...")

# Check if we have structured content from previous steps
if 'pydantic_result' in locals() and pydantic_result.get("structured_data"):
    print(f"Using content: '{pydantic_result['structured_data']['response'][:50]}...'")
    multimodal_result = create_image_for_existing_content(gpt_assistant_prompt, pydantic_result)
    if "error" not in multimodal_result:
        print("\n📝 Original Content:")
        print(f"'{multimodal_result['original_content']['structured_data']['response']}'")
        print("\n🎨 Image Concept:")
        print(f"'{multimodal_result['image_prompt']}'")
        if "error" not in multimodal_result["image_result"]:
            print("✅ Image generated successfully!")
            # Display the image - Weave will show this in the dashboard
            generated_image = multimodal_result["generated_image"]
            if generated_image:
                print("🖼️ Generated image will appear in Weave dashboard")
                # You can also display it in Jupyter if running there
                try:
                    from IPython.display import display
                    display(generated_image)
                except ImportError:
                    pass
        else:
            print(f"❌ Image generation failed: {multimodal_result['image_result']['error']}")
    else:
        print(f"❌ Error: {multimodal_result['error']}")
Which will output:


And log to Weave:

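If you'd also like the image on disk rather than only in Weave, the PIL object returned above can be saved directly (the filename is arbitrary):

generated = multimodal_result.get("generated_image")
if generated:
    generated.save("gpt5_generated_image.png")  # PIL infers PNG from the extension
    print("Saved to gpt5_generated_image.png")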

Step 10: Comprehensive tutorial summary

For our final step, we'll create a personalized summary of everything we've accomplished in this tutorial. This is particularly interesting because we'll use the highest reasoning effort level to get the most thoughtful, comprehensive analysis of your experience.
Your custom assistant will reflect on the entire session - from the role you assigned it to the specific request you made - and provide insights about how GPT-5's various features performed in your particular use case.
@weave.op()
def create_personalized_tutorial_summary(assistant_prompt: str, user_prompt: str) -> dict:
    """Generate a personalized summary of your GPT-5 tutorial experience"""
    summary_request = f"""
Based on our tutorial session where you were '{assistant_prompt}' and I asked you to '{user_prompt}',
please summarize what we accomplished with GPT-5's features:
1. Basic Responses API with custom role prompts
2. Reasoning effort level comparisons (low/medium/high)
3. Structured JSON output generation
4. Multimodal content creation with image generation
5. Advanced prompt engineering techniques
Reflect on how your specific role affected the outputs and what developers can learn.
"""
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "high"},  # Use high effort for final summary
        input=[
            {
                "role": "developer",
                "content": f"{assistant_prompt} Provide a thoughtful, comprehensive summary with insights about the tutorial experience."
            },
            {
                "role": "user",
                "content": summary_request
            }
        ]
    )
    return {
        "personalized_summary": response.output_text,
        "original_assistant_role": assistant_prompt,
        "original_user_request": user_prompt,
        "tutorial_features": [
            "Custom role-based prompting",
            "Responses API usage",
            "Reasoning effort controls",
            "Structured JSON outputs",
            "Image generation integration",
            "Multimodal content creation",
            "Comprehensive Weave logging"
        ]
    }

final_summary = create_personalized_tutorial_summary(gpt_assistant_prompt, gpt_user_prompt)
print("🎯 Your Personalized Tutorial Summary:")
print(final_summary["personalized_summary"])

print("\n📊 Session Details:")
print(f"Your Assistant Role: {final_summary['original_assistant_role']}")
print(f"Your Request: {final_summary['original_user_request']}")
print(f"Features Explored: {', '.join(final_summary['tutorial_features'])}")
Which gives us:

And logged to Weave:


Step 11: Reviewing your work in Weave

After running through this tutorial or working on your own projects, your Weave dashboard becomes a comprehensive laboratory notebook. Every function decorated with @weave.op() has been automatically logged, creating a rich dataset of your GPT-5 experimentation.
In the dashboard, you'll be able to see:
  • Operation traces: Every API call with complete input/output logging
  • Performance metrics: Token usage, latency, and reasoning effort costs across different approaches
  • Side-by-side comparisons: How your assistant performed with low vs medium vs high reasoning effort
  • Multimodal tracking: Text generation calls alongside image creation attempts
  • Error handling: Any failed API calls and their error messages
  • Structured data validation: JSON outputs with schema compliance verification
This creates an invaluable resource for understanding how to optimize GPT-5 for your specific use cases, comparing costs and performance across different approaches, and sharing your findings with team members.

Conclusion

You've just mastered GPT-5's most powerful capabilities, from reasoning effort controls that optimize performance and cost, to guaranteed structured outputs that eliminate parsing headaches, to seamless multimodal content creation that brings ideas to life visually. You've:
  • Built a custom AI assistant with your own personality and expertise
  • Generated perfectly structured JSON that never breaks your application
  • Created images that intelligently complement your text content
  • Compared reasoning effort levels to balance quality with speed and cost
  • Logged everything in Weave for complete visibility and collaboration
Now it's time to:
  • Integrate these into your production applications
  • Experiment with different assistant roles for your specific use cases
  • Use structured outputs to build reliable AI-powered features
  • Leverage reasoning controls to optimize your API costs
And W&B Weave is here to help.
GPT-5 isn't just another model update, it's a fundamental shift toward more reliable, controllable, and cost-effective AI development. The multimodal workflows and guaranteed outputs you've learned here will become the foundation of next-generation applications.
Let me know in the comments below what great things you plan to build next!

Iterate on AI agents and models faster. Try Weights & Biases today.