GPT-5 Python quickstart using the OpenAI API
Getting set up and running GPT-5 on your machine in Python using the OpenAI API.
The moment GPT-5 launched, we knew we had to dive in and play with some of the new API capabilities. GPT-5 introduces reasoning effort controls, native structured outputs, built-in image generation, and seamless multimodal workflows that could really change how we build AI applications.
In this tutorial, you'll learn about GPT-5's most powerful features in under 15 minutes (seriously). We'll explore the new Responses API, create guaranteed JSON outputs with Pydantic, generate images that perfectly complement your text, and compare different reasoning effort levels, all while logging everything with W&B Weave for complete visibility into your AI workflow. It's also available on GitHub here.
By the end, you should have some fresh ideas about building or upgrading your production-ready applications. We'll talk about some of the improvements in GPT-5 up front but if you're eager to get straight to the code, just click the button below.
Jump to the tutorial
Table Of Contents
Getting started with the GPT-5 API
Creating your OpenAI API key
OpenAI pricing for GPT-5
W&B Weave
Tutorial: Getting started with GPT-5
Step 1: The OpenAI GPT-5 API key
Step 2: Installing OpenAI and W&B Weave
Step 3: Import libraries and pass the OpenAI API key
Step 4: Crafting your first GPT-5 call
Step 5: Text generation with the GPT-5 Responses API
Step 6: Viewing GPT-5 inputs and outputs in W&B Weave
Step 7: GPT-5 reasoning control - Low, Medium, and High effort testing
Step 8: Structured JSON output with your custom assistant
Step 9: Image generation with GPT-5
Step 10: Comprehensive tutorial summary
Step 11: Reviewing your work in Weave
Conclusion
Related Reading
If you're just getting started and don't yet have your machine set up to run Python, I've created a quick tutorial here that will have you up-and-running in just a few minutes.
Once you're set up, you're going to need an OpenAI API key. So let's start there:
Getting started with the GPT-5 API
Before you can use GPT-5 via the API, you'll need a valid API key. Just head to OpenAI’s product page and click "Start building" to begin the signup flow.

After signing up, verify your email and phone number to unlock the dashboard. That's where you can manage your projects and keys.
Creating your OpenAI API key
In the dashboard, click the current project name (e.g., Default project) in the top-left.

Once you have a project, go to settings:

And then to your API keys:

Inside your project, navigate to API keys and click "Create key." You'll be asked to give it a name.

Don't save your key unless you have somewhere highly secure to keep it (a notepad doc on your desktop is not). Once you've used it momentarily, close the notepad doc without saving.
OpenAI pricing for GPT-5
You get a five-dollar credit upon signing up, which will get you surprisingly far if you're just experimenting! As of this writing (August 7, 2025), here's the pricing for GPT-5:

W&B Weave
With W&B Weave, every call to your decorated functions is captured automatically—inputs, outputs, execution metadata, and environment details are all recorded without extra code. Just add the @weave.op() decorator to your generation function, and Weave will handle the rest. After execution, visit the Weave dashboard to explore interactive traces: view prompts alongside model responses, examine timing and resource usage, compare runs side by side, and annotate or share results with your team, all in one unified interface.
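For example, here's a minimal sketch of what that looks like (the project name is arbitrary; we'll use 'gpt-5-tutorial' again in Step 3):
import weave

weave.init('gpt-5-tutorial')  # start a Weave project; traces land here

@weave.op()  # this one decorator is all the logging code you need
def greet(name: str) -> str:
    return f"Hello, {name}!"

greet("world")  # inputs, output, and timing are captured automatically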
Tutorial: Getting started with GPT-5
You can follow along in Jupyter Notebook, Colab, or any Python REPL. If you need a quick Python setup, see this tutorial. Otherwise, let’s jump into the code.
One of the great things about Jupyter notebooks is that, like Colab, you can add your comments and context in markdown cells:

Step 1: The OpenAI GPT-5 API key
Begin by exporting your OpenAI API key as an environment variable:
%env OPENAI_API_KEY=KEY
Replace KEY with your actual key. When you run this in Jupyter, it will echo the key back, confirming it's set.
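If you'd rather not have the key echoed into your notebook output (useful if you plan to share it), here's a small sketch using Python's built-in getpass instead of the %env magic:
import os
from getpass import getpass

# Prompt for the key without printing it to the cell output
os.environ["OPENAI_API_KEY"] = getpass("Paste your OpenAI API key: ")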

Step 2: Installing OpenAI and W&B Weave
Next, install both the official OpenAI client and Weights & Biases Weave:
!pip install openai weave
and run the cell. This is also a good time to sign up for Weights & Biases, so you won't have to interrupt your workflow in a couple of minutes, when you're further along.
Wait for the cell to finish—the [*] indicator will turn into a number.

At this point we've installed the libraries, but we still need to import them for use.
If you're new to Python: when we install a library, we simply grab the code. When we import it, we make that library available for use.
I read a great explanation once: installing is like uploading an image to a server; importing is like embedding that image on a page once it's available.
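In code terms, a two-step illustration:
# Installing (done once, in the shell or a notebook cell):
#   !pip install openai
# Importing (done in every script or session that uses the library):
import openai
print(openai.__name__)  # the library is now available for use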
Step 3: Import libraries and pass the OpenAI API key
Next, load your packages and set up both Weave and the OpenAI client:
import os
import weave
from openai import OpenAI

# Initialize Weave
weave.init('gpt-5-tutorial')

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
Step 4: Crafting your first GPT-5 call
Prompt design remains critical. Start by defining an assistant role and the user’s request:
gpt_assistant_prompt = "You are a " + input("Who should I be, as I answer your prompt?")
gpt_user_prompt = input("What do you want me to do?")

Step 5: Text generation with the GPT-5 Responses API
GPT-5 introduces several powerful features:
- Responses API: A more flexible way to interact with models
- Reasoning control: Adjust computational effort for complex problems
- Enhanced role system: Developer, user, and assistant message roles
- Structured outputs: Generate valid JSON conforming to schemas
- Image generation: Built-in image creation capabilities
The new Responses API offers more control and better structured outputs than previous GPT models.
Additionally, the functions below use the @weave.op() decorator so Weave will log everything.
@weave.op()
def generate_with_custom_prompts(assistant_prompt: str, user_prompt: str, reasoning_effort: str = "low") -> dict:
    """Generate text using your custom prompts with GPT-5's Responses API"""
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": reasoning_effort},
        input=[
            {"role": "developer", "content": assistant_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return {
        "output_text": response.output_text,
        "assistant_role": assistant_prompt,
        "user_request": user_prompt,
        "reasoning_effort": reasoning_effort
    }

# Use your custom prompts
basic_result = generate_with_custom_prompts(gpt_assistant_prompt, gpt_user_prompt, "medium")
print("Generated response:", basic_result["output_text"])
When run, these appear as:

You'll notice we passed some additional parameters with the request. These are passed to Weave for logging and can be easily compared with other prompts and configurations.

Step 6: Viewing GPT-5 inputs and outputs in W&B Weave
After running the final cell, your notebook output will include links to each Trace. Clicking a link opens the Weave dashboard, where you can inspect inputs, outputs, code versions, execution time, and peer comments—all in rich visual form.

You'll see something like the following:

You can also click the tabs in the top right to get additional details including:
The code that generated that Trace:

Feedback on the Trace from your peers or others.
And a summary of the environment that was running:
Step 7: GPT-5 reasoning control - Low, Medium, and High effort testing
Now we'll explore one of GPT-5's most powerful features: reasoning effort control. This allows you to adjust how much computational power the model dedicates to thinking through your request. Think of it like asking someone to give you a quick answer versus asking them to really think deeply about the problem.
The three effort levels work differently:
- Low effort: Fast and efficient, good for straightforward tasks
- Medium effort: Balanced approach that works well for most use cases
- High effort: Deep thinking for complex problems that require careful reasoning
Let's see how our custom assistant performs at each level:
@weave.op()
def generate_with_reasoning_levels(assistant_prompt: str, user_prompt: str) -> dict:
    """Compare your prompts across different reasoning effort levels"""
    results = {}
    efforts = ["low", "medium", "high"]
    for effort in efforts:
        print(f"Generating with {effort} reasoning effort...")
        response = client.responses.create(
            model="gpt-5",
            reasoning={"effort": effort},
            input=[
                {"role": "developer", "content": assistant_prompt},
                {"role": "user", "content": user_prompt}
            ]
        )
        results[effort] = {
            "output": response.output_text,
            "effort_level": effort
        }
    return results

# Compare reasoning levels with your prompts
reasoning_comparison = generate_with_reasoning_levels(gpt_assistant_prompt, gpt_user_prompt)
print("\n🧠 Your Prompts with Different Reasoning Levels:")
for effort, result in reasoning_comparison.items():
    print(f"\n--- {effort.upper()} EFFORT ---")
    print(result["output"])
    print("-" * 50)
We now get our results for all three reasoning levels sent to Weave, where we can keep them logged for easy comparison:

You'll notice that higher effort levels typically produce more thorough, nuanced responses, but they take longer and cost more. This is particularly useful when your assistant role requires deep expertise or complex problem-solving.
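If you want to quantify that tradeoff for your own prompts, here's a minimal sketch (not part of the logged tutorial functions above) that times each effort level with Python's standard time module:
import time

# Rough latency comparison across effort levels (reuses the client and
# prompts defined earlier; wall-clock time, so results will vary per run)
for effort in ["low", "medium", "high"]:
    start = time.perf_counter()
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": effort},
        input=[
            {"role": "developer", "content": gpt_assistant_prompt},
            {"role": "user", "content": gpt_user_prompt},
        ],
    )
    elapsed = time.perf_counter() - start
    print(f"{effort}: {elapsed:.1f}s, {len(response.output_text)} chars")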
Step 8: Structured JSON output with your custom assistant
One of GPT-5's breakthrough features is Structured Outputs - the ability to guarantee that the model's response will be valid JSON that conforms to a specific schema. This is incredibly powerful for applications that need reliable, parseable data.
The OpenAI Responses API supports structured outputs using the text_format parameter with Pydantic models or JSON schemas. Let's explore both approaches:
Method 1: Using Pydantic models
Uses OpenAI's native structured outputs with Python type definitions for guaranteed schema compliance.
First, let's define our Pydantic model and function separately to avoid any parsing issues:
# Import required libraries
from pydantic import BaseModel
from typing import List

# Define the Pydantic model
class AssistantResponse(BaseModel):
    response: str
    key_points: List[str]
    confidence_level: str
    follow_up_questions: List[str]

# Define the function - simplified for Weave compatibility
def generate_structured_with_pydantic_simple(assistant_prompt, user_prompt):
    """Generate structured output using Pydantic models - Weave-compatible version"""
    response = client.responses.parse(
        model="gpt-5",
        input=[
            {"role": "developer", "content": assistant_prompt},
            {"role": "user", "content": user_prompt}
        ],
        text_format=AssistantResponse
    )
    result = {
        "structured_data": response.output_parsed.model_dump(),
        "raw_output": response.output_text,
        "assistant_role": assistant_prompt,
        "validation_status": "valid"
    }
    return result

# Alternative: Manual logging approach to avoid Weave parsing issues
def log_and_call_pydantic(assistant_prompt, user_prompt):
    """Manually log the call to avoid Weave parsing errors"""
    print(f"📝 Calling Pydantic structured output...")
    print(f"   Assistant: {assistant_prompt[:50]}...")
    print(f"   User: {user_prompt[:50]}...")
    result = generate_structured_with_pydantic_simple(assistant_prompt, user_prompt)
    print(f"✅ Success! Generated {len(result['structured_data'])} structured fields")
    return result

# Test with Pydantic approach
print("Testing Pydantic structured output...")
try:
    pydantic_result = log_and_call_pydantic(gpt_assistant_prompt, gpt_user_prompt)
    print("✅ Pydantic Structured Output:")
    print(pydantic_result["structured_data"])
    print(f"Validation Status: {pydantic_result['validation_status']}")
except Exception as e:
    print(f"❌ Pydantic method failed: {e}")
    print("Falling back to JSON schema method...")
Which will print:

Method 2: Using JSON schema
Uses OpenAI's native structured outputs with explicit JSON schema definitions for precise control.
import json

def generate_structured_with_json_schema(assistant_prompt, user_prompt):
    """Generate structured output using JSON schema"""
    schema = {
        "type": "object",
        "properties": {
            "response": {
                "type": "string",
                "description": "Main response from the assistant"
            },
            "key_points": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Key points or takeaways"
            },
            "confidence_level": {
                "type": "string",
                "enum": ["low", "medium", "high"],
                "description": "Assistant's confidence in the response"
            },
            "follow_up_questions": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Suggested follow-up questions"
            }
        },
        "additionalProperties": False,
        "required": ["response", "key_points", "confidence_level", "follow_up_questions"]
    }
    response = client.responses.create(
        model="gpt-5",
        input=[
            {"role": "developer", "content": assistant_prompt},
            {"role": "user", "content": user_prompt}
        ],
        text={
            "format": {
                "type": "json_schema",
                "name": "assistant_response",  # A name is required for the json_schema format
                "strict": True,
                "schema": schema
            }
        }
    )
    parsed_data = json.loads(response.output_text)
    return {
        "structured_data": parsed_data,
        "raw_output": response.output_text,
        "assistant_role": assistant_prompt,
        "schema_used": schema,
        "validation_status": "valid"
    }

# Test with JSON schema approach
print("Testing JSON schema structured output...")
try:
    schema_result = generate_structured_with_json_schema(gpt_assistant_prompt, gpt_user_prompt)
    print("✅ JSON Schema Structured Output:")
    print(schema_result["structured_data"])
    print(f"Validation Status: {schema_result['validation_status']}")
except Exception as e:
    print(f"❌ JSON Schema method failed: {e}")
    print("This may be due to API differences - the Pydantic method above is more reliable.")
Which yields:

Or in Weave:

Method 3: Fallback with prompt engineering
Uses careful prompting and validation when native structured outputs aren't available or working.
@weave.op()
def generate_structured_fallback(assistant_prompt: str, user_prompt: str) -> dict:
    """Fallback method using careful prompt engineering"""
    json_instructions = '''You must respond with valid JSON in exactly this format:
{
    "response": "your main response here",
    "key_points": ["point 1", "point 2", "point 3"],
    "confidence_level": "low|medium|high",
    "follow_up_questions": ["question 1", "question 2"]
}
Output ONLY the JSON, no additional text.'''
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "medium"},
        input=[
            {"role": "developer", "content": f"{assistant_prompt}\n\n{json_instructions}"},
            {"role": "user", "content": user_prompt}
        ]
    )
    try:
        parsed_data = json.loads(response.output_text)
        return {
            "structured_data": parsed_data,
            "raw_output": response.output_text,
            "validation_status": "valid"
        }
    except json.JSONDecodeError as e:
        return {
            "structured_data": None,
            "raw_output": response.output_text,
            "validation_status": f"invalid JSON: {str(e)}",
            "error": str(e)
        }

# Try methods in order of preference
print("🔧 Testing Structured Output Methods:")

# Try Pydantic first (most reliable)
try:
    result = generate_structured_with_pydantic_simple(gpt_assistant_prompt, gpt_user_prompt)
    print("✅ Success with Pydantic!")
    structured_result = result
except Exception:
    print("⚠️ Pydantic failed, trying JSON schema...")
    # Try JSON schema
    try:
        result = generate_structured_with_json_schema(gpt_assistant_prompt, gpt_user_prompt)
        print("✅ Success with JSON schema!")
        structured_result = result
    except Exception:
        print("⚠️ JSON schema failed, using prompt engineering fallback...")
        # Fallback to prompt engineering
        structured_result = generate_structured_fallback(gpt_assistant_prompt, gpt_user_prompt)
        print("📝 Using prompt engineering approach")

print(f"\nFinal structured result:")
if structured_result.get("structured_data"):
    print(structured_result["structured_data"])
    print(f"Status: {structured_result['validation_status']}")
else:
    print("Failed to generate valid structured output")
    print(f"Raw output: {structured_result['raw_output']}")
Which gives us:

And in Weave:

Why Use Structured Outputs?
Methods 1 & 2 (Native Structured Outputs) vs Method 3 (Prompt Engineering):
Native Structured Outputs give you:
- Guaranteed results: The AI must follow your exact schema - no parsing errors or missing fields
- No validation needed: Skip the hassle of checking if the JSON is valid or complete
- Clear error handling: Safety refusals are clearly marked, not hidden in malformed responses
Simple API Usage:
- Method 1: client.responses.parse(text_format=YourPydanticModel)
- Method 2: client.responses.create(text={"format": {"type": "json_schema", ...}})
Key Rules:
- All fields must be required (use a union with null for optional fields; see the sketch below)
- Always include "additionalProperties": false
- Schemas can't exceed 5000 properties or 10 nesting levels
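To make the first rule concrete, here's a minimal, hypothetical sketch of an "optional" field under strict structured outputs: declare it as a union with None in Pydantic, so it stays in the schema's required list but may be null. The ProductInfo model is for illustration only and isn't part of the tutorial code:
from typing import Optional
from pydantic import BaseModel

class ProductInfo(BaseModel):  # hypothetical model, for illustration only
    name: str
    price: float
    # Optional[str] serializes to type ["string", "null"]: the field is
    # still required, but the model may return null when there's no code
    discount_code: Optional[str]

parsed = client.responses.parse(
    model="gpt-5",
    input=[{"role": "user", "content": "Invent a sample product."}],
    text_format=ProductInfo,
)
print(parsed.output_parsed)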
Bottom line: Native structured outputs eliminate the guesswork and guarantee your assistant returns exactly the data structure your application needs, every time.
Step 9: Image generation with GPT-5
This is maybe my favorite part (who doesn't like generating images?). As with some previous models, GPT-5 includes built-in image generation capabilities, which means you can create both text and visual content using the same model. Let's take the content we've already generated in previous steps and create an accompanying image for it.
We'll use the structured output from Step 8 and have your assistant create a visual representation:
import base64
from PIL import Image
import io

def generate_image_with_gpt5(prompt: str) -> dict:
    """Generate an image using GPT-5's built-in image generation via tools"""
    try:
        response = client.responses.create(
            model="gpt-5",
            input=prompt,
            tools=[{"type": "image_generation"}],
        )
        # Extract the image data from the response
        image_data = [
            output.result
            for output in response.output
            if output.type == "image_generation_call"
        ]
        if image_data:
            image_base64 = image_data[0]
            # Convert base64 to PIL Image for Weave display
            image_bytes = base64.b64decode(image_base64)
            image = Image.open(io.BytesIO(image_bytes))
            return {
                "image": image,  # PIL Image object for Weave
                "image_base64": image_base64,
                "prompt_used": prompt,
                "status": "success"
            }
        else:
            return {"error": "No image data found in response"}
    except Exception as e:
        return {"error": str(e)}

@weave.op()
def create_image_for_existing_content(assistant_prompt: str, existing_content: dict) -> dict:
    """Generate an image based on previously created content"""
    # Extract the main response from our structured content
    main_response = existing_content.get("structured_data", {}).get("response", "")
    key_points = existing_content.get("structured_data", {}).get("key_points", [])
    if not main_response:
        return {"error": "No existing content found to create image for"}

    # Step 1: Have your assistant create an image prompt based on the existing content
    image_prompt_response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "low"},
        input=[
            {
                "role": "developer",
                "content": f"{assistant_prompt} Now act as a visual artist and create a detailed image description that would perfectly complement the content you previously created. Focus on visual elements, style, mood, and composition. Keep it under 300 characters for optimal image generation."
            },
            {
                "role": "user",
                "content": f"Create an image description for this content you generated:\n\nMain response: {main_response}\n\nKey points: {', '.join(key_points)}\n\nDescribe an image that would enhance and complement this content."
            }
        ]
    )
    image_prompt = image_prompt_response.output_text
    print(f"🎨 Your assistant's image prompt: {image_prompt}")

    # Step 2: Generate the image using GPT-5's tool-based image generation
    image_result = generate_image_with_gpt5(f"Generate an image: {image_prompt}")

    return {
        "original_content": existing_content,
        "image_prompt": image_prompt,
        "image_result": image_result,
        "assistant_role": assistant_prompt,
        "generated_image": image_result.get("image") if "error" not in image_result else None  # PIL Image for Weave
    }

# Use the structured content from Step 8 to create an accompanying image
print("🖼️ Creating an image for your previously generated content...")

# Check if we have structured content from previous steps
if 'pydantic_result' in locals() and pydantic_result.get("structured_data"):
    print(f"Using content: '{pydantic_result['structured_data']['response'][:50]}...'")
    multimodal_result = create_image_for_existing_content(gpt_assistant_prompt, pydantic_result)
    if "error" not in multimodal_result:
        print(f"\n📝 Original Content:")
        print(f"'{multimodal_result['original_content']['structured_data']['response']}'")
        print(f"\n🎨 Image Concept:")
        print(f"'{multimodal_result['image_prompt']}'")
        if "error" not in multimodal_result["image_result"]:
            print(f"✅ Image generated successfully!")
            # Display the image - Weave will show this in the dashboard
            generated_image = multimodal_result["generated_image"]
            if generated_image:
                print("🖼️ Generated image will appear in Weave dashboard")
                # You can also display it in Jupyter if running there
                try:
                    from IPython.display import display
                    display(generated_image)
                except ImportError:
                    pass
        else:
            print(f"❌ Image generation failed: {multimodal_result['image_result']['error']}")
    else:
        print(f"❌ Error: {multimodal_result['error']}")
Which will output:

And log to Weave:

Step 10: Comprehensive tutorial summary
For our final step, we'll create a personalized summary of everything we've accomplished in this tutorial. This is particularly interesting because we'll use the highest reasoning effort level to get the most thoughtful, comprehensive analysis of your experience.
Your custom assistant will reflect on the entire session - from the role you assigned it to the specific request you made - and provide insights about how GPT-5's various features performed in your particular use case.
@weave.op()
def create_personalized_tutorial_summary(assistant_prompt: str, user_prompt: str) -> dict:
    """Generate a personalized summary of your GPT-5 tutorial experience"""
    summary_request = f"""Based on our tutorial session where you were '{assistant_prompt}' and I asked you to '{user_prompt}',
please summarize what we accomplished with GPT-5's features:
1. Basic Responses API with custom role prompts
2. Reasoning effort level comparisons (low/medium/high)
3. Structured JSON output generation
4. Multimodal content creation with image generation
5. Advanced prompt engineering techniques
Reflect on how your specific role affected the outputs and what developers can learn."""
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "high"},  # Use high effort for final summary
        input=[
            {
                "role": "developer",
                "content": f"{assistant_prompt} Provide a thoughtful, comprehensive summary with insights about the tutorial experience."
            },
            {"role": "user", "content": summary_request}
        ]
    )
    return {
        "personalized_summary": response.output_text,
        "original_assistant_role": assistant_prompt,
        "original_user_request": user_prompt,
        "tutorial_features": [
            "Custom role-based prompting",
            "Responses API usage",
            "Reasoning effort controls",
            "Structured JSON outputs",
            "Image generation integration",
            "Multimodal content creation",
            "Comprehensive Weave logging"
        ]
    }

final_summary = create_personalized_tutorial_summary(gpt_assistant_prompt, gpt_user_prompt)
print("🎯 Your Personalized Tutorial Summary:")
print(final_summary["personalized_summary"])
print(f"\n📊 Session Details:")
print(f"Your Assistant Role: {final_summary['original_assistant_role']}")
print(f"Your Request: {final_summary['original_user_request']}")
print(f"Features Explored: {', '.join(final_summary['tutorial_features'])}")
Which gives us:

And logged to Weave:

Step 11: Reviewing your work in Weave
After running through this tutorial or working on your own projects, your Weave dashboard becomes a comprehensive laboratory notebook. Every function decorated with @weave.op() has been automatically logged, creating a rich dataset of your GPT-5 experimentation.
In the dashboard, you'll be able to see:
- Operation traces: Every API call with complete input/output logging
- Performance metrics: Token usage, latency, and reasoning effort costs across different approaches
- Side-by-side comparisons: How your assistant performed with low vs medium vs high reasoning effort
- Multimodal tracking: Text generation calls alongside image creation attempts
- Error handling: Any failed API calls and their error messages
- Structured data validation: JSON outputs with schema compliance verification
This creates an invaluable resource for understanding how to optimize GPT-5 for your specific use cases, comparing costs and performance across different approaches, and sharing your findings with team members.
Conclusion
You've just mastered GPT-5's most powerful capabilities, from reasoning effort controls that optimize performance and cost, to guaranteed structured outputs that eliminate parsing headaches, to seamless multimodal content creation that brings ideas to life visually. You've:
- Built a custom AI assistant with your own personality and expertise
- Generated perfectly structured JSON that never breaks your application
- Created images that intelligently complement your text content
- Compared reasoning effort levels to balance quality with speed and cost
- Logged everything in Weave for complete visibility and collaboration
Now it's time to:
- Integrate these into your production applications
- Experiment with different assistant roles for your specific use cases
- Use structured outputs to build reliable AI-powered features
- Leverage reasoning controls to optimize your API costs
And W&B Weave is here to help.
GPT-5 isn't just another model update; it's a fundamental shift toward more reliable, controllable, and cost-effective AI development. The multimodal workflows and guaranteed outputs you've learned here will become the foundation of next-generation applications.
Let me know in the comments below what great things you plan to build next!
Related Reading
Attribute-Value Extraction With GPT-3 and Weights & Biases
In this article, we learn how to fine-tune OpenAI's GPT-3 for attribute-value extraction from products, looking at the challenges and how to overcome them.
Automating Change Log Tweets with Few-Shot Learning and GPT-3
A text summarization recipe using OpenAI GPT-3's few-shot learning and Weights & Biases
o1 model Python quickstart using the OpenAI API
Getting set up and running the new o1 models in Python using the OpenAI API. We'll be working with o1-preview.