The Model Context Protocol (MCP): A guide for AI integration
This guide explores how MCP standardizes AI interactions with external tools and data sources, enabling more efficient AI context integrations.
The Model Context Protocol (MCP) is an open standard that streamlines how large language model (LLM) applications interact with external tools, structured data, and predefined prompts. By providing a unified interface, MCP eliminates the inefficiencies of custom integrations, enabling AI systems to seamlessly access and execute external resources.
Traditional LLM applications often rely on ad-hoc connections to retrieve external information, resulting in inconsistencies, development overhead, and lack of interoperability. MCP solves this by introducing a standardized protocol that allows multiple AI applications to share a common infrastructure for managing external data and executing tasks.
Introduced by Anthropic in November 2024, MCP was designed to break AI systems out of isolation, allowing models to securely access real-time context and interact with external systems. Think of MCP as a “USB port” for AI applications—a universal connection that lets any LLM-powered assistant integrate with data sources, APIs, and external tools without requiring custom code for each service.
This article explores how MCP works and provides a practical guide to building an MCP server that connects to Claude Desktop. This server enables powerful capabilities such as web search and extended reasoning with DeepSeek R1. By the end, you’ll have a working MCP setup that enhances your AI assistant’s ability to interact with external data and tools in a structured, efficient way.

Table of contents
- What is the Model Context Protocol (MCP)?
- Exploring the general architecture of MCP
- MCP primitives
- Putting it all together
- Why MCP isn’t just another interface: Addressing unique challenges
- Tutorial: Building out an MCP server for Claude Desktop
- Breaking down the code
- Configuring Claude for Desktop to use the MCP server
- Testing our server
- Conclusion
What is the Model Context Protocol (MCP)?
LLM applications such as chatbots often rely on external resources to provide accurate and dynamic responses. One way to categorize these resources is using three core primitives:
- Tools, which allow the LLM to execute actions such as querying a database or sending an email;
- Resources, which provide structured data such as documents, logs, or API responses that the model can reference when generating outputs; and
- Prompts, which serve as predefined templates that guide interactions, ensuring consistency and efficiency.
Without a standardized framework for accessing these resources, every LLM-powered application must implement its own approach to integrating external capabilities, leading to unnecessary complexity and redundant development efforts.
MCP addresses this by offering a unified protocol that standardizes how LLM applications interact with external resources. If multiple LLM applications adopt MCP, they can all leverage the same infrastructure for connecting with tools, resources, and prompts, making interoperability between different AI-driven applications much easier.
This is particularly important as AI adoption grows. Without a common standard, organizations would have to build and maintain custom integrations for each application separately, which is inefficient and prone to errors. With MCP, any LLM-based application can interact with external data sources in a consistent manner, allowing developers to build once and deploy across multiple AI platforms. This improves scalability, reduces maintenance overhead, and ensures that AI applications can access the information they need in a secure, predictable way.
Exploring the general architecture of MCP
MCP’s architecture follows a client-server model, enabling structured, context-aware interactions between LLM applications and external data sources. Unlike traditional API integrations that rely on direct function calls or database queries, MCP organizes its infrastructure into three key roles: hosts, clients, and servers, each playing a distinct part in ensuring seamless data exchange.
- Host: The host is the application embedding the LLM, such as a chatbot, IDE assistant, or other software where an LLM processes user input. The host determines when to query external resources, execute tools, or use structured prompts, but delegates the actual data retrieval and execution to the client.
- Client: Acting as an intermediary between the host and MCP’s external resources, the client manages data flow and tool execution. For instance, in Claude Desktop, the MCP client selects and injects relevant context—such as documents, API results, or database records—into the LLM’s context window. It also routes tool calls, ensuring the model’s requests are executed correctly.
- Server: The server provides access to resources (structured data), tools (external actions), and prompts (standardized interaction templates). An MCP server could be a local integration granting an LLM access to a filesystem, a cloud-based system retrieving real-time financial data, or a business application enabling workflow automation. The server ensures information is available when needed, in a structured format, without requiring complex queries.

By breaking down interactions into these distinct components, MCP introduces a layer of abstraction that simplifies LLM integrations. Instead of requiring each application to implement custom logic for retrieving data or executing external actions, MCP standardizes these interactions, making it easier to scale across different tools and environments. This separation also improves security and control, as the client can enforce access restrictions while ensuring that only relevant data is provided to the LLM.
MCP primitives
MCP defines several primitives that structure how clients and servers interact. This article focuses primarily on server-side usage, but first, let’s briefly outline the key client-side primitives:
- Roots act as controlled entry points, specifying which files, databases, or services an MCP server can access. This ensures that applications only expose necessary data sources, maintaining security and control over what information is available to LLMs.
- Sampling lets an MCP server request LLM completions through the client. Instead of being limited to a single static output, the server can ask the client’s model to generate or refine responses with additional context or constraints, while the host retains control over model access. This enhances response quality, adaptability, and relevance in AI interactions.
On the server side, MCP introduces three primary primitives that define how LLM applications interact with external systems in a structured and predictable way:
- Tools allow LLMs to execute external functions, making them the primary mechanism for performing actions beyond text-based reasoning. Unlike resources, which provide passive data, or prompts, which structure interactions, tools enable an LLM to retrieve live data, modify databases, or execute computations. For example, an LLM-powered coding assistant might run test cases, or a customer support bot might create a new support ticket. Since tools can produce side effects, they expand the LLM’s capabilities beyond static information retrieval, making AI applications more actionable.
- Resources provide structured data—such as files, logs, or API responses—that an LLM can reference during generation. Unlike tools, resources do not trigger external computations. Instead, they supply relevant information that seamlessly integrates into the model’s context, reducing unnecessary complexity while ensuring consistent access to up-to-date information. For example, a financial report or system logs can be provided as a resource, automatically influencing model responses.
- Prompts act as predefined templates that structure interactions without triggering external actions. Similar to “fill-in-the-blank” templates, prompts provide a framework for task-specific consistency. For example, a summarization prompt may contain predefined instructions with a placeholder for user input. Prompts can also define a chatbot’s persona, tone, or behavior, similar to Custom GPTs, shaping responses dynamically without modifying external systems.
By establishing these standardized primitives, MCP ensures that LLM-powered applications can interact with external tools and data sources efficiently, securely, and predictably.
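To make this concrete, here is a minimal, hypothetical sketch of what declaring one of each server-side primitive looks like with the Python SDK’s FastMCP helper (the same helper used in the tutorial below). The weather tool, config resource, and bug-report prompt are illustrative examples, not part of this article’s server:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("example_server")

# Tool: an action the LLM can invoke, possibly with side effects
@mcp.tool()
def weather_lookup(city: str) -> str:
    return f"Forecast for {city}: 15°C, light rain"  # stubbed result

# Resource: passive, structured data the client can inject as context
@mcp.resource("file://app-config")
def app_config() -> str:
    return '{"region": "eu-west", "units": "metric"}'

# Prompt: a reusable fill-in-the-blank interaction template
@mcp.prompt()
def bug_report(description: str) -> str:
    return f"Write a structured bug report for the following issue:\n{description}"

if __name__ == "__main__":
    mcp.run(transport="stdio")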
Putting it all together
Putting it together, here’s how MCP functions in practice when you ask an AI a question that requires external data or actions:
1. Capability discovery: The MCP client first asks the server to describe what it offers – i.e. it fetches the list of available tools, resources, or prompt templates that the server can provide. The AI model (via its host app) is made aware of these capabilities.
2. Augmented prompting: The user’s query (and other context) is sent to the AI model along with descriptions of the server’s tools/resources. In effect, the model now “knows” what it could do via the server. For example, if the user asks “What’s the weather tomorrow?”, the prompt to the model includes a description of a “Weather API tool” that the server exposes.
3. Tool/resource selection: The AI model analyzes the query and the available MCP tools/resources, and decides if using one is necessary. If so, it responds in a structured way (per the MCP spec) indicating which tool or resource it wants to use. In our weather example, the model might decide to call the “Weather API” tool provided by the server to get up-to-date info.
4. Server execution: The MCP client receives the model’s request and invokes the corresponding action on the MCP server (e.g. executes the weather API call through the server’s code). The server performs the action – such as retrieving data from a database or calling an external API – and then returns the result to the client.
5. Response generation: The result from the server (say, the weather forecast data) is handed back to the AI model via the client. The model can now incorporate this data into its answer. It then generates a final response to the user (e.g. “Tomorrow’s forecast is 15°C with light rain.”) based on both its own knowledge and the freshly fetched information. The user sees an answer that was enriched by the model’s ability to seamlessly pull in external info during the conversation.
Under the hood, this flow is enabled by JSON messages passing between client and server, but from a developer’s perspective MCP abstracts away the low-level details. One simply implements an MCP server following the spec (or uses a pre-built one), and any AI application that supports an MCP client can immediately leverage those new tools and data. Anthropic has provided SDKs in multiple languages (Python, TypeScript, Java/Kotlin) to make building MCP servers or integrating clients easier. For example, writing a new connector for, say, a custom SQL database involves implementing a small MCP server. The heavy lifting of how the AI and server communicate is handled by the protocol; developers just define what the server can do, in terms of prompts, resources, and tools.
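For illustration, here is roughly what two of those JSON-RPC messages look like, written out as Python dictionaries. The tool name and arguments are made up for the weather example above, and the exact fields are defined by the MCP specification:

# Step 1 - capability discovery: the client asks the server which tools it offers
list_tools_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# Step 4 - server execution: the client forwards the model's chosen tool call
call_tool_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "get_weather",  # hypothetical tool name
        "arguments": {"city": "Berlin", "day": "tomorrow"},
    },
}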
Why MCP isn’t just another interface: Addressing unique challenges
One of MCP’s strongest advantages is how naturally it fits into AI workflows: because it was created with language model agents in mind, it adapts well to how AI systems reason, retrieve information, and execute actions. And because MCP is a standard, it enables a community-driven approach to AI integrations. Developers who build MCP-compatible tools or data sources are creating components that any AI system following the protocol can immediately use. This is a major shift from traditional API integrations, where each AI tool requires its own custom-built connection to every external service.
By unifying how AI applications communicate with external data and tools, MCP removes the need for redundant work. Instead of reinventing the wheel with every integration, developers can contribute to and benefit from a shared ecosystem of pre-built, reusable components.
Traditional API-based integrations require developers to custom-build connectors for each system they want to integrate with. Because APIs are designed differently across services, every new integration requires custom authentication, request formatting, and error handling. This leads to fragmented, one-off solutions that aren’t easily reusable across different AI applications.
MCP eliminates this inefficiency by providing a standardized way for AI models to interact with external tools and data. Instead of manually writing new connectors for each system, developers can rely on a growing ecosystem of MCP-compatible tools that work out of the box. Once an AI system supports MCP, it can instantly interact with any service that follows the protocol—no additional custom coding required.
Tutorial: Building out an MCP server for Claude Desktop
Now we will implement an MCP server that works with the Claude Desktop app to add extra functionality. Specifically, we will add web search, advanced reasoning with DeepSeek R1, and structured text summarization using the prompts primitive.
For this tutorial, we will use uv, which is a fast package manager and environment management tool for Python that simplifies dependency management and improves performance over traditional tools like pip and venv. If you haven’t installed it yet, you can do so with:
curl -LsSf https://astral.sh/uv/install.sh | sh
Or, if you are on Windows:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Make sure to restart your terminal afterwards to ensure that the uv command gets picked up. Now we will create a setup script for our server. You can create a file called setup.sh and add the following contents to it:
# Create a new directory for the MCP server project
uv init server
cd server

# Create virtual environment and activate it
uv venv
source .venv/bin/activate

# Install dependencies
uv add "mcp[cli]" httpx weave openai

# Create the server file
touch server.py
For Windows:
# Create a new directory for the MCP server project
uv init server
cd server

# Create virtual environment and activate it
uv venv
.venv\Scripts\Activate

# Install dependencies
uv add "mcp[cli]" httpx weave openai

# Create the server file
New-Item -ItemType File server.py
Now run the script using the following command (Windows users can instead run the PowerShell commands above directly):
sh -x setup.sh
Now we can write our server.py file, which will implement the MCP server. This server will integrate with Claude Desktop to provide additional functionality, including web search, advanced reasoning with DeepSeek R1, and structured text summarization using prompts. Additionally, it will demonstrate how to expose resources, allowing the LLM to access external data in a structured way.
To implement this, open server.py in your preferred text editor and add the following code:
from typing import Any
import os
from mcp.server.fastmcp import FastMCP
from openai import OpenAI
import weave

weave.init("mcp_tools")

# Initialize FastMCP server
mcp = FastMCP("mcp_server")

# Initialize OpenAI client for web search
search_client = OpenAI(api_key="your_openai_key")

# Initialize DeepSeek R1 client
r1_api_key = "your_deepseek_key"
r1_client = OpenAI(api_key=r1_api_key, base_url="https://api.deepseek.com")

# External function for web search
@weave.op
async def run_web_search(query: str) -> str:
    try:
        response = search_client.responses.create(
            model="gpt-4o",
            tools=[{"type": "web_search_preview"}],
            input=query
        )
        return response.output_text
    except Exception as e:
        return f"Search failed: {str(e)}"

# MCP Tool: Web Search
@mcp.tool()
async def web_search(query: str) -> str:
    try:
        return await run_web_search(query)
    except Exception as e:
        return f"Search failed: {str(e)}"

# External function for R1 reasoning
@weave.op
async def run_r1_reasoning(prompt: str) -> str:
    try:
        if not r1_api_key:
            return "Error: DeepSeek API key not set."
        response = r1_client.chat.completions.create(
            model="deepseek-reasoner",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"DeepSeek-R1 reasoning failed: {str(e)}"

# MCP Tool: R1 Reasoning
@mcp.tool()
async def r1_reasoning(prompt: str) -> str:
    return await run_r1_reasoning(prompt)

# External function for retrieving memory preferences
@weave.op
def get_writing_preferences() -> str:
    preferences = {
        "writing_preferences": {
            "headers": "unstyled - no bold headers in articles",
            "terminology": "use 'LLM' - everyone knows what it means",
            "avoid_words": ["crucial", "paramount", "critical", "essential", "invaluable", "In the provided script"],
            "formatting": "no bullets or lists in articles",
            "commands": {
                "--c": "only write necessary changes, not full code",
                "--q": "answer concisely and quickly",
                "--a": "write full code from beginning to end",
                "--nb": "rewrite without bullets or lists"
            },
            "absolute_dont": ["use canvas for writing code", "say 'structured workflow'"]
        }
    }
    return str(preferences)

# MCP Resource: Writing Preferences
@mcp.resource("file://memory")
def get_memory() -> str:
    return get_writing_preferences()

# External function for summarization prompt
@weave.op
def generate_summarization_prompt(text: str) -> str:
    FIXED_QUESTIONS = """What is the primary objective of this research?
What methodologies or algorithms are proposed or evaluated?
What datasets or experimental setups are used in this study?
What are the key findings and contributions of this research?
What are the implications of these findings for the broader field of AI?
What limitations or challenges are acknowledged by the authors?
What are the proposed future directions or next steps in this research?"""
    return f"Summarize the following text by answering these questions:\n{FIXED_QUESTIONS}\n\n{text}"

# MCP Prompt: Summarization
@mcp.prompt()
def summarization_prompt(text: str) -> str:
    return generate_summarization_prompt(text)

# Run the MCP Server
if __name__ == "__main__":
    mcp.run(transport='stdio')
The server includes multiple tools, such as a web search function using OpenAI’s API, an advanced reasoning capability powered by DeepSeek R1, and a structured summarization prompt that guides the LLM in generating concise summaries based on predefined research questions.
Breaking down the code
Initializing MCP and external services
At the start, we initialize the FastMCP server, which acts as the central component for managing interactions between the LLM and external tools. We also set up two external clients:
- OpenAI client (for web search)
- DeepSeek R1 client (for advanced mathematical reasoning)
These services provide AI with real-time access to information and computational capabilities.
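Note that the tutorial code hardcodes API keys for simplicity. A safer variant, assuming you export OPENAI_API_KEY and DEEPSEEK_API_KEY in the environment that launches the server, reads them with os.environ (the os import is already present in server.py):

import os
from openai import OpenAI

# Read keys from the environment instead of hardcoding them in the source
search_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

r1_api_key = os.environ.get("DEEPSEEK_API_KEY", "")
r1_client = OpenAI(api_key=r1_api_key, base_url="https://api.deepseek.com")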
The web search tool
The run_web_search function is an external utility registered with @weave.op, making it observable in Weave's logging system. This function queries OpenAI's web search API and returns the result. The MCP tool web_search calls this function asynchronously, allowing the LLM to trigger real-time searches when needed.
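Because run_web_search is a plain async function, you can sanity-check it outside of MCP entirely. A hypothetical scratch file placed next to server.py might look like this (it assumes a valid OpenAI key is configured):

# scratch_test.py - quick standalone check of the search helper, bypassing MCP
import asyncio
from server import run_web_search

print(asyncio.run(run_web_search("Latest updates on AI research in 2025")))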
Advanced reasoning with DeepSeek R1
Similar to web search, the run_r1_reasoning function interfaces with DeepSeek R1's API for complex mathematical and logical reasoning tasks. The corresponding MCP tool, r1_reasoning, ensures that the LLM can request structured reasoning support when it encounters complex calculations or multi-step logic.
Memory and writing preferences
The MCP resource get_memory provides a structured way for the LLM to retrieve predefined user preferences. This includes:
- Formatting rules (helpful for articles)
- Forbidden words and phrases (also helpful for writing)
- Special command shortcuts (--c, --q, --a, etc.), which work a bit like bash aliases but for LLMs
Instead of manually injecting preferences into each prompt, the MCP client can retrieve this resource dynamically, ensuring consistency across interactions.
Summarization prompt
The generate_summarization_prompt function defines a fixed set of research questions, structuring the AI's response when summarizing documents. The corresponding MCP prompt, summarization_prompt, allows the LLM to standardize its approach to summarization tasks, ensuring that the generated output aligns with specific research objectives.
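If you want to see exactly what the template expands to, you can call the helper directly in a quick scratch snippet (purely for inspection, not part of the server itself):

from server import generate_summarization_prompt

# Prints the fixed research questions followed by the text to summarize
print(generate_summarization_prompt("MCP is an open standard introduced by Anthropic in late 2024..."))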
Additionally, we integrate Weave, which acts as a logging and observability tool to track interactions with the MCP server. By using Weave, we can monitor tool executions and data flow between Claude Desktop and the MCP server, which helps with debugging and optimization. With this implementation, our MCP server acts as a bridge between Claude Desktop and external data sources, offering structured ways for the LLM to retrieve and process information without requiring complex custom logic in each application.
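Before wiring the server into Claude Desktop, you can optionally smoke-test it from a small standalone script using the MCP Python SDK’s client helpers. This is a sketch rather than part of the original setup, and the exact client API can vary between SDK versions, so treat it as a starting point:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch server.py over stdio, the same transport Claude Desktop uses
    params = StdioServerParameters(command="uv", args=["run", "server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Tools:", [tool.name for tool in tools.tools])
            result = await session.call_tool("web_search", arguments={"query": "MCP protocol"})
            print("Search result:", result)

if __name__ == "__main__":
    asyncio.run(main())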
Configuring Claude for Desktop to use the MCP server
To enable Claude for Desktop to communicate with your MCP server, you need to configure it to recognize the server. If you haven’t installed Claude for Desktop yet, you can download the latest version here. If you already have it installed, make sure it is updated to the latest version.
Editing the configuration file
Claude for Desktop requires a configuration file that specifies which MCP servers it should connect to. On macOS, you can find or create this file at ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows, it lives at %APPDATA%\Claude\claude_desktop_config.json.
macOS/Linux:
If you have VS Code installed, you can open the configuration file with:
code ~/Library/Application\ Support/Claude/claude_desktop_config.json
Windows:
code $env:APPDATA\Claude\claude_desktop_config.json
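The file itself needs an entry telling Claude for Desktop how to launch our server. A minimal example, assuming the standard mcpServers format and using uv to run server.py (replace /ABSOLUTE/PATH/TO/server with the absolute path of the project directory created earlier):

{
  "mcpServers": {
    "mcp_server": {
      "command": "uv",
      "args": [
        "--directory",
        "/ABSOLUTE/PATH/TO/server",
        "run",
        "server.py"
      ]
    }
  }
}

Save the file; restarting Claude for Desktop, as we do in the next section, lets it pick up the new server.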
Testing our server
Now that we have everything in place, we are ready to test our server. Begin by restarting your Claude Desktop app, which will connect it to our server. To test the web search tool, we can send a query through the MCP server and observe how it retrieves relevant results, for example: "Latest updates on AI research in 2025." After sending the query, you should see a popup asking for permission to use the search tool we built:

After clicking confirm, Claude will use our search tool and retrieve results. Impressively, Claude is able to use the search tool multiple times to focus on separate sectors of AI!

As we can see, Claude successfully uses the web search tool and returns up-to-date information. Since we used Weave in our script, the tool calls are also logged in our Weave dashboard:

Next, we will test DeepSeek R1 on an advanced reasoning task. This tool is ideal for handling complex mathematical problems and multi-step reasoning. To demonstrate this, we will pass in a math problem: "If a train leaves city A at 60 mph and another train leaves city B at 75 mph toward each other, when will they meet if the distance between them is 300 miles?"


Through our DeepSeek R1 tool, Claude hands the problem to R1 to calculate an answer (for reference, the trains close the 300-mile gap at a combined 135 mph, so they meet after 300/135 ≈ 2.22 hours, roughly 2 hours and 13 minutes). This shows how MCP lets us add extremely powerful features like search and custom models inside Claude Desktop. Now, we will move on to testing the custom prompt and resource that we added inside our server.py file.
You might have noticed that inside Claude Desktop there are a few new icons in the chat input window; these correspond to our new tools and our custom prompt and resource:

When we click the “Attach from MCP” icon, we will see a popup menu showing the available prompts and resources.


When we click “summarization_prompt,” a popup appears with a text field where we can add the text we would like to summarize.

After adding the text and clicking submit, we will see a new text file attached to our chat, containing the summarization prompt along with the text we pasted into the field.

Now, we can simply send the chat to Claude, and it will generate a summary for us!

In addition, we can attach our memory resource to a chat by choosing it from the “Attach from MCP” menu:

This allows us to easily manage memory needed for a given chat conversation.
Conclusion
MCP provides a structured, standardized framework for connecting AI models with external data sources, tools, and prompts, eliminating the need for fragmented, ad-hoc integrations. By unifying how LLM applications interact with external systems, MCP enhances scalability, security, and interoperability, making AI-powered applications more reliable and adaptable.
In this guide, we explored MCP’s architecture, detailing how hosts, clients, and servers work together to enable dynamic, context-aware AI interactions. We also examined how MCP improves upon traditional, one-off API integrations by offering a protocol specifically designed for LLM-driven workflows.
The tutorial demonstrated a real-world implementation of an MCP server, integrating capabilities such as web search, advanced reasoning with DeepSeek R1, and structured summarization prompts. By configuring Claude Desktop to leverage MCP, developers can seamlessly extend AI applications with custom tools, memory management, and real-time data access.
MCP is more than just another interface—it’s a foundational step toward making AI models more interactive, flexible, and practical. Whether you're enhancing chatbots, integrating enterprise workflows, or embedding AI into IoT systems, MCP provides the scalability and efficiency needed to build robust, AI-powered applications.