
Every developer building AI agents faces the same question: how should my agent consume tools?
The obvious answer used to be "just use function calling." Define schemas inline, let the model pick tools, handle the results. Simple enough for a demo. Then you need approval workflows, tool filtering, output transformations, and suddenly you're rebuilding infrastructure that should already exist.
MCP promised to standardize this. One protocol, any client. Connect to any MCP server and discover tools dynamically. A year later, 10,000+ MCP servers exist and adoption includes Claude, ChatGPT, Cursor, VS Code, and Copilot.
But the promise doesn't match reality for everyone. Framework support varies dramatically. Vercel AI SDK's MCP integration can't run stdio transport in production. Session management, resumable streams, and approval workflows are missing from most implementations. Security and multi-tenant isolation? You're on your own.
So how should you actually consume tools in your agent?
There are three ways to give your agent access to tools. Each optimizes for different constraints.
Pure MCP Client: Connect to MCP servers, discover tools dynamically. The protocol handles discovery, you handle everything else.
Pure SDK (Function Calling): Import tools as functions directly in your codebase. Define schemas with JSON Schema or Zod. No external dependencies.
Hybrid Toolset SDK: Libraries like StackOne that pull tool definitions dynamically but wrap them with utilities the protocol doesn't provide (filtering, approvals, framework adapters, output transformations).
The right choice depends on how many integrations you need, whether you need runtime flexibility, and what utilities you can't build yourself.
Connecting to MCP servers gives you access to thousands of pre-built integrations. Any MCP-compatible server exposes tools your agent can discover at runtime.
Remote MCP servers like Sentry's handle OAuth, scaling, and state management. Local stdio servers work for development, but most frameworks can't deploy them to production.
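Connecting looks roughly like this with the official TypeScript MCP SDK. A minimal sketch: the server URL and tool name are placeholders, and import paths may shift between SDK versions.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to a remote MCP server over streamable HTTP (URL is a placeholder).
const client = new Client({ name: "my-agent", version: "1.0.0" });
await client.connect(
  new StreamableHTTPClientTransport(new URL("https://mcp.example.com/mcp"))
);

// Discover whatever tools the server exposes at runtime...
const { tools } = await client.listTools();

// ...and call one by name. Everything past this point (approvals, filtering,
// output handling) is on you.
const result = await client.callTool({
  name: "search_issues", // placeholder tool name
  arguments: { query: "unresolved auth errors" },
});
```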
✓ Pros
✗ Cons
The framework variance issue is real. We've seen teams spend days working around the gap between what MCP specifies and what their framework actually implements. Session management, notifications, and resumable streams exist in the spec but not consistently in SDKs.
The frustration isn't theoretical. Developers report:
"Yeah MCP is the worst documented technology I have ever encountered... I have read so much about MCP and have zero fucking clue except vague marketing speak." — epistasis, Hacker News
"This is a stunningly bad specification, perhaps the worst I have seen in my career." — jes5199, Hacker News
Heavy MCP setups can consume 55-100k+ tokens before you type a word. That's 27-50% of a 200k context window gone on tool definitions alone. — Mario Giancini, "The Hidden Cost of MCP Servers"
And on client-side implementation issues specifically:
MCP servers don't reload when code changes. Must fully restart Claude Code CLI, losing entire conversation context every time. This has been broken for 6+ months. — GitHub Issue #7174
Server support is fine. Client implementations lag behind the spec. Custom headers, OAuth flows, and session management work differently (or not at all) across frameworks.
If your tools are for broad ecosystem consumption (open source, community contribution, cross-platform) and you can live with lowest-common-denominator features, pure MCP works.
When you control both the agent and the tools, SDK-based function calling gives you type safety, embedded utilities, and no network overhead.
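A minimal sketch of that pattern, assuming the Vercel AI SDK's `tool()` helper with a Zod schema (newer SDK versions rename `parameters` to `inputSchema`; `fetchOrderFromDb` stands in for your own service call):

```typescript
import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Placeholder for a direct call into your own service: no network hop to a
// tool server, and the return type is fully under your control.
async function fetchOrderFromDb(orderId: string) {
  return { orderId, status: "shipped", updatedAt: "2025-01-15" };
}

const getOrderStatus = tool({
  description: "Look up the status of an order by its ID",
  parameters: z.object({
    orderId: z.string().describe("Internal order ID"),
  }),
  execute: async ({ orderId }) => fetchOrderFromDb(orderId),
});

const { text } = await generateText({
  model: openai("gpt-4o"),
  tools: { getOrderStatus },
  prompt: "Where is order 8142?",
});
```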
✓ Pros
✗ Cons
Pure SDK makes sense for single-framework projects with custom tools. When you're building internal tools for your specific agent, the overhead of running an MCP server doesn't pay off.
The hybrid approach recognizes that MCP's value lies in discovery and standardization, but that the protocol has real gaps SDKs fill better. It pulls tool definitions from an MCP-style endpoint, then wraps them with utilities the protocol doesn't provide.
✓ Pros
✗ Cons
Filtering tools by pattern and provider dramatically reduces token costs. Here's how it works:
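The sketch below shows the idea in isolation: the tool list and glob syntax are illustrative, and a toolset SDK would do this for you behind a filter option rather than asking you to write the matcher.

```typescript
type ToolDef = { name: string; description: string; inputSchema: object };

// Placeholder definitions; in practice these come from an MCP listTools() call
// or a toolset API.
const allTools: ToolDef[] = [
  { name: "hris_list_employees", description: "List employees", inputSchema: {} },
  { name: "hris_delete_employee", description: "Delete an employee", inputSchema: {} },
  { name: "crm_list_contacts", description: "List CRM contacts", inputSchema: {} },
];

// Glob-style filtering: '*' matches anything, a leading '!' excludes.
function filterTools(tools: ToolDef[], patterns: string[]): ToolDef[] {
  const toRegex = (glob: string) =>
    new RegExp(
      "^" +
        glob
          .split("*")
          .map((s) => s.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
          .join(".*") +
        "$"
    );
  const includes = patterns.filter((p) => !p.startsWith("!")).map(toRegex);
  const excludes = patterns.filter((p) => p.startsWith("!")).map((p) => toRegex(p.slice(1)));
  return tools.filter(
    (t) => includes.some((re) => re.test(t.name)) && !excludes.some((re) => re.test(t.name))
  );
}

// Only HRIS tools reach the model, minus anything destructive:
// hris_list_employees survives; hris_delete_employee and crm_list_contacts don't.
const filtered = filterTools(allTools, ["hris_*", "!hris_delete_*"]);
```

Every definition you drop is a schema the model never has to read, which is where the token savings come from.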
You've seen the three approaches. But whichever you choose, your framework probably doesn't handle everything you need. SDKs vary wildly in what utilities they provide.
The same feature works differently (or not at all) across frameworks. Here's what each framework actually provides across the features that matter most for production agents (✓ full support, ~ partial, ✗ not provided):
| Framework | MCP Support | Approvals | Output Transforms | File Handling | Execution Hooks |
|---|---|---|---|---|---|
| OpenAI Agents SDK | ✓ | ✗ | ✗ | ✗ | ~ |
| Anthropic Claude SDK | ✓ | ~ | ✗ | ✗ | ✓ |
| Vercel AI SDK | ~ | ✗ | ✗ | ✗ | ~ |
| LangChain/LangGraph | ✓ | ~ | ~ | ~ | ✓ |
| Pydantic AI | ✓ | ✗ | ✗ | ✗ | ~ |
| StackOne | ~ | ✓ | ✓ | ✓ | ✓ |
The gaps below are what you'll need to build yourself unless you find a toolset SDK that provides them.
MCP has no concept of human-in-the-loop. The protocol says "here's a tool, here's how to call it." What it doesn't say: should this call require approval? From whom? Under what conditions?
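Filling that gap means wrapping execution yourself. A minimal sketch of a conditional approval gate; `requestApproval` is a placeholder for whatever human-in-the-loop flow you use (a UI prompt, a Slack message, a review queue).

```typescript
type Tool<A, R> = { name: string; execute: (args: A) => Promise<R> };

// Wrap a tool so that calls matching a policy wait for human sign-off.
function withApproval<A, R>(
  tool: Tool<A, R>,
  needsApproval: (args: A) => boolean,
  requestApproval: (toolName: string, args: A) => Promise<boolean>
): Tool<A, R> {
  return {
    ...tool,
    execute: async (args) => {
      if (needsApproval(args)) {
        const approved = await requestApproval(tool.name, args);
        if (!approved) throw new Error(`Approval denied for ${tool.name}`);
      }
      return tool.execute(args);
    },
  };
}

// Example policy: refunds over $100 need a human in the loop.
const refundTool: Tool<{ amount: number }, string> = {
  name: "issue_refund",
  execute: async ({ amount }) => `Refunded $${amount}`,
};

const guardedRefund = withApproval(
  refundTool,
  ({ amount }) => amount > 100,
  async (name, args) => {
    console.log("approval requested:", name, args); // stand-in for a real flow
    return false; // deny by default in this sketch
  }
);
```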
Tools return data. Sometimes that data is massive (file contents, search results, API responses). Raw MCP sends everything to the model, burning tokens on data the agent doesn't need to see in full.
The SDK can split results: full data for your app's UI, summarized data for the model. Your user sees the complete search results while the agent only consumes tokens for what it needs to reason about.
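A sketch of that split; the field names and the five-result cap are illustrative, not a specific SDK's API.

```typescript
type SearchHit = { title: string; url: string; snippet: string; raw: string };

// One tool result, two views: everything for the app, a trimmed slice for the model.
function splitSearchResult(hits: SearchHit[]) {
  return {
    forApp: hits, // full payload, rendered in your UI outside the context window
    forModel: hits
      .slice(0, 5) // cap how many results the model reasons over
      .map(({ title, url, snippet }) => ({ title, url, snippet })), // drop raw content
  };
}

// Return JSON.stringify(forModel) as the tool result; hand forApp to the UI layer.
```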
MCP doesn't specify how to handle file downloads, binary data, or content that needs encoding. A tool that returns a PDF gets you a URL. But the model can't read PDFs from URLs.
SDKs handle the complexity: fetching URLs, detecting formats, extracting text from PDFs, encoding images for vision models, parsing spreadsheets into structured data. Without this, you're writing format-specific handlers for every file type your tools might return.
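A rough sketch of the routing an SDK does for you. Only the content-type branching is shown, and `extractPdfText` is a placeholder for a real PDF library.

```typescript
// Placeholder for a real PDF text extractor (e.g. pdf-parse, pdfjs-dist).
declare function extractPdfText(buf: ArrayBuffer): Promise<string>;

// Normalize a file-returning tool result into something a model can consume.
async function fileToModelContent(url: string) {
  const res = await fetch(url);
  const type = res.headers.get("content-type") ?? "";

  if (type.includes("application/pdf")) {
    return { type: "text", text: await extractPdfText(await res.arrayBuffer()) };
  }
  if (type.startsWith("image/")) {
    // Vision models take base64-encoded image data, not URLs.
    const data = Buffer.from(await res.arrayBuffer()).toString("base64");
    return { type: "image", mediaType: type, data };
  }
  // Fallback: treat the body as plain text.
  return { type: "text", text: await res.text() };
}
```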
Execution hooks work like Express middleware or DOM event handlers: before any tool executes, your code runs first. You can log, check permissions, detect prompt injection, sanitize inputs, or block execution entirely.
This is critical for security. The agent might call `file_write` with a path from user input that contains `../../../etc/passwd`. Or `search_docs` with a query that says "ignore previous instructions." Pattern matching catches obvious attacks. Semantic similarity (embeddings compared against known attack vectors) catches the subtle ones. Without hooks, you're trusting third-party tools completely. With hooks, you control every call.
MCP has no equivalent. It's a protocol for tool discovery, not execution control. Hooks are SDK-level middleware that frameworks provide but the protocol doesn't specify.
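A minimal sketch of that middleware layer. The two injection patterns are illustrative, not a complete defence; a semantic-similarity check would slot in as another hook.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type Hook = (call: ToolCall) => void | Promise<void>; // throw to block the call

// Hook: reject inputs that match obvious attack patterns before the tool runs.
const blockSuspiciousInput: Hook = ({ name, args }) => {
  const text = JSON.stringify(args);
  const patterns = [/\.\.\/\.\.\//, /ignore (all )?previous instructions/i];
  if (patterns.some((p) => p.test(text))) {
    throw new Error(`Blocked ${name}: input matched a known attack pattern`);
  }
};

// Every tool call flows through the hooks first; any hook can veto it.
async function executeWithHooks(
  call: ToolCall,
  hooks: Hook[],
  run: (call: ToolCall) => Promise<unknown>
) {
  for (const hook of hooks) await hook(call);
  return run(call);
}

// executeWithHooks(call, [blockSuspiciousInput], runTool) vetoes matching calls.
```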
MCP isn't going away. Linux Foundation governance means it'll evolve slowly but stay stable. The spec will eventually address multi-tenancy, better auth, and the other gaps.
Our prediction: two or three agent frameworks will consolidate to cover 70% of the market with solid MCP support. The winners will be Claude's Agent SDK and Pydantic AI, not OpenAI's Agents SDK. Anthropic has the model quality lead and is building the tooling to match. Pydantic AI has the Pythonic simplicity that LangChain never achieved. Vercel AI SDK will own the TypeScript/Next.js niche. OpenAI's SDK feels like the safe choice that loses to teams with stronger opinions about developer experience.
But hybrid toolset SDKs will continue to provide capabilities that harnesses don't prioritize. Things like tool filtering with glob patterns, conditional approval workflows, output transformations that reduce token costs. These are features specific to multi-tenant SaaS that general-purpose agent frameworks won't optimize for.
Multi-tenant isolation is the blind spot. Every framework in the comparison table assumes single-user contexts. Account-scoped tool visibility, per-tenant rate limits, audit logs per customer: you're building that yourself.
Pure MCP servers will be for ecosystem contribution. Pure SDKs will be for internal tools. Hybrid approaches will power production SaaS features where you need both dynamic discovery and the utilities that make tools safe for real users.
There's a fourth approach emerging that we haven't covered here: Code Mode (or "code execution" in some frameworks). Instead of pre-defined tools, the agent writes and executes code to accomplish tasks. Cloudflare's workers-ai, OpenAI's code interpreter, and several frameworks are experimenting with this pattern. It deserves its own post.