
In November 2024, Anthropic open-sourced something called the Model Context Protocol. A year later, it's the de facto standard for connecting AI models to external tools: 97 million monthly SDK downloads, 10,000+ active servers, adoption by OpenAI, Microsoft, and Google.
The debate has been running since day one: Is MCP the right abstraction? Hacker News threads, Reddit arguments, blog posts from skeptics and believers. Some call it the future of AI tooling. Others call it over-engineered middleware that'll be obsolete in two years.
At StackOne, we've been deep in MCP for the past year. Building servers, debugging auth flows, watching the spec evolve. This is what we've learned about where it's been, where it's going, and whether it'll survive the next wave of AI development.
MCP didn't start as a finished product. It started as an experiment in standardization.
The original spec (November 2024) was simple: JSON-RPC over Server-Sent Events, basic tool definitions, a way for models to discover available actions. Good enough to prove the concept.
Security incidents forced auth maturity. The March 2025 spec added OAuth 2.1 and tool annotations (read-only vs destructive). The June spec separated authorization servers from resource servers after real-world exploits showed the original model was too trusting.
Enterprise adoption exposed multi-tenant gaps. When Asana launched their MCP integration in June 2025, security researchers discovered a "Confused Deputy" vulnerability. The MCP server cached responses but failed to re-verify tenant context on subsequent requests. Customer A's cached data became visible to Customer B. Classic tenant isolation failure, and exactly the kind of bug MCP's original single-user design didn't anticipate.
Scale broke the original transport. SSE worked fine for local development. It didn't work for production systems handling thousands of concurrent connections. Streamable HTTP replaced it.
The ecosystem needed discovery. By early 2025, there were 1,000+ community servers scattered across npm, PyPI, and random GitHub repos. The September Registry launch gave clients a single source of truth.
Discovery is useful. An AI agent can ask "what can you do?" and get a structured answer. No hardcoded tool lists, no stale documentation. When a server adds capabilities, clients see them immediately.
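Here's roughly what that looks like with the TypeScript SDK; a minimal sketch, assuming the official client package, where the import paths and the placeholder server command may differ for your setup:

```typescript
// Minimal discovery sketch using the MCP TypeScript SDK.
// The server package name below is a placeholder, not a real dependency.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const client = new Client({ name: "example-client", version: "1.0.0" });

const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "some-mcp-server"], // placeholder: swap in an actual MCP server
});
await client.connect(transport);

// "What can you do?" -- no hardcoded tool list, no stale documentation.
const { tools } = await client.listTools();
for (const tool of tools) {
  console.log(`${tool.name}: ${tool.description ?? "(no description)"}`);
}
```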
The abstraction level fits simple cases. For "connect Claude to my Notion" or "let GPT-4 read my GitHub issues," MCP is much simpler than building custom integrations. Install a server, configure auth, done.
Vendor neutrality matters. The same MCP server works with Claude, GPT-4, Gemini, and local models. Write once, connect everywhere.
After a year of MCP in production, here's what keeps breaking:
The spec technically supports OAuth 2.1. But the SDKs and reference implementations still assume MCP servers are also authorization servers. Integrating with enterprise IdPs (Okta, Azure AD) requires workarounds.
Token lifecycle management is particularly painful. When you're using third-party authorization, you need complex token mapping and tracking. The spec says what to do; the tooling makes it hard to actually do it.
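To give a flavor of what "token mapping and tracking" means in practice, here's a sketch of the bookkeeping you end up owning; every name in it is illustrative, none of it comes from the MCP SDK or spec:

```typescript
// Illustrative only: the mapping between the token an MCP client presents
// and the upstream IdP token actually used to call the downstream API.
interface TokenRecord {
  upstreamAccessToken: string;   // what actually goes to the SaaS API
  upstreamRefreshToken: string;
  expiresAt: number;             // epoch ms
  tenantId: string;              // whose data this token may touch
}

// Keyed by the token the MCP client presents to us.
const tokenStore = new Map<string, TokenRecord>();

// Stub standing in for a call to the IdP's token endpoint (Okta, Azure AD, ...).
async function exchangeForUpstreamToken(_refreshToken: string): Promise<TokenRecord> {
  throw new Error("refresh flow not implemented in this sketch");
}

async function resolveUpstreamToken(mcpToken: string): Promise<TokenRecord> {
  const record = tokenStore.get(mcpToken);
  if (!record) throw new Error("unknown token: restart the authorization flow");
  if (Date.now() < record.expiresAt) return record;

  // Expired: refresh against the enterprise IdP and keep the mapping current.
  const refreshed = await exchangeForUpstreamToken(record.upstreamRefreshToken);
  tokenStore.set(mcpToken, refreshed);
  return refreshed;
}
```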
MCP was designed for single-user, local deployment. "Connect my AI to my tools."
Enterprise SaaS needs multi-tenancy: many users sharing infrastructure, each accessing only their own data. The protocol has no native concept of tenant isolation. You can build it, but you're on your own.
The Asana vulnerability was a Confused Deputy attack: the server trusted cached context without re-verifying who was asking. MCP's spec doesn't prevent this. You have to build tenant isolation yourself.
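The fix is boring but easy to skip: re-verify tenant context on every cache read. A minimal sketch, with illustrative names:

```typescript
// Illustrative sketch of the check that was missing in the Confused Deputy scenario.
interface CacheEntry<T> {
  tenantId: string; // who this data belongs to
  value: T;
}

const responseCache = new Map<string, CacheEntry<unknown>>();

function getCached<T>(key: string, requestTenantId: string): T | undefined {
  const entry = responseCache.get(key);
  if (!entry) return undefined;

  // The failure mode: returning entry.value without re-verifying who is asking.
  if (entry.tenantId !== requestTenantId) return undefined;

  return entry.value as T;
}
```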
If you're deploying AI agents to a team, you need admin controls. Which MCP servers can users connect? What tools are allowed per user or role? How do you audit what agents did?
MCP doesn't have answers here. It's a protocol for connecting models to tools, not for managing those connections at scale.
The Registry helps clients find servers. But there's no standard for agents to discover tools within a running session based on context.
"I'm working on a Jira ticket" should automatically surface relevant tools. "I'm in a code review" should show a different set. Right now, this logic lives in application code. A built-in context-aware discovery mechanism would make agents smarter about what's relevant.
MCP is request-response: client asks, server answers. But real integrations need push. A new email arrived. The build finished. Someone commented on your PR.
The spec has hooks for this, but the implementation guidance is thin and tooling support is inconsistent.
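The hooks that do exist are resource subscriptions and change notifications. A sketch of the wire-level messages, shown as plain JSON-RPC objects since SDK helper names vary; the mail:// URI is illustrative:

```typescript
// Client asks to be notified when a resource changes.
const subscribeRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "resources/subscribe",
  params: { uri: "mail://inbox" },
};

// Later, pushed server-to-client. No id: it's a notification, not a request.
const resourceUpdated = {
  jsonrpc: "2.0",
  method: "notifications/resources/updated",
  params: { uri: "mail://inbox" },
};
```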
The developer sentiment on MCP is... mixed. Reddit threads and GitHub discussions paint a picture of genuine frustration alongside cautious optimism.
On server quality: "95% of MCP servers are utter garbage." That's a real quote from the MCP subreddit, and it reflects a common complaint. The barrier to publishing an MCP server is low. The barrier to publishing a good one is much higher.
On token overhead: Before Code Mode, developers complained about context windows filling up with tool definitions. Loading 50 tools meant burning 100K+ tokens before the conversation started. This was a real blocker for complex workflows.
On security: Researchers found prompt injection vulnerabilities in early MCP implementations: tool poisoning, sandbox escapes, the usual suspects when a new protocol meets production traffic. The spec has improved, but the ecosystem of community servers hasn't fully caught up.
On the hype cycle: "MCP is either the future of AI tooling or an elaborate over-engineering exercise." Fair. The protocol adds abstraction between models and tools. That abstraction has costs. Whether it's worth it depends on your use case.
MCP is far from perfect today. It needs significant work around safety, security, and governance to be enterprise-ready. But the trajectory matters: every major pain point from the first year—auth, transport, discovery—has been addressed in subsequent spec revisions. Community adoption tells the real story. Teams shipping MCP in production are increasingly positive, and with each release, there are fewer skeptics and more believers.
This is where it gets interesting. The debate isn't "MCP vs no MCP." It's how agents should call MCP tools.
The traditional approach: load all tool definitions into context, let the model pick which tool to call, feed results back through the model. This works, but it's expensive. Thousands of tools means hundreds of thousands of context tokens just for definitions.
The emerging approach: Code Mode. Instead of calling tools directly, the agent writes code that calls tools. Cloudflare's Agents SDK converts MCP schemas into TypeScript APIs. The LLM writes await mcp.notion.createPage({ title: "..." }) instead of emitting special tool-calling tokens.
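A rough sketch of the idea; the generated bindings below stand in for what an SDK might produce from an MCP tool schema, and are not Cloudflare's actual output:

```typescript
// Illustrative bindings generated from an MCP tool schema.
interface NotionApi {
  createPage(args: { title: string; parentId?: string }): Promise<{ pageId: string }>;
}

interface McpBindings {
  notion: NotionApi;
}

// What the model writes in code mode: a small script over typed bindings,
// instead of a sequence of tool-call tokens routed through the context window.
async function fileMeetingNotes(mcp: McpBindings): Promise<string> {
  const { pageId } = await mcp.notion.createPage({ title: "Weekly sync notes" });
  return pageId;
}
```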
Why does this matter?
Anthropic's research shows this approach reduces token usage by 98.7%. That's not a typo.
Code Mode doesn't replace MCP. It makes MCP more efficient by having agents write code that calls MCP tools.
The agents writing code to call APIs (Claude Code, Cursor, Devin) aren't bypassing MCP. They're demonstrating that code-mediated tool access scales better than direct invocation. MCP provides the standardized interface; code mode provides the efficient calling convention.
For MCP to succeed in the enterprise, the gaps above need to close: multi-tenancy, admin controls, context-aware discovery, real-time notifications. How quickly they close depends in part on who steers the protocol next.
In November 2025, Anthropic donated MCP to the Linux Foundation's new Agentic AI Foundation. The announcement framed it as "vendor-neutral governance" and "ensuring the protocol's long-term future."
The community cheered: no single company controlling the standard, open governance, the protocol equivalent of a public good.
We're not so sure.
Our concern: MCP isn't ready for stasis.
When a protocol moves to foundation governance, development slows down. That's the point: stability over velocity. But MCP still has fundamental gaps: multi-tenancy, admin controls, context-aware discovery, real-time notifications. These aren't edge cases. They're blockers for enterprise adoption.
Foundation governance means committees, consensus, and careful deliberation. It means the protocol will evolve at the speed of agreement, not the speed of need. Meanwhile, competitors with centralized control (OpenAI's tool ecosystem, Google's agent protocols) can iterate fast and ship features.
Is Anthropic stepping back because they know MCP won't be the winner?
We suspect the handoff signals something: Anthropic sees MCP as a standard for some use cases, not the universal protocol for AI-tool interaction. By giving it to the Foundation, they can focus on what matters (making Claude better) without being responsible for a protocol that might only capture part of the market.
MCP might become the USB-C of AI tooling: widely adopted, good enough for most cases, but not the only connector in town. Or it might ossify while something more nimble takes its place.
The next year will determine whether MCP becomes the TCP/IP of AI tooling or a stepping stone to something else.
We think both paths, direct MCP tool calls and Code Mode, will coexist.
MCP for structured, high-trust integrations. Enterprise workflows where you want explicit permissions, audit logs, and vendor-neutral tooling.
Code Mode for exploratory, developer-centric use cases. When agents need flexibility and you trust them with broader system access.
The teams that win won't be the ones who pick a side. They'll understand when to use which.
MCP is technical plumbing. It connects models to tools. But it doesn't tell agents when to use which tools, or how to combine them for specific outcomes.
That's where skills come in. Skills are the abstraction layer that makes MCP usable: they encapsulate specific MCP servers and tools into outcome-oriented packages without overloading the main context.
The Agent Skills spec (not just Claude's implementation) defines skills as reusable capability modules. Drop a skill file and the agent gains a new domain: code review patterns, deployment workflows, research methodologies. The skill can specify which MCP servers to use, what tools to invoke, and how to combine them.
Two things make skills interesting:
Use-case framing. MCP describes tools technically ("POST to this endpoint with this schema"). Skills describe outcomes ("review this PR for security issues"). Agents reason better about outcomes than API specs.
Context isolation. An agent with 50 MCP servers has 50 sets of tool definitions competing for context. Skills let you load only what's relevant: activate the "deploy" skill, get only deployment-related tools. The rest stays out of context until needed.
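A sketch of what a skill's declaration of its tool dependencies might look like; the manifest shape here is illustrative, not the Agent Skills spec's actual format:

```typescript
// Illustrative manifest shape for a skill that scopes its own tooling.
interface SkillManifest {
  name: string;
  description: string;    // outcome-oriented: what the skill achieves
  mcpServers: string[];   // which servers to connect while the skill is active
  allowedTools: string[]; // which tools to expose to the model
}

const deploySkill: SkillManifest = {
  name: "deploy",
  description: "Ship a service to staging and verify the rollout",
  mcpServers: ["github", "kubernetes"],
  allowedTools: ["github_merge_pr", "k8s_apply_manifest", "k8s_rollout_status"],
};

// When "deploy" activates, the host loads only these tools into context;
// the other servers' definitions stay out until another skill needs them.
```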
Skills won't replace MCP. They'll make it more accessible. MCP provides the standard interface; skills provide the higher-level composition that makes those interfaces useful.