PART 2 OF 5 · PROBLEM-AWARE

Agents don't read your docs

You have a beautiful OpenAPI spec and a docs site humans love. An agent uses neither the way you think. Tool descriptions are the new landing page — and most APIs ship them by accident, if at all.

2026-06-236 minwritten for · API/Platform leadswritten for · Developer-experience ownerswritten for · Heads of Product

There is a comforting story that goes: "We already have an OpenAPI spec, so we're agent-ready." It's comforting because it's almost true, which is the most dangerous kind of wrong. A machine-readable spec is necessary. It is nowhere near sufficient. The gap between "documented" and "agent-usable" is where most APIs quietly fail their first contact with an agent — and never find out, because the agent doesn't file a bug. It just picks a different tool.

An OpenAPI spec is a contract, not an invitation

OpenAPI describes what your endpoints accept and return. It is exhaustive, precise, and written for a developer who has already decided to use you and is now wiring up the call. That reader knows what your product is for. An agent arrives with none of that context. It is holding a task — "reconcile these invoices," "find comparable listings" — and scanning a flat list of available tools to decide which, if any, moves the task forward.

Anthropic's own guidance for people building agent tools is blunt about where the difficulty lives. In its engineering write-up on writing tools for agents, it argues that tool descriptions and definitions deserve the same care you'd give a prompt, because "even small refinements to tool descriptions can yield dramatic improvements," and it frames tool-building as prompt-engineering, not just API-wrapping. [1] Your OpenAPI summary field, auto-generated from a method name, was not written with any of that care. It was written for a linter.

The description is the landing page now

When an agent chooses between tools, the text it reads — the tool's name and description — is doing the exact job your marketing landing page does for a human: it has seconds to convey what this is, when to reach for it, and what makes it the right pick. Except there's no design, no social proof, no gradient hero. There's a sentence. If that sentence says createRecordV2, the agent has to guess. If it says "Create a customer record; use when onboarding a new paying account, not for leads (see createLead)," the agent knows.

Every tool description is a landing page with a one-sentence budget, read by a buyer who will never scroll. Most APIs ship theirs on autopilot and wonder why agents don't convert.

This reframe is uncomfortable because it moves work you thought was "done" — the docs — into a surface you've never staffed. The people who wrote your API summaries were optimizing for correctness. Agent descriptions optimize for selection under ambiguity, which is a different and harder skill.

Two failures you can't see from inside

Discovery failure is when the agent never realizes your tool is relevant. The name is opaque, the description restates the name, and the model — reasoning about which of forty tools fits the task — skips you. You register as noise. No error, no log, no signal. You simply weren't chosen.

Disambiguation failure is the subtler cousin. Your capability was relevant, but you exposed it as fifteen near-identical endpoints, or one endpoint with twelve optional parameters and no guidance on which combination the task needs. The agent, faced with too many indistinct choices, either picks wrong or thrashes. Research on tool use has repeatedly found that agent reliability degrades as the number of available tools and their overlap grows — the model spends its reasoning budget deciding among your tools instead of using them. Anthropic's guidance explicitly recommends consolidating and namespacing tools, and returning results in the form most useful to the model, precisely to avoid this. [1]

Why "just expose every endpoint" backfires

The instinct, once you accept agents are the audience, is to mechanically turn all 200 of your endpoints into 200 tools. This is the worst version. You've now handed the agent a haystack and asked it to reason about paging parameters and internal admin routes it will never legitimately need. The best MCP surfaces are curated: a handful of well-named, well-described, task-shaped tools that map to what someone would actually ask an agent to do — not a one-to-one dump of your routing table. Grouping and naming is design work, and it's the work OpenAPI was never meant to do.

The point of all this

Being "documented" told you that a motivated human developer could integrate you. It tells you nothing about whether an unmotivated, context-poor agent will choose you when it matters. Those are different bars, and the second one is now the one that decides whether you're in the buyer's flow.

So the problem is sharper than "we need an MCP server." It's: the surface an agent judges you by — names, descriptions, groupings, the disambiguation between your tools — is a product surface, and right now nobody owns it. In the next post we'll look at what it actually costs to build and own that surface yourself, because the honest accounting is where most teams get surprised.

REFERENCES

  1. Anthropic — Writing effective tools for agents (engineering blog).
  2. Anthropic — Tool use with Claude (documentation).
  3. Model Context Protocol — Server concepts: tools.
Turn the APIs you already have into an MCP agents can call. Register a platform →