feat(docs): some MCP client best practices #2582

alexhancock wants to merge 1 commit into modelcontextprotocol:main from
Conversation
This section covers two complementary patterns that address these scaling challenges: **progressive discovery**, which controls _when_ tool definitions enter context, and **programmatic tool calling**, which controls _how_ tools are invoked.
### Progressive Discovery of Servers and Tools

Suggested change: `### Avoiding context bloat using progressive discovery of Servers and Tools`
**Layer 1 — Catalog.** The client exposes a small number of meta-tools that let the model search and browse available capabilities. A `search_tools` tool accepts a natural-language query and returns a list of matching tool names with brief descriptions. This is analogous to browsing an API reference rather than reading every page.
```typescript
// The model calls a lightweight search tool
search_tools({ query: "update salesforce record" })

// Returns concise matches — names and one-line descriptions only
→ [
  { name: "salesforce.updateRecord", description: "Update fields on a Salesforce object" },
  { name: "salesforce.upsertRecord", description: "Insert or update based on external ID" }
]
```
We likely want to tell people that models often have been trained on this, and that some platforms have support for it: https://developers.openai.com/api/docs/guides/tools-tool-search, https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool. You can either rely on the platform or use a custom implementation, but you should prefer a platform tool search tool when available.
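As a concrete illustration of the custom-implementation path, here is a minimal sketch of a client-side `search_tools` using naive keyword scoring. The catalog contents and scoring are illustrative only; a real client might use BM25, embeddings, or the platform's hosted tool search instead.

```typescript
// Hypothetical in-memory catalog; a real client would build this from
// tools/list results across all connected servers.
interface ToolSummary {
  name: string;
  description: string;
}

const catalog: ToolSummary[] = [
  { name: "salesforce.updateRecord", description: "Update fields on a Salesforce object" },
  { name: "salesforce.upsertRecord", description: "Insert or update based on external ID" },
  { name: "github.createIssue", description: "Create an issue in a GitHub repository" },
];

// Naive keyword scoring: count how many query terms appear in the tool's
// name or description, then return the best matches.
function searchTools(query: string, limit = 5): ToolSummary[] {
  const terms = query.toLowerCase().split(/\s+/);
  return catalog
    .map((tool) => {
      const haystack = `${tool.name} ${tool.description}`.toLowerCase();
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { tool, score };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((r) => r.tool);
}
```

Returning only names and one-line descriptions keeps the result small enough that the model can scan many candidates without paying for full schemas.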
**Layer 2 — Inspect.** Once the model identifies a relevant tool, it can request the full definition — input schema, output schema, and detailed documentation — for just that tool. This keeps the context focused.
```typescript
// The model inspects only the tool it needs
get_tool_details({ name: "salesforce.updateRecord" })

// Returns the complete schema for this single tool
→ {
  name: "salesforce.updateRecord",
  description: "Updates a record in Salesforce",
  inputSchema: {
    type: "object",
    properties: {
      objectType: { type: "string", description: "Salesforce object type" },
      recordId: { type: "string", description: "Record ID to update" },
      data: { type: "object", description: "Fields to update" }
    },
    required: ["objectType", "recordId", "data"]
  }
}
```
When you add tools, you need to be careful not to invalidate prompt caching when inserting tool definitions.
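One cache-friendly approach is to keep the list of loaded definitions append-only, so the serialized prefix of earlier definitions never changes and providers that cache by prefix can reuse it. A minimal sketch, assuming a hypothetical session structure:

```typescript
// Hypothetical session state: tool definitions the model has loaded so far.
interface ToolDefinition {
  name: string;
  inputSchema: object;
}

const loadedTools: ToolDefinition[] = [];

// Append newly discovered definitions rather than re-sorting or rebuilding
// the list: reordering earlier entries would change the prompt prefix and
// invalidate any provider-side prompt cache.
function loadToolDefinition(def: ToolDefinition): void {
  if (!loadedTools.some((t) => t.name === def.name)) {
    loadedTools.push(def);
  }
}
```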
A well-structured progressive discovery system operates in three layers:
**Layer 1 — Catalog.** The client exposes a small number of meta-tools that let the model search and browse available capabilities. A `search_tools` tool accepts a natural-language query and returns a list of matching tool names with brief descriptions. This is analogous to browsing an API reference rather than reading every page.
We might want to highlight that this is one strategy, and that it leans heavily on the learnings from progressive discovery in skills. Harnesses should use the principle, but are free to implement somewhat custom versions, e.g. using subagents to select tools instead of BM25 or regex search, using vector embeddings, or initially adding a short description to the system prompt.
They should also probably define a threshold above which progressive discovery starts; below the threshold, the full tool descriptions should be added to the system prompt.
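The threshold idea could be sketched like this; the cutoff value and names here are hypothetical, and the right number depends on the model and the token budget:

```typescript
// Hypothetical cutoff: below it, inline every full tool definition into the
// system prompt; above it, expose only the meta-tools plus short descriptions.
const PROGRESSIVE_DISCOVERY_THRESHOLD = 20;

interface ToolInfo {
  name: string;
  description: string;
}

function selectDiscoveryStrategy(
  tools: ToolInfo[],
): "inline-full-definitions" | "progressive-discovery" {
  return tools.length <= PROGRESSIVE_DISCOVERY_THRESHOLD
    ? "inline-full-definitions"
    : "progressive-discovery";
}
```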
Client-->>Model: Server disconnected, context freed
```
This pattern is especially valuable for general-purpose agents where the user's intent isn't known in advance. The agent can start with a minimal set of always-on servers and expand its capabilities as the conversation evolves.
It's particularly valuable when combining it with skills: only adding tools and servers in specific skills that are fully loaded by the model.
| **Cache tool definitions** | Once a tool definition is loaded, keep it available for the duration of the session. |
| **Group tools by server** | Present tools organized by their source server so the model can reason about related capabilities. |
### Programmatic Tool Calling

Suggested change: `### Programmatic Tool Calling / Code Mode`
<Tip>
Servers that define an
[`outputSchema`](/specification/draft/server/tools#output-schema) on their
tools dramatically improve the quality of generated APIs. When an output
schema is present, the client can produce precise return types (like
`LogEntry` above) instead of generic `any` types — giving the model type
information that leads to more correct code with fewer errors.
</Tip>
When the outputSchema is missing, there are two techniques:
- Use a generic `any` or `string` type and deal with it.
- Pass the result of a call to a fast model like Haiku or Gemini Flash and instruct it to return a specific type, e.g. `extract(mcpTool('generic_tool', params), Model.AnthropicHaiku, ExpectedType)` where `ExpectedType` is a definition of a type the model can generate in advance. If the conversion fails, the model can handle that as well and fall back to a generic `string`, or fail.
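A sketch of that fallback technique, with `extract` taking a pluggable model call and a type guard in place of a real SDK; every name here is illustrative, not an existing API:

```typescript
// Stand-in for a real API call to a fast model (e.g. Haiku or Gemini Flash).
type ModelCall = (prompt: string) => Promise<string>;

// Ask the fast model to coerce an untyped tool result into the expected
// shape; fall back to the raw string when parsing or validation fails.
async function extract<T>(
  rawResult: string,
  callModel: ModelCall,
  validate: (value: unknown) => value is T,
): Promise<T | string> {
  const prompt =
    "Convert the following tool output to JSON matching the expected type. " +
    "Output only JSON.\n\n" + rawResult;
  try {
    const parsed: unknown = JSON.parse(await callModel(prompt));
    if (validate(parsed)) return parsed;
  } catch {
    // fall through to the generic-string fallback
  }
  return rawResult;
}
```

The type guard plays the role of `ExpectedType` above: a definition the model can generate in advance and that the harness can check mechanically.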
LLMs have been trained on vast amounts of real-world code. They are significantly more capable at writing function calls in a popular programming language than at producing the synthetic JSON format used by tool-calling protocols. This has several practical consequences:
| Benefit | Explanation |
| --- | --- |
| **Better tool selection** | Models handle larger tool catalogs more accurately when tools are presented as typed functions with doc comments than as tool-call JSON schemas. |
| **Data stays out of context** | Intermediate results flow between tools inside the sandbox. The model only sees what the code explicitly logs or returns, dramatically reducing token consumption. |
| **Batched execution** | Multiple tool calls execute in a single round trip. A script that reads five files and writes a summary makes one trip to the model instead of eleven. |
| **Native control flow** | Loops, conditionals, error handling, and retries are expressed in code rather than requiring multiple model turns to orchestrate. |
| **Composability** | The model can define helper functions, reuse variables, and build up complex operations — things that are awkward or impossible with sequential tool calls. |
I don't think this is true. The real value is that you get composition without having to go through inference. LLMs have also been trained quite extensively on the specific tool-calling protocol.
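The composition-without-inference point can be illustrated with a sketch: the loop and aggregation below run entirely in the sandbox, with no model turns between tool calls. `callTool` is a hypothetical bridge from sandboxed code back to the MCP client, and the tool name is illustrative.

```typescript
// Hypothetical bridge: sandboxed code calls this to reach MCP servers.
type CallTool = (name: string, args: object) => Promise<unknown>;

async function summarizeErrors(callTool: CallTool, paths: string[]): Promise<string> {
  let errorCount = 0;
  for (const path of paths) {
    // Each iteration is a tool call, but the file contents never
    // enter the model's context.
    const contents = (await callTool("fs.readFile", { path })) as string;
    errorCount += contents.split("\n").filter((l) => l.includes("ERROR")).length;
  }
  // Only this one-line summary returns to the model.
  return `Found ${errorCount} ERROR lines across ${paths.length} files`;
}
```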
);
```
**Step 3 — The sandbox executes the code.** Function calls inside the sandbox are intercepted and routed back to the appropriate MCP server through the client. The log data and ticket creation flow directly between servers without ever entering the model's context. Only the `console.log` output — a single summary line — returns to the model.
We need to dig into this more. This is actually a big hurdle for harness implementors: sandbox selection is hard, and knowing which one to pick is not obvious. Should they pick Deno / V8 isolates, Monty (Pydantic's Python sandbox), or mlua? We should at least list some options here.
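Whichever sandbox is chosen, the harness also needs the interception layer that turns function calls in generated code into `tools/call` requests. A minimal sketch using a JavaScript `Proxy`, where `dispatch` stands in for the client's transport and all names are illustrative:

```typescript
// Stand-in for the client's transport back to the MCP servers.
type Dispatch = (toolName: string, args: object) => Promise<unknown>;

// Build an object the sandboxed script can call like a normal API:
// salesforce.updateRecord({...}) becomes a tools/call request for
// "salesforce.updateRecord" routed through the client.
function makeToolNamespace(server: string, dispatch: Dispatch) {
  type ToolFn = (args: object) => Promise<unknown>;
  return new Proxy({} as Record<string, ToolFn>, {
    get(_target, prop) {
      return (args: object) => dispatch(`${server}.${String(prop)}`, args);
    },
  });
}
```

The same shape works whether the generated code runs in a V8 isolate, a separate process, or an embedded interpreter; only the transport behind `dispatch` changes.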
Motivation and Context
Start a Client Best Practices section to provide info to client implementors on patterns that lead to higher quality clients
How Has This Been Tested?
Local render of the docs
Breaking Changes
N/A
Types of changes
Checklist
Additional context
N/A