[notes] Agents WG Meeting, October 28th, 2025 #1726

@LucaButBoring

Agents WG Meeting Notes

Date: October 28, 2025

Attendees

  • Luca Chang
  • Che Liu
  • John Harris
  • Jonathan Hefner
  • Peder Holdgaard Pedersen
  • Peter Alexander
  • Shaun Smith

Discussion

The working group reviewed SEP-1686: Tasks, which supersedes the earlier SEP-1391: Long-Running Operations proposal. The new approach focuses on data-layer concerns and avoids changing tool call semantics, instead relying on notification-based patterns for progress updates.

Server-to-Client Task State Management

The proposal allows servers to request task augmentation from clients, raising questions about distributed state management. While clients are typically not distributed, some implementations may be. This creates implementation complexity around whether the host application or the MCP client should own the task store.

In the reference implementation, the client exposes an extension point for task storage, but only the host application can supply the actual store, so responsibility for task state is split between the two.
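As a rough illustration of that split, the sketch below has a client class that only knows about a storage interface, while the host application provides the concrete store. Every name here (TaskRecord, TaskStore, InMemoryTaskStore, SketchClient, the status strings) is hypothetical and chosen for illustration; none of it is taken from the reference implementation.

```typescript
// All names below are hypothetical; this is not the SDK's actual API.
interface TaskRecord {
  taskId: string;
  status: "working" | "completed" | "failed";
}

// Extension point exposed by the client library.
interface TaskStore {
  put(task: TaskRecord): void;
  get(taskId: string): TaskRecord | undefined;
  list(): TaskRecord[];
}

// Supplied by the host application. A distributed host could back this
// with a shared database instead of process memory.
class InMemoryTaskStore implements TaskStore {
  private tasks = new Map<string, TaskRecord>();

  put(task: TaskRecord): void {
    this.tasks.set(task.taskId, { taskId: task.taskId, status: task.status });
  }

  get(taskId: string): TaskRecord | undefined {
    return this.tasks.get(taskId);
  }

  list(): TaskRecord[] {
    return Array.from(this.tasks.values());
  }
}

// The client handles protocol concerns and delegates persistence to the store.
class SketchClient {
  private store: TaskStore;

  constructor(store: TaskStore) {
    this.store = store;
  }

  // Called when a server asks the client to run a request as a task.
  acceptTask(taskId: string): TaskRecord {
    const record: TaskRecord = { taskId, status: "working" };
    this.store.put(record);
    return record;
  }

  completeTask(taskId: string): void {
    if (this.store.get(taskId) === undefined) {
      throw new Error(`unknown task: ${taskId}`);
    }
    this.store.put({ taskId, status: "completed" });
  }
}

const hostStore = new InMemoryTaskStore();
const sketchClient = new SketchClient(hostStore);
sketchClient.acceptTask("task-1");
sketchClient.completeTask("task-1");
```

Because the client depends only on the interface, a host with several client processes could swap in a shared store without touching protocol code.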

Task-Augmentation Opt-In Mechanism

A key concern is whether clients should expect any operation (like tasks/list) to potentially take minutes or hours when task-augmented. MCP requests can already be slow, and the SEP does not change this. However, the SEP clarifies that receivers choose whether to accept task augmentation for specific requests, so a server can refuse task augmentation on requests that it expects to complete quickly.

This design raises a discoverability problem: clients need a way to know which requests support task augmentation; otherwise, they would have to attempt to augment every request with a task. The group discussed using either capabilities or tool annotations to signal this support and left the choice for the SEP to clarify.
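One way to picture the discoverability check is a small helper the requestor runs before attaching a task to a request. This is only a sketch: the field names augmentableMethods and supportsTasks, the tool names, and the rule that tools/call needs both signals are assumptions made for illustration, not the SEP's schema.

```typescript
// Hypothetical shapes; the field names are assumptions, not the SEP's schema.
interface TaskCapabilities {
  // Methods for which the receiver is willing to accept task augmentation.
  augmentableMethods: string[];
}

interface ToolInfo {
  name: string;
  // Per-tool hint; when absent, the tool is assumed not to support tasks.
  supportsTasks?: boolean;
}

// Decide whether the requestor should attach a task to a request.
function shouldAugment(
  method: string,
  capabilities: TaskCapabilities,
  tool?: ToolInfo
): boolean {
  if (capabilities.augmentableMethods.indexOf(method) === -1) {
    return false; // the receiver never accepts tasks for this method
  }
  if (method === "tools/call") {
    // For tool calls, the individual tool must also opt in.
    return tool !== undefined && tool.supportsTasks === true;
  }
  return true;
}

const serverCapabilities: TaskCapabilities = { augmentableMethods: ["tools/call"] };
const slowTool: ToolInfo = { name: "train_model", supportsTasks: true };
const quickTool: ToolInfo = { name: "get_time" };
```

With a check like this, a client skips augmentation for quick operations instead of discovering the refusal at request time.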

ListTasks Security and Consistency

The tasks/list operation needs filtering capabilities for security reasons, as clients should not see tasks from other contexts. The group agreed on a consistency requirement: if a client can call tasks/get to retrieve a specific task, that task must appear in tasks/list results for that client.
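A simple way to make the consistency requirement hold by construction is to route both operations through the same visibility check. The sketch below assumes a "context" is just an opaque string such as a session identifier; that interpretation, and the names TaskEntry, ScopedTaskRegistry, getTask, and listTasks, are hypothetical.

```typescript
// Hypothetical types; "context" is modelled as an opaque string such as a session ID.
interface TaskEntry {
  taskId: string;
  ownerContext: string;
  status: string;
}

class ScopedTaskRegistry {
  private entries = new Map<string, TaskEntry>();

  add(entry: TaskEntry): void {
    this.entries.set(entry.taskId, entry);
  }

  // Single source of truth for visibility, shared by get and list.
  private visibleTo(entry: TaskEntry, context: string): boolean {
    return entry.ownerContext === context;
  }

  // Stand-in for a tasks/get handler.
  getTask(context: string, taskId: string): TaskEntry | undefined {
    const entry = this.entries.get(taskId);
    return entry !== undefined && this.visibleTo(entry, context) ? entry : undefined;
  }

  // Stand-in for a tasks/list handler.
  listTasks(context: string): TaskEntry[] {
    return Array.from(this.entries.values()).filter((entry) =>
      this.visibleTo(entry, context)
    );
  }
}

const registry = new ScopedTaskRegistry();
registry.add({ taskId: "a", ownerContext: "session-1", status: "working" });
registry.add({ taskId: "b", ownerContext: "session-2", status: "working" });
```

Since getTask and listTasks share one predicate, any task a caller can fetch individually also appears in that caller's list, and tasks from other contexts appear in neither.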

There was also discussion about bidirectional symmetry, that is, whether it makes sense for servers to call tasks/list on a client. This would only be relevant if the client supports server-to-client task augmentation and would require explicit configuration, but it should be allowed under those circumstances.

Relationship to TrackElicitation

The group noted potential overlap with SEP-1036: URL Mode Elicitation, which introduces an elicitation/track method carrying a progress token. While elicitation/track and Tasks serve different purposes, they may be redundant in some scenarios. The group recommended coordinating with the proposers of #1036 to clarify the relationship.

Implementation Patterns and SDK Concerns

The proposed beginX pattern (e.g., beginCallTool returning a PendingRequest) needs validation with SDK maintainers across different languages. In C#, "Task" already has a well-established meaning in the standard library, which will require awkward naming like McpTask to avoid confusion.

Idempotency and Client-Generated IDs

The group debated whether idempotency should be tightly coupled to tasks. The original rationale was twofold:

  1. Tasks are generally long-running and expensive, making idempotency important for retry scenarios
  2. Client-provided task IDs enable future subtask association features, allowing servers to pre-allocate IDs for follow-up operations

Client-generated IDs provide valuable message association benefits, particularly for future subtask extensions. However, this comes at a complexity cost for servers that simply want to expose existing workflow APIs.

More broadly, the group observed that request association and idempotency are useful features independent of the full task lifecycle. Requiring implementers to adopt the complete task model to get these benefits may be unnecessarily restrictive.
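To make the retry rationale concrete, the sketch below shows a receiver that deduplicates on a client-provided task ID, so a retried request does not start the work twice. The class and field names (IdempotentTaskReceiver, CreatedTask, workStarted) are hypothetical, and this illustrates the idea under discussion rather than behaviour the SEP mandates.

```typescript
// Hypothetical receiver-side bookkeeping keyed by a client-generated task ID.
interface CreatedTask {
  taskId: string;
  status: string;
}

class IdempotentTaskReceiver {
  private known = new Map<string, CreatedTask>();

  // Counts how many times the expensive work was actually started.
  workStarted = 0;

  createTask(taskId: string): CreatedTask {
    const existing = this.known.get(taskId);
    if (existing !== undefined) {
      return existing; // retry of a known ID: reuse the task, do not re-run
    }
    this.workStarted += 1;
    const created: CreatedTask = { taskId, status: "working" };
    this.known.set(taskId, created);
    return created;
  }
}

const receiver = new IdempotentTaskReceiver();
const firstAttempt = receiver.createTask("client-id-123");
const retryAttempt = receiver.createTask("client-id-123"); // e.g. after a timeout
```

Nothing in this bookkeeping depends on the rest of the task lifecycle, which is the group's point about idempotency being separable from tasks.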

Reference Implementation Evolution

The Python reference implementation from #1391 became complex after incorporating feedback, particularly around the task broker API. For a v0 implementation, a simpler approach seemed more appropriate, leading to the current TypeScript reference implementation.

The core client changes (excluding server-to-client task augmentation) primarily involve splitting RPC methods into beginX variants that return PendingRequest objects. This simplified scope makes it feasible to create a streamlined Python implementation for real-world testing with the HuggingFace Jobs API (see Follow-ups).
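A minimal sketch of the beginX split might look like the following. The shape of PendingRequest (a taskId plus a result() accessor), the client-side ID generation, and the ToolTransport and SketchRpcClient names are assumptions for illustration; the reference implementation may differ.

```typescript
// Hypothetical shape of the handle returned by a beginX method.
interface PendingRequest<T> {
  taskId: string;
  result(): Promise<T>;
}

// Stand-in for whatever actually sends the request over the wire.
type ToolTransport = (toolName: string, args: object) => Promise<string>;

class SketchRpcClient {
  private transport: ToolTransport;
  private nextId = 0;

  constructor(transport: ToolTransport) {
    this.transport = transport;
  }

  // Starts the call and returns immediately with a handle.
  beginCallTool(toolName: string, args: object): PendingRequest<string> {
    this.nextId += 1;
    const taskId = `task-${this.nextId}`;
    const outcome = this.transport(toolName, args);
    return { taskId, result: () => outcome };
  }

  // The original blocking-style method becomes a thin wrapper.
  callTool(toolName: string, args: object): Promise<string> {
    return this.beginCallTool(toolName, args).result();
  }
}

const fakeTransport: ToolTransport = (toolName) =>
  Promise.resolve(`${toolName} finished`);
const rpcClient = new SketchRpcClient(fakeTransport);
const pending = rpcClient.beginCallTool("train_model", {});
```

Callers that do not care about tasks keep using callTool unchanged, while task-aware callers hold on to the PendingRequest and await result() when they need the value.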

Follow-ups

Integration and Testing:

  • @lucalc will collaborate with @evalstate on integrating with the HuggingFace Jobs API to validate the design's usability in practice
  • @pwwpche will reach out to @ihrpr about engaging other SDK and client developers on the beginX method splitting pattern

Specification Updates:

  • DONE Add a consistency requirement that any task retrievable via tasks/get must appear in tasks/list for that client
  • DONE Define a mechanism (capabilities or annotations) to signal which operations support task augmentation, avoiding unnecessary augmentation attempts
    • DONE Using both capabilities and tool annotations to signpost this
  • PENDING Reconsider the coupling between idempotency and tasks. At minimum, make idempotency a MAY requirement rather than a MUST. Ideally, extract generally useful concepts (idempotency, message association) into separate features that don't require full task support
    • DONE Task idempotency is no longer required
