SEP-2549: TTL for List Results #2549

Open
CaitieM20 wants to merge 4 commits into modelcontextprotocol:main from CaitieM20:ttlSEP

Conversation

@CaitieM20
Contributor

Motivation and Context

This SEP proposes adding an optional ttl (time-to-live) field to the result objects returned by tools/list, prompts/list, resources/list, and resources/templates/list. The TTL tells clients how long the response may be considered fresh before re-fetching. This allows clients to cache feature lists and poll on a predictable schedule, reducing reliance on server-push list_changed notifications while remaining fully backward compatible. TTL supplements rather than replaces the existing notification mechanism — both can coexist.

See SEP for more details.

How Has This Been Tested?

Not yet

Breaking Changes

No

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

This is part of the list of transport priorities agreed upon by core maintainers in the December 2025 blog post.

@CaitieM20 requested a review from a team as a code owner April 9, 2026 19:47
@CaitieM20 changed the title from "SEP-XXXX: TTL for List Results" to "SEP-2549: TTL for List Results" Apr 9, 2026
@CaitieM20 requested a review from a team as a code owner April 9, 2026 19:50
@CaitieM20 added the SEP and draft labels Apr 9, 2026
@CaitieM20 added the transport label Apr 9, 2026
@CaitieM20 self-assigned this Apr 9, 2026
@CaitieM20 requested a review from kurtisvg April 9, 2026 19:51

Contributor

@pja-ant left a comment

Looks good! Thanks for putting this together so quickly. A few relatively minor things inline.


```typescript
/** @internal */
export interface PaginatedResult extends Result {
```

Contributor

While convenient (as all list results extend this), I feel like this belongs in a different interface. Not all paginated things may have TTLs and if we introduce something in future that wants pagination but no TTL (e.g. a tool result returning a paginated list of things) then I think we'd regret this.

Contributor Author

totally fair, I can pull this out of the paginated interface

```typescript
   * seconds after receiving the response. The client SHOULD NOT re-fetch
   * before the TTL expires unless it receives a list_changed notification.
   */
  ttl?: number;
```

Contributor

nit: ttlSeconds to encode units in schema?

Sadness: I just noticed that Task has ttl (and pollInterval) and it is milliseconds :(

(cc @LucaButBoring - may want to consider fixing this while it is experimental...)

```typescript
}
```

> **Open Question — TTL format**: An alternative representation is an ISO 8601 duration string (e.g., `"PT5M"` for 5 minutes). Integer seconds are simpler, consistent with HTTP `max-age`, and easier to compare arithmetically. ISO 8601 durations are more human-readable and used in some Azure/AWS APIs. Community input is welcome on which format to adopt. The remainder of this specification uses integer seconds for illustration.

Contributor

seconds are good IMO - keeps it simple
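
To make the trade-off in the open question concrete, here is a minimal sketch of the extra step a client would need if the TTL were an ISO 8601 duration string. `durationToSeconds` is a hypothetical helper, not part of the SEP or any MCP SDK, and it assumes only the time part of a duration (hours, minutes, whole seconds); date components, weeks, and fractional values are deliberately left out.

```typescript
// Hypothetical helper: converts a restricted ISO 8601 duration such as
// "PT5M", "PT1H30M" or "PT45S" into integer seconds.
// Returns undefined for anything outside that subset.
function durationToSeconds(duration: string): number | undefined {
  const match = /^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$/.exec(duration);
  // Reject non-matching input and the bare "PT", which carries no component.
  if (match === null || duration === "PT") {
    return undefined;
  }
  const hours = Number(match[1] ?? 0);
  const minutes = Number(match[2] ?? 0);
  const seconds = Number(match[3] ?? 0);
  return hours * 3600 + minutes * 60 + seconds;
}

// With integer seconds the value can be compared directly; with a duration
// string the client has to parse (and handle parse failures) first.
const ttlAsSeconds = 300;
const ttlAsDuration = durationToSeconds("PT5M");
console.log(ttlAsDuration === ttlAsSeconds);
```

Either format carries the same information; the sketch only illustrates why integer seconds are described above as easier to compare arithmetically.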


## Abstract

This SEP proposes adding an optional `ttl` (time-to-live) field to the result objects returned by `tools/list`, `prompts/list`, `resources/list`, and `resources/templates/list`. The TTL tells clients how long the response may be considered fresh before re-fetching. This allows clients to cache feature lists and poll on a predictable schedule, reducing reliance on server-push `list_changed` notifications while remaining fully backward compatible. TTL supplements rather than replaces the existing notification mechanism — both can coexist.

Contributor

Should we consider resources/read here also, so that you don't need to use resource subscriptions to check resource freshness?

I was going to say exactly the same thing.

The overall goal here is to make notifications (which essentially require an SSE stream) a purely optional optimization. That implies that we should attach TTLs to everything that currently has a notification. Looking at the schema, I see the following notification types that I think should be included in this SEP:

  • ResourceListChangedNotification: already covered here
  • PromptListChangedNotification: already covered here
  • ToolListChangedNotification: already covered here
  • ResourceUpdatedNotification: we should add this

I think the remaining notification types should not be added here, for the following reasons:

  • InitializedNotification: going away in SEP-1442
  • CancelledNotification: not relevant (and shouldn't be needed anywhere but the stdio transport once we have MRTR)
  • RootsListChangedNotification: not relevant (and since this is a client-generated notification, we should probably remove this in a post-MRTR world)
  • ProgressNotification: not relevant (might go away in favor of tasks in the long run)
  • TaskStatusNotification: not relevant (and may get removed as we revamp tasks?)
  • LoggingMessageNotification: not relevant (and may get removed?)
  • ElicitationCompleteNotification: not relevant (and maybe we should remove this as part of MRTR, unless we decide to keep URL elicitations in the ephemeral workflow)

If this is the goal, have we considered a field where the server would return a server version per request? This would allow the client to invalidate its cache when a new server version is detected, which would be more direct than a TTL.

Server versioning is a big subject with various pitfalls, but assuming that most tools stay relatively constant between versions, detecting an updated version would likely let the client adapt seamlessly in most cases.

The main issue generally with a TTL is that it will be hard to set except in cases where servers deploy consistently at a given time.


### No new capability flag

No new capability flag is needed. The `ttl` field is optional on the response object. Servers that do not wish to provide a TTL simply omit the field. Clients that do not understand the field ignore it per standard JSON handling of unknown properties.

Contributor

> Clients that do not understand the field ignore it per standard JSON handling of unknown properties.

I think this might want to be left out -- it implies clients can ignore it which contradicts the previous SHOULD's

Contributor Author

removed

### Error handling

- If `ttl` is present but is not a non-negative integer, the client SHOULD ignore it and behave as if it were absent.
- Clients MUST NOT treat a missing `ttl` as an implicit TTL of 0 or any other value.

Contributor

nit: SHOULD NOT?

I don't think we can stop them...

Contributor Author

whoops, got too many double negatives there and it's confusing, rephrased. The goal is to say that if it's a negative integer, clients SHOULD ignore it.
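
The rule discussed in this thread (an invalid `ttl` is ignored, and a missing `ttl` gets no implicit default) could be sketched on the client side roughly as follows. `parseTtl` is a hypothetical name used for illustration only; it is not an SDK function, and the final wording of the SEP may differ from the hunk quoted above.

```typescript
// Hypothetical client-side helper; the name is illustrative, not an SDK API.
// Returns the TTL in seconds, or undefined when the field is missing or invalid.
function parseTtl(result: { ttl?: unknown }): number | undefined {
  const ttl = result.ttl;
  // Number.isInteger rejects NaN, Infinity and fractional values;
  // the typeof check rejects strings and other non-number JSON values.
  if (typeof ttl === "number" && Number.isInteger(ttl) && ttl >= 0) {
    return ttl;
  }
  // No implicit default: an absent or invalid ttl stays "absent".
  return undefined;
}

console.log(parseTtl({ ttl: 60 }));
console.log(parseTtl({ ttl: -5 }));
```

Returning `undefined` rather than `0` keeps "no TTL provided" distinguishable from "do not cache", which is the distinction the second bullet is asking clients to preserve.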

@markdroth left a comment

Thanks for writing this up, Caitie! Overall, I think this is definitely the right direction.


This approach has several limitations:

1. **Stateless and HTTP-based transports**: Clients communicating over stateless transports (e.g., pure HTTP request/response without SSE or WebSocket) cannot receive server-push notifications. These clients have no guidance on when to re-poll and must either poll excessively or risk stale data.

To be fair, the current notification mechanism does work with SSE streams. However, SSE streams are a problem to support in a lot of environments, for the same reasons that we argued in the MRTR SEP. So it might make sense to explicitly say here that we want to make SSE streams a purely optional part of the protocol, used only as an optimization.

Contributor Author

yup fair, updated to clarify

* SHOULD NOT serve a cached copy.
* - If positive, the client SHOULD consider the list fresh for this many
* seconds after receiving the response. The client SHOULD NOT re-fetch
* before the TTL expires unless it receives a list_changed notification.

I think we should also make allowances for the client to re-fetch even before the TTL has expired if it has some other reason to believe that the list has been invalidated. For example, if the client makes a tool call and gets a result back with isError set to true, that indicates some sort of validation error, which might be caused by the tool schema having changed since the last time the tool list was fetched.

Contributor Author

added


Clients SHOULD NOT treat TTL as a polling interval that triggers automatic background refetches. The TTL is a **freshness hint**: the client checks freshness when it needs the list, and re-fetches only if stale. Implementations that do choose to poll SHOULD apply jitter and backoff.
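
As an illustration of the "freshness hint" reading, here is a minimal sketch of a client-side cache that checks freshness only when the list is requested and never polls in the background. All names (`ListCache`, `freshness`, `items`) are hypothetical; real list results carry `tools`, `prompts`, and so on rather than `items`. The sketch also assumes that a list without a TTL keeps its pre-TTL behavior (served from cache until explicitly invalidated), which is one possible reading and not something the SEP text quoted here prescribes.

```typescript
interface ListResult<T> {
  items: T[];    // stand-in for tools / prompts / resources
  ttl?: number;  // seconds, assumed already validated
}

interface CachedList<T> {
  items: T[];
  receivedAtMs: number;            // when the response arrived
  ttlSeconds: number | undefined;  // undefined when the server sent no TTL
}

type Freshness = "fresh" | "stale" | "unknown";

// Pure freshness check: "unknown" means the server gave no TTL.
function freshness<T>(entry: CachedList<T> | undefined, nowMs: number): Freshness {
  if (entry === undefined) return "stale";
  if (entry.ttlSeconds === undefined) return "unknown";
  const ageMs = nowMs - entry.receivedAtMs;
  // A ttl of 0 is never fresh, so the cached copy is never served.
  return ageMs < entry.ttlSeconds * 1000 ? "fresh" : "stale";
}

class ListCache<T> {
  private entry: CachedList<T> | undefined = undefined;
  private readonly fetchList: () => Promise<ListResult<T>>;
  private readonly now: () => number;

  constructor(fetchList: () => Promise<ListResult<T>>, now: () => number = Date.now) {
    this.fetchList = fetchList;
    this.now = now;
  }

  // Call on list_changed, or when the client has another reason to believe
  // the list is outdated (for example, an unexpected tool-call error).
  invalidate(): void {
    this.entry = undefined;
  }

  // Freshness is evaluated lazily, at the moment the list is needed.
  async get(): Promise<T[]> {
    const state = freshness(this.entry, this.now());
    if (this.entry !== undefined && state !== "stale") {
      return this.entry.items; // "fresh", or "unknown" under the assumption above
    }
    const result = await this.fetchList();
    this.entry = { items: result.items, receivedAtMs: this.now(), ttlSeconds: result.ttl };
    return result.items;
  }
}
```

Jitter and backoff are omitted because this sketch never polls; they would only matter for implementations that choose to refresh on a timer.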

### Interaction with `list_changed` notifications

Have we considered how this will affect the SDK APIs? There probably aren't any major problems here, but let's do our due diligence.

I know that (e.g.) the Python SDK automatically fetches the tool list if not cached before sending a tool call, so that it can perform input schema validation. That part can obviously just look at the TTL to determine when to refresh the tool list.

However, I think there's also a direct API call to fetch the tool list and return it to the application. I guess we'd need to change that API to return the cached tool list if we haven't yet hit the TTL, right? Or would we want it to proactively re-fetch, because it was explicitly asked to do so by the application?

What happens if the application is expecting that the tool list it was given by the SDK remains valid until it receives a list changed notification, but the server doesn't support list changed notifications? Given that tool list notifications are optional even today, I guess this is already possible, but I'm not sure how the SDKs handle this -- we should make sure there aren't any surprises here.


When a list result includes `nextCursor` (indicating more pages), the `ttl` applies to the **entire paginated list**, not to individual pages. Specifically:

- The TTL SHOULD appear on every page with the same value. Clients SHOULD use the TTL from the last page they fetched to determine freshness.

Hmm. What happens if a client takes its sweet time fetching all the pages? Let's say the TTL is 1 hour, and it takes 45 mins for the client to fetch all the pages. It seems like this might artificially inflate the TTL.

Actually, what happens if the list gets updated between fetches of different pages? Presumably this is a problem even today, but how does the client know that the two pages actually come from two different lists? I'm wondering if we should have a generation ID on the paginated response, so that the client can tell if the list changes in the middle of fetching the pages.

(I realize that this may be slightly tangential to the main point of this SEP, but we should at least consider if there are problems here.)
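
The arithmetic behind the "inflated TTL" concern can be spelled out in a short sketch. Both function names are hypothetical. The first follows the quoted rule that freshness is measured from the last page fetched; the second shows one possible alternative (anchoring to the first page) purely for comparison, and is an assumption, not something the SEP proposes.

```typescript
// Age, in seconds, of the first page at the moment the list is finally
// treated as stale, when the TTL clock starts at the last page.
function firstPageAgeAtExpiry(ttlSeconds: number, paginationSeconds: number): number {
  return paginationSeconds + ttlSeconds;
}

// Hypothetical alternative: start the TTL clock at the first page, so the
// time spent paginating counts against the TTL.
function remainingIfAnchoredToFirstPage(ttlSeconds: number, paginationSeconds: number): number {
  return Math.max(0, ttlSeconds - paginationSeconds);
}

// Example from the comment: TTL of 1 hour, 45 minutes spent fetching pages.
const ttl = 3600;
const paginating = 45 * 60;
console.log(firstPageAgeAtExpiry(ttl, paginating));           // 6300 s, i.e. 1 h 45 min
console.log(remainingIfAnchoredToFirstPage(ttl, paginating)); // 900 s, i.e. 15 min
```

Neither variant addresses the second question, whether pages fetched at different times belong to the same version of the list; that would need something like the generation ID suggested above.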


Integer seconds is the most common representation across these systems.

### Why not use HTTP caching directly?

Member

At one point we were considering ETags. Did we discard that approach?

The reason I liked it is it means if tool descriptions haven't changed for weeks, you don't have to refetch at all. It also gives a good indicator of whether the client has actually refetched, so it makes it easier for a server to reject requests if it really cares.

A TTL is conceptually simpler, and maybe has nicer failure modes (e.g. does setting an ETag and never rejecting mean a client never re-lists? is that what you want?), but doesn't give the same level of control to server authors.

Contributor

We discussed in transports and we just wanted to keep things simple for now since we're close to deadline. Could add the ETag stuff next version.

Labels

draft (SEP proposal with a sponsor), SEP, transport (Related to MCP transports)

6 participants