
feat(messages): implement prompt caching metrics tracking#5469

Merged
cdoern merged 2 commits into llamastack:main from cdoern:feat/messages-cache-metrics
Apr 14, 2026

Conversation


@cdoern cdoern commented Apr 7, 2026

Summary

Implements prompt caching metrics tracking for the Anthropic Messages API by mapping OpenAI's cache metrics to Anthropic's cache fields.

Changes

  • Maps usage.prompt_tokens_details.cached_tokens to cache_read_input_tokens in non-streaming responses
  • Maps cache metrics in streaming responses via the final MessageDeltaEvent
  • cache_creation_input_tokens remains None (OpenAI does not provide this metric)

Implementation

  • Updated _openai_to_anthropic() to extract and map cache metrics
  • Updated _stream_openai_to_anthropic() to track and emit cache metrics
  • Added defensive checks for prompt_tokens_details existence

Testing

Added 2 unit tests:

  • test_cache_metrics_mapping - verifies cache metrics are properly mapped when present
  • test_cache_metrics_missing - verifies graceful handling when cache metrics are absent

All 19 unit tests pass.
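The two cases could be sketched in pytest style as follows. `extract_cached_tokens` here is a stand-in for the converter's extraction logic; the actual tests exercise the real `_openai_to_anthropic()` converter with full response objects.

```python
from types import SimpleNamespace

def extract_cached_tokens(usage):
    # Stand-in for the converter's defensive cache-metric extraction
    details = getattr(usage, "prompt_tokens_details", None)
    return getattr(details, "cached_tokens", None) if details is not None else None

def test_cache_metrics_mapping():
    # Cache metrics present: value is carried through
    usage = SimpleNamespace(prompt_tokens_details=SimpleNamespace(cached_tokens=42))
    assert extract_cached_tokens(usage) == 42

def test_cache_metrics_missing():
    # Cache metrics absent: handled gracefully, no AttributeError
    usage = SimpleNamespace(prompt_tokens_details=None)
    assert extract_cached_tokens(usage) is None
```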

Test Plan

uv run pytest tests/unit/providers/inline/messages/ -v

🤖 Generated with Claude Code

Add support for mapping OpenAI's cache metrics to Anthropic's cache fields.
Maps usage.prompt_tokens_details.cached_tokens to cache_read_input_tokens
in both non-streaming and streaming responses. cache_creation_input_tokens
remains None as OpenAI does not provide this metric.

Includes unit tests for both scenarios (cache metrics present and missing).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label Apr 7, 2026
@cdoern cdoern marked this pull request as ready for review April 7, 2026 19:16
@cdoern cdoern enabled auto-merge April 14, 2026 19:54
@cdoern cdoern added this pull request to the merge queue Apr 14, 2026
Merged via the queue into llamastack:main with commit bc61421 Apr 14, 2026
64 checks passed
@cdoern cdoern deleted the feat/messages-cache-metrics branch April 14, 2026 20:12
