fix(litellm): store bookkeeping span off-band, not in forwarded metadata #6598
jgreer013 wants to merge 2 commits into
The LiteLLM integration stored its live Span via `metadata["_sentry_span"] = span`. litellm threads the caller's `metadata` dict through to `litellm_params`, and some providers (e.g. Anthropic's /v1/messages passthrough) forward that dict into the outbound request body, so `json.dumps(request_body)` raised `TypeError: Object of type Span is not JSON serializable` before the request was even sent. The span (which holds verbatim prompts under send_default_pii) could also leak to the provider.

Store the span off-band in a module-level registry keyed by `litellm_call_id` (falling back to the callback kwargs identity) instead of inside any litellm-visible dict, and remove the entry in the terminal success/failure callback.

Fixes getsentry#6596

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
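The failure mode described in this commit message can be reproduced without litellm or sentry_sdk. The sketch below is only an illustration: `Span` is a hypothetical empty stand-in class, and `request_body`, `caller_metadata`, and `off_band_spans` are made-up names, not code from this PR. It shows that once a non-serializable object is written into a dict that is shared with the request body, `json.dumps` fails, and that keeping the object in a separate mapping avoids the problem.

```python
import json


class Span:
    """Hypothetical stand-in for a live tracing span (not sentry_sdk's class)."""


# The caller's metadata dict. The same object is embedded in the outbound
# body, mirroring the sharing described in the commit message.
caller_metadata = {"user_id": "u-123"}
request_body = {"model": "example-model", "metadata": caller_metadata}

# Old behaviour: bookkeeping state is written into the shared dict.
caller_metadata["_sentry_span"] = Span()

try:
    json.dumps(request_body)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Off-band alternative: keep the span in a separate mapping, so the shared
# dict only ever holds the caller's own JSON-friendly values.
del caller_metadata["_sentry_span"]
off_band_spans = {"call-1": Span()}
serialized = json.dumps(request_body)
print(serialized)
```

Because the span never enters `request_body`, nothing that serializes or forwards the body can see it.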
Address self-review: the module-level span registry was only evicted by the terminal success/failure callback, so a call abandoned before a terminal callback fires (e.g. a stream the caller stops iterating) leaked its Span entry -- holding prompt data -- for the process lifetime. The prior kwargs-scoped storage was GC'd with the request, so this was a regression.

Back the registry with an OrderedDict capped at _MAX_TRACKED_SPANS and evict oldest-first in _store_span, so abandoned calls cannot grow it unbounded. A WeakValueDictionary is not an option here: Span/Transaction objects are not weakly referenceable.

Add tests for the bound, terminal-callback cleanup, and the litellm_call_id-absent fallback key; correct the registry comment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Force-pushed from 6fb4fa5 to 5a5f39b.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 5a5f39b.
```python
_spans_by_call[key] = span
_spans_by_call.move_to_end(key)
while len(_spans_by_call) > _MAX_TRACKED_SPANS:
    _spans_by_call.popitem(last=False)
```
Registry eviction skips span exit
Medium Severity
When _store_span caps the off-band registry, evicted entries are removed with popitem without calling __exit__ on the stored span, even though _input_callback already entered it. A request whose mapping is evicted while still in flight then hits _peek_span as None on success and returns early, so the span never finishes and completion data is skipped under sustained concurrent load.


Summary
Fixes #6596 (Linear PY-2540).
With `LiteLLMIntegration` enabled, any call that passes caller `metadata` could crash during request serialization with `TypeError: Object of type Span is not JSON serializable`.

Root cause
`_input_callback` stored the live span via `_get_metadata_dict(kwargs)["_sentry_span"] = span`. litellm threads the caller's `metadata` dict through to `litellm_params["metadata"]` (it is the same dict object the caller passed), and some providers (e.g. Anthropic's `/v1/messages` passthrough) forward that dict into the outbound request body. So the live `Span` landed at `request_body["metadata"]["_sentry_span"]` and `json.dumps(request_body)` failed before the request was sent.

Separately, under `send_default_pii=True` + `include_prompts=True`, that span's `gen_ai.request.messages` holds the verbatim prompt, so any sink that serialized the injected metadata could leak prompt content to the provider.

Fix
Store the bookkeeping span off-band in a module-level registry keyed by `litellm_call_id` (a per-request UUID that is stable across the input/success/failure callbacks), falling back to the identity of the shared callback `kwargs` dict for direct callback invocations that omit it. The span no longer lives in any litellm-visible dict, so it can't be forwarded, serialized, or deep-copied by litellm, which fixes both the crash and the prompt-leak vector. The registry entry is removed by the terminal success/failure callback (streaming success peeks and pops only on the final call).

Testing
`test_caller_metadata_stays_json_serializable` asserts the forwarded `metadata` stays free of `_sentry_span` and JSON-serializable at the serialization moment, while the span is still recorded off-band. It fails on `master` and passes with this change. `mypy` clean; `ruff` check + format clean.

Related (downstream, defense-in-depth)
litellm forwards/serializes this injected metadata into the provider body. Companion issue: BerriAI/litellm#30662. litellm has stripped such span objects in other logging paths (BerriAI/litellm#15728, BerriAI/litellm#12354).