Make e2e tests deterministic for provider-proxy cache#7248
Closed
AntoineToussaint wants to merge 6 commits intomainfrom
Closed
Make e2e tests deterministic for provider-proxy cache#7248AntoineToussaint wants to merge 6 commits intomainfrom
AntoineToussaint wants to merge 6 commits intomainfrom
Conversation
Replace random values (UUIDs, random integers) in LLM provider request bodies with fixed strings. This makes provider-proxy cache keys deterministic across test runs, enabling read-only cache mode in CI. Changes: - providers/common.rs: Remove UUID from bad auth test prompt - providers/anthropic.rs: Remove UUID from thinking test prompt - aggregated_response/mod.rs: Remove UUID suffix from prompt - otel.rs: Use fixed string for tag value - raw_response/embeddings.rs: Use fixed string for embedding input - test_client.py: Remove UUID from extra headers test prompt - test_embeddings.py: Use fixed strings for cache test inputs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These tests need random inputs to avoid hitting the internal Valkey cache from previous runs. The provider-proxy cache will need body sanitization to handle these — they can't be made deterministic without breaking the cache-testing logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use cache_options.enabled = "write_only" for the first request to bypass Valkey cache reads while still populating the cache for the second request. This makes the tests deterministic without needing random input text. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use fixed ports (19876, 19877) for test image servers instead of OS-assigned port 0, making image URLs in provider requests deterministic - Add raw.githubusercontent.com to no_proxy list so image fetches bypass the provider-proxy (they're not provider API calls) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aaron1011
reviewed
Apr 9, 2026
| tags: HashMap::from([ | ||
| ("first_tag".to_string(), "first_value".into()), | ||
| ("second_tag".to_string(), "second_value".into()), | ||
| ("user_id".to_string(), Uuid::now_v7().to_string()), |
Member
There was a problem hiding this comment.
I don't think this should be needed, since the tags don't get included in model inference requests
Member
|
Would you mind splitting out the changes to tensorzero cache tests (e.g. |
- Use separate payload with "enabled": "on" for the second (cached) request in embeddings test (first request uses "write_only") - Revert providers/common.rs image server to port 0 (uses fetch=true, so URL doesn't reach provider) - Use port 0 for fetch_true image test, fixed port 19876 only for fetch_false (where URL IS in the provider request) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per review feedback, revert the embeddings cache test changes (write_only approach) so they can be reviewed separately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Make e2e tests deterministic so that provider-proxy cache keys are the same across test runs. This is a prerequisite for enabling
read-only-require-hitmode in CI (#7205).Changes
Random values in prompts → fixed strings
Uuid::now_v7()from bad auth test promptUuid::now_v7()from thinking test promptuuid7()from extra headers test promptEmbeddings cache tests → use
write_onlymodecache_options.enabled = "write_only"for the first request to bypass Valkey reads while still populating cache for the second requestImage URL tests → fixed ports
GitHub file fetches → bypass proxy
raw.githubusercontent.comtono_proxylist (file fetches, not provider API calls)Audit of remaining random values
There are ~1200 remaining
Uuid::now_v7()calls in e2e tests. None of them affect provider-proxy cache keys because they are all:episode_id/inference_id: TensorZero metadata, never included in the request body sent to LLM providersdummy::models: Don't go through the provider-proxy at allTest plan
cargo clippypasses🤖 Generated with Claude Code