feat: add automatic key failover for AI Bridge OpenAI #24847
Merged
pawbana
reviewed
May 5, 2026
pawbana
approved these changes
May 7, 2026
// Then: 1 request, 429 response, no failover, upstream
// Retry-After propagated to the client.
name: "byok_no_failover",
byokKey: "user-byok",
Should keys be defined here? It would test that BYOK has precedence over keys.
},
expectedRequestCount: 3,
expectedSeenKeys: []string{"k0", "k0", "k1"},
expectedStatusCode: http.StatusTooManyRequests,
Retry-After could be checked?
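One way the Retry-After suggestion above could be covered is sketched below. It is only an illustration: `rateLimited`, `checkResponse`, the request path, and the header value `"7"` are hypothetical stand-ins, not code from this PR. It records a handler's response with `httptest.NewRecorder` and asserts on both the status code and the `Retry-After` header.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// rateLimited is a hypothetical stand-in for the handler under test:
// it answers 429 and propagates a Retry-After value of 7 seconds.
func rateLimited(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Retry-After", "7")
	w.WriteHeader(http.StatusTooManyRequests)
}

// checkResponse runs one request through h and compares the recorded
// status code and Retry-After header against the expected values.
func checkResponse(h http.Handler, wantStatus int, wantRetryAfter string) error {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/v1/chat/completions", nil)
	h.ServeHTTP(rec, req)

	resp := rec.Result()
	defer resp.Body.Close()

	if resp.StatusCode != wantStatus {
		return fmt.Errorf("status: got %d, want %d", resp.StatusCode, wantStatus)
	}
	if got := resp.Header.Get("Retry-After"); got != wantRetryAfter {
		return fmt.Errorf("Retry-After: got %q, want %q", got, wantRetryAfter)
	}
	return nil
}

// demoCheck expects the values the stand-in handler actually sends.
func demoCheck() error {
	return checkResponse(http.HandlerFunc(rateLimited), http.StatusTooManyRequests, "7")
}

// demoMismatch expects a different Retry-After value and should fail.
func demoMismatch() error {
	return checkResponse(http.HandlerFunc(rateLimited), http.StatusTooManyRequests, "30")
}

func main() {
	fmt.Println("match:", demoCheck())
	fmt.Println("mismatch:", demoMismatch())
}
```

In a table-driven test this would amount to one extra expected field per case, compared after the status code.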
}

// mockServerProxier is a test implementation of mcp.ServerProxier.
type mockServerProxier struct {
Could be moved to testutil package.
// stubToolCaller is a minimal mcp.ToolCaller that returns a fixed
// text result, so the agentic continuation can proceed.
type stubToolCaller struct{}
Copy paste from Anthropic tests.
expectedNil bool
expectedStatus int
expectedRetryAfter time.Duration
}{
A test case with a nil error could be added.

Description
Adds automatic key failover for the centralized OpenAI provider, covering both the chat completions and responses APIs. Same shape as the Anthropic PR: each upstream call walks the configured key pool, and keys are marked temporarily unavailable on 429 (with cooldown from Retry-After) and permanently unavailable on 401/403. Each agentic-loop iteration gets its own fresh walker so a tool-call continuation can fail over independently of the initial request.

BYOK is unchanged: BYOK requests run as a single attempt with no failover.
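The walk described above can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: `keyState`, `keyPool`, `upstreamCall`, `doWithFailover`, the one-minute fallback cooldown, and the behaviour when every key is exhausted are all assumptions made for the example. It also handles only the delta-seconds form of `Retry-After`.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// keyState is a hypothetical per-key record.
type keyState struct {
	key           string
	permanent     bool      // set after 401/403
	cooldownUntil time.Time // set after 429
}

// usable reports whether the key may be tried at time now.
func (k *keyState) usable(now time.Time) bool {
	return !k.permanent && !now.Before(k.cooldownUntil)
}

// keyPool is a hypothetical ordered pool of upstream keys.
type keyPool struct{ keys []*keyState }

func newKeyPool(keys ...string) *keyPool {
	p := &keyPool{}
	for _, k := range keys {
		p.keys = append(p.keys, &keyState{key: k})
	}
	return p
}

// retryAfter parses a delta-seconds Retry-After value, or returns fallback.
func retryAfter(h http.Header, fallback time.Duration) time.Duration {
	if secs, err := strconv.Atoi(h.Get("Retry-After")); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second
	}
	return fallback
}

var errKeysExhausted = errors.New("all keys exhausted")

// upstreamCall performs one upstream request with the given key.
type upstreamCall func(key string) (status int, header http.Header)

// doWithFailover walks the pool once: 429 puts the key on cooldown,
// 401/403 marks it permanently failed, anything else ends the walk.
func doWithFailover(p *keyPool, now time.Time, call upstreamCall) (int, error) {
	last := 0
	for _, k := range p.keys {
		if !k.usable(now) {
			continue
		}
		status, h := call(k.key)
		last = status
		switch status {
		case http.StatusTooManyRequests:
			k.cooldownUntil = now.Add(retryAfter(h, time.Minute))
		case http.StatusUnauthorized, http.StatusForbidden:
			k.permanent = true
		default:
			return status, nil // success or a non-failover error
		}
	}
	return last, errKeysExhausted
}

// demo simulates k0 being rate limited, k1 being rejected, k2 succeeding.
func demo() (int, []string, error) {
	pool := newKeyPool("k0", "k1", "k2")
	var seen []string
	status, err := doWithFailover(pool, time.Now(), func(key string) (int, http.Header) {
		seen = append(seen, key)
		switch key {
		case "k0":
			return http.StatusTooManyRequests, http.Header{"Retry-After": {"30"}}
		case "k1":
			return http.StatusUnauthorized, nil
		}
		return http.StatusOK, nil
	})
	return status, seen, err
}

func main() {
	status, seen, err := demo()
	fmt.Println(status, seen, err)
}
```

Under the same assumptions, a BYOK request would bypass `doWithFailover` entirely and make a single call with the user's key.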
Changes
- config.OpenAI carries a KeyPool. Key remains for the BYOK Authorization Bearer, set per interception.
- newChatCompletionWithKeyFailover marks keys on key-specific failures and returns on first success or non-failover error.
- newResponseWithKeyFailover mirrors the chat completions implementation.

Related Issues
Related to: coder/internal#1446
Related to: https://linear.app/codercom/issue/AIGOV-197/aibridge-automatic-key-failover-for-bridged-and-passthrough-routes
Follow-up PRs
Note
Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira