
feat(session): add configurable fallback model chain#27939

Open
loss-and-quick wants to merge 3 commits into anomalyco:dev from loss-and-quick:feat/fallback-chain

Conversation

@loss-and-quick

Issue for this PR

Closes #7602

Related: refactors and extends the approach from #26292.

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Adds a configurable fallback chain: when the primary model returns a retryable error (rate limit, 5xx, overload, quota), the session switches to the next model in the chain instead of failing. The failed model is parked in a process-local cooldown so subsequent turns skip it until it recovers, then traffic flows back to primary on its own.

{
  "model": "anthropic/claude-sonnet-4-20250514",
  "fallbacks": ["openai/gpt-4.1", "deepseek/deepseek-v4"],
  "cooldown_seconds": 300
}

fallbacks can be set at the top level (applies to the main chat agent) or per-agent. cooldown_seconds defaults to 300. Quota errors (weekly / monthly / "exceeded your …") take a hardcoded 6h cooldown instead, and retry-after / retry-after-ms headers are honoured when the provider sends them.
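The cooldown bookkeeping described above can be sketched roughly like this (a minimal sketch — `CooldownTracker`, `markFailed`, and `pick` are illustrative names, not the PR's actual API):

```typescript
type ErrorKind = "rate_limit" | "server" | "overload" | "quota";

const DEFAULT_COOLDOWN_MS = 300_000;     // cooldown_seconds default (300s)
const QUOTA_COOLDOWN_MS = 6 * 3_600_000; // hardcoded 6h for quota errors

class CooldownTracker {
  private until = new Map<string, number>();

  markFailed(model: string, kind: ErrorKind, retryAfterMs?: number, now = Date.now()) {
    // A provider-sent retry-after / retry-after-ms wins when present;
    // otherwise quota errors park the model for 6h, everything else for the default.
    const ms = retryAfterMs ?? (kind === "quota" ? QUOTA_COOLDOWN_MS : DEFAULT_COOLDOWN_MS);
    this.until.set(model, now + ms);
  }

  // First model in the chain not currently cooling down; undefined if all are parked.
  pick(chain: string[], now = Date.now()): string | undefined {
    return chain.find((m) => (this.until.get(m) ?? 0) <= now);
  }
}
```

Since the map is process-local, cooldowns reset on restart, which matches the "parked in a process-local cooldown" behaviour.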

This is a continuation of #26292 with three correctness fixes that came up while running it:

  1. Model attribution actually updates. The original patch communicated "we used a fallback" by mutating the StreamInput object, but llm.ts spreads that object on the way in (run({ ...input, abort })) so the mutation never reached prompt.ts. As a result the DB kept the primary model, wasOnFallback was always false, the "Switched to …" toast fired on every subsequent turn, and the model name under each assistant message didn't change.

    Fix: withFallback publishes a FallbackUsed bus event instead. SessionProcessor.process subscribes for the lifetime of one process call and updates ctx.assistantMessage.modelID/providerID + publishes SessionEvent.Model.Updated. The subscription is released via Effect.ensuring. This also makes the flow work for subsessions and title generation, which both go through the same processor.

  2. Hang after the first fallback. If every model in the chain was on cooldown, pickStart returned a wait decision but the caller proceeded straight into deps.call(primary) without actually waiting — on some providers (notably self-hosted) this hung on a kept-alive socket. Replaced with a real Effect.sleep bounded by WAIT_CAP_MS = 30s, then re-pick the start entry.

  3. Toast spam. The original patch fired both FallbackTriggered and FallbackUsed toasts, and there is also an inline ~> Switching to … notice in the message stream — three notifications for one fallback. Kept only the FallbackTriggered warning toast; dropped the FallbackUsed info toast since the inline notice already shows the same info with attribution to the specific message.
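The spread-vs-mutation problem behind fix 1 can be reproduced in a few lines (shapes are hypothetical; `run` stands in for the llm.ts call site):

```typescript
interface StreamInput { modelID: string }

function run(input: StreamInput): StreamInput {
  const copy = { ...input };       // llm.ts does run({ ...input, abort })
  copy.modelID = "fallback-model"; // the fallback path's mutation lands on the copy
  return copy;
}

const original: StreamInput = { modelID: "primary-model" };
run(original);
// original.modelID is still "primary-model": the caller (prompt.ts) never sees
// the switch, which is why the PR moves attribution to a FallbackUsed bus event.
```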
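Fix 2's bounded wait can be sketched without the Effect machinery (a plain-Promise stand-in under assumed shapes — `pickWithWait` and `Decision` are illustrative names, and the PR uses Effect.sleep rather than setTimeout):

```typescript
type Decision = { kind: "model"; model: string } | { kind: "wait"; ms: number };

const WAIT_CAP_MS = 30_000;

async function pickWithWait(pickStart: () => Decision): Promise<string> {
  let decision = pickStart();
  if (decision.kind === "wait") {
    // Previously the caller fell straight through to deps.call(primary) here;
    // now we actually sleep, capped at 30s, then re-pick the start entry.
    const waitMs = Math.min(decision.ms, WAIT_CAP_MS);
    await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
    decision = pickStart();
  }
  if (decision.kind === "wait") throw new Error("all models still cooling down");
  return decision.model;
}
```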

A few smaller things:

  • sync-v2.tsx handler for session.next.model.updated was using draft.find(...) (first assistant in the session) — changed to activeAssistant(draft) ?? [...].reverse().find(...) so it updates the current message.
  • Inline notices come in three kinds (using / switch / resume) with different colours so it's obvious whether we're starting on a fallback because primary is cooling down, switched mid-stream after an error, or returning to primary after recovery.
  • Title fallback uses agent.fallbacks only, not the top-level fallbacks. The top-level list is sized for the main chat model and is typically too expensive for the small title pass; users who want title fallbacks configure them on the title agent.
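The sync-v2.tsx change in the first bullet amounts to the following (message shapes and `activeAssistant` are hypothetical stand-ins for illustration):

```typescript
interface Msg { role: "user" | "assistant"; id: string }

// Stand-in for activeAssistant(draft): the message currently being streamed.
function activeAssistant(draft: Msg[]): Msg | undefined {
  const last = draft[draft.length - 1];
  return last?.role === "assistant" ? last : undefined;
}

function targetForModelUpdate(draft: Msg[]): Msg | undefined {
  // before: draft.find(m => m.role === "assistant") — first assistant, wrong target
  return activeAssistant(draft) ?? [...draft].reverse().find((m) => m.role === "assistant");
}
```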

How did you verify your code works?

  • Ran it locally for a few days against a GPT primary + GLM fallback. Forced primary failures by swapping in a bad API key and by overflowing context; observed correct switching, the correct model name in the message header after the switch, no toast spam on subsequent turns, and a "Switched back to …" notice once the primary recovered.

Screenshots / recordings


Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

Signed-off-by: minicx <minicx@disroot.org>
@github-actions
Contributor

The following comment was generated by an LLM and may be inaccurate:

Based on my search, I found one related PR that should be noted:

Related PR:

Other potentially relevant PRs (not duplicates):

The current PR (#27939) is a continuation/improvement of #26292, not a duplicate. It's the successor that fixes issues found in production use of the original feature.

@loss-and-quick
Author

@nexxeln, can you please review?

Fallback notices (e.g. 'Using GLM-4.5-Air while ... is cooling down')
are stored with ignored: true but were not filtered out when converting
assistant parts to model messages, causing them to leak into the LLM
context and be repeated in responses.
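A minimal sketch of that filter (the part shape is hypothetical):

```typescript
interface Part { type: string; text: string; ignored?: boolean }

// Drop ignored parts before converting assistant parts to model messages,
// so fallback notices never reach the LLM context.
function toModelText(parts: Part[]): string {
  return parts
    .filter((p) => !p.ignored) // fallback notices are stored with ignored: true
    .map((p) => p.text)
    .join("\n");
}
```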


Development

Successfully merging this pull request may close these issues.

[FEATURE]: Native Model Fallback / Failover Support
