fix(core): recover v2 context overflow by kitlangton · Pull Request #31003 · anomalyco/opencode

kitlangton · 2026-06-05T18:40:20Z

Why

V2 automatic compaction estimates request size before provider dispatch, but estimates and configured model limits can be wrong. A provider may still reject the real request after local preflight says it fits.

This PR recognizes provider context-overflow failures, performs one forced compaction, rebuilds the request from the completed checkpoint, and retries exactly once.

Invariant: one logical provider turn gets at most one overflow recovery.

Stage 1: Normalize provider overflow

Provider protocols report the same condition differently. The LLM package now adds an optional provider-neutral classification to existing error surfaces:

classification?: "context-overflow"

sequenceDiagram
  participant Provider
  participant Protocol
  participant LLM
  participant Runner
  Provider-->>Protocol: native error code or message
  Protocol-->>LLM: provider-error + context-overflow
  LLM-->>Runner: typed terminal failure

The shared classifier covers the existing V1 provider patterns and is reused by:

HTTP request failures through InvalidRequestReason
OpenAI Responses stream failures
Anthropic Messages stream failures
Bedrock Converse validation exceptions
Legacy V1 API error parsing

Ordinary invalid requests retain their existing behavior and are not compacted.

Compression line: provider-specific detection lives below Session orchestration.
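To illustrate the shared classifier, here is a minimal sketch. Only the optional `classification?: "context-overflow"` field comes from this PR description; `ProviderErrorInput`, `classifyProviderError`, and the specific codes and message patterns are hypothetical stand-ins, not the actual pattern list in the LLM package.

```typescript
// Provider-neutral classification, as described in this PR.
type Classification = "context-overflow";

// Hypothetical input shape: whatever a protocol adapter can extract
// from a native provider error.
interface ProviderErrorInput {
  code?: string; // provider-native error code, if any
  message?: string; // provider-native error message, if any
}

// Illustrative patterns only (assumptions); the real classifier covers
// the existing V1 provider patterns, which are not listed here.
const OVERFLOW_CODES = new Set<string>(["context_length_exceeded"]);
const OVERFLOW_MESSAGES: RegExp[] = [
  /prompt is too long/i,
  /maximum context length/i,
  /input is too long/i,
];

// Returns "context-overflow" when the error matches a known pattern,
// otherwise undefined so ordinary invalid requests keep their behavior.
function classifyProviderError(
  input: ProviderErrorInput,
): Classification | undefined {
  if (input.code !== undefined && OVERFLOW_CODES.has(input.code)) {
    return "context-overflow";
  }
  const message = input.message ?? "";
  return OVERFLOW_MESSAGES.some((pattern) => pattern.test(message))
    ? "context-overflow"
    : undefined;
}
```

Because the function returns `undefined` for anything it does not recognize, callers can attach the result directly to an existing error surface without changing how other failures are handled.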

Stage 2: Recover before publishing a failed assistant

The runner defers only the first context-overflow event, and only when it arrives before any durable assistant content exists. It then asks compaction to bypass the local estimate while preserving every real safety check.

sequenceDiagram
  participant Runner
  participant Provider
  participant Compactor
  participant History
  Runner->>Provider: original request
  Provider-->>Runner: context overflow
  Runner->>Compactor: compact(trigger = overflow)
  Compactor->>History: publish completed checkpoint
  Runner->>History: reload active context
  Runner->>Provider: rebuilt request
  Provider-->>Runner: answer

Overflow-triggered compaction bypasses:

The local pressure estimate, because the provider has already disproved it.
compaction.auto: false, because this path recovers an otherwise terminal provider failure.

It still requires:

A valid model context limit
Summarizable history
A summary request that fits the model window
Non-empty textual summary output

If compaction cannot complete, the original overflow is published normally and the identical request is not retried.
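The bypass and safety rules above could be sketched as a single gate. This is an assumption-laden illustration: `CompactionInput`, `shouldCompact`, `isUsableSummary`, and every field name are hypothetical, and only the rules themselves come from this description.

```typescript
type Trigger = "auto" | "overflow";

// Hypothetical input; field names are illustrative.
interface CompactionInput {
  trigger: Trigger;
  autoEnabled: boolean; // mirrors compaction.auto
  estimateExceeded: boolean; // local pressure estimate says "too big"
  contextLimit: number | undefined; // model context limit in tokens
  summarizableMessages: number; // history entries that can be summarized
  summaryRequestTokens: number; // size of the summary request itself
}

type Decision = { run: true } | { run: false; reason: string };

function shouldCompact(input: CompactionInput): Decision {
  // Only the automatic path consults compaction.auto and the estimate.
  // The overflow path skips both: the provider already rejected the request.
  if (input.trigger === "auto") {
    if (!input.autoEnabled) return { run: false, reason: "auto compaction disabled" };
    if (!input.estimateExceeded) return { run: false, reason: "estimate below threshold" };
  }
  // Safety checks apply to every trigger.
  if (input.contextLimit === undefined || input.contextLimit <= 0) {
    return { run: false, reason: "no valid model context limit" };
  }
  if (input.summarizableMessages === 0) {
    return { run: false, reason: "no summarizable history" };
  }
  if (input.summaryRequestTokens > input.contextLimit) {
    return { run: false, reason: "summary request does not fit the model window" };
  }
  return { run: true };
}

// Checked after the summary model responds: empty output is not a checkpoint.
function isUsableSummary(text: string): boolean {
  return text.trim().length > 0;
}
```

When the gate returns `run: false` for an overflow trigger, the caller would publish the original overflow rather than retry.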

Stage 3: Bound the retry

The existing safe-boundary retry reloads projected history and reconstructs the request after compaction. An internal retry marker consumes the overflow budget while unrelated epoch or model-selection retries preserve it.

sequenceDiagram
  participant Attempt1
  participant Checkpoint
  participant Attempt2
  Attempt1->>Checkpoint: complete overflow compaction
  Checkpoint-->>Attempt2: rebuild from durable context
  alt second attempt succeeds
    Attempt2-->>Attempt1: completed answer
  else second attempt overflows
    Attempt2-->>Attempt1: durable terminal failure
  end

The second overflow is never compacted again. This prevents loops caused by undocumented provider limits, incorrect catalog limits, or a checkpoint that still cannot fit.

Invariant: successful forced compaction consumes the only overflow retry.
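A minimal sketch of the bounded retry follows. `runTurn`, `AttemptResult`, and `TurnOutcome` are hypothetical names, and the callbacks are synchronous only to keep the example short; a real runner would be asynchronous and would also handle the unrelated epoch and model-selection retries, which are omitted here.

```typescript
type AttemptResult =
  | { kind: "answer"; text: string }
  | { kind: "context-overflow" };

type TurnOutcome =
  | { kind: "completed"; text: string; attempts: number }
  | { kind: "failed"; attempts: number };

// attempt: rebuilds the request from current history and dispatches it.
// compact: runs forced compaction; returns true only if a checkpoint completed.
function runTurn(
  attempt: () => AttemptResult,
  compact: () => boolean,
): TurnOutcome {
  let overflowBudget = 1; // one overflow recovery per logical provider turn
  let attempts = 0;
  while (true) {
    attempts += 1;
    const result = attempt();
    if (result.kind === "answer") {
      return { kind: "completed", text: result.text, attempts };
    }
    // Overflow with no budget left: terminal, never compact again.
    if (overflowBudget === 0) return { kind: "failed", attempts };
    // Compaction could not complete: publish the original overflow
    // instead of resending the identical request.
    if (!compact()) return { kind: "failed", attempts };
    // Successful forced compaction consumes the only overflow retry.
    overflowBudget -= 1;
  }
}
```

With a provider that always overflows, this loop makes exactly two attempts and one compaction before failing, which is the loop-prevention property described above.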

Stage 4: Preserve partial-output safety

Provider step-start is now lazy: it does not create a durable assistant until text, reasoning, tool activity, or terminal completion requires one.

sequenceDiagram
  participant Provider
  participant Publisher
  participant History
  Provider->>Publisher: step-start
  Note over Publisher: no durable assistant yet
  alt overflow before output
    Provider-->>Publisher: context overflow
    Publisher-->>History: no failed assistant persisted
  else text or tool activity begins
    Publisher->>History: start durable assistant
    Provider-->>Publisher: context overflow
    Publisher->>History: persist terminal failure, do not retry
  end

Once assistant output or tool activity is durable, recovery is disabled to avoid duplicate content or side effects.
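The lazy start can be pictured with a small publisher sketch. `LazyPublisher`, `ProviderEvent`, and the string-based `history` log are hypothetical simplifications for illustration; the actual publisher also handles reasoning, tool activity, and terminal completion.

```typescript
type ProviderEvent =
  | { type: "step-start" }
  | { type: "text"; text: string }
  | { type: "context-overflow" };

type Handling = "continue" | "recoverable" | "terminal";

class LazyPublisher {
  private assistantStarted = false;
  // Stand-in for durable history; each entry is one persisted record.
  readonly history: string[] = [];

  handle(event: ProviderEvent): Handling {
    switch (event.type) {
      case "step-start":
        // Lazy: nothing is persisted until real output arrives.
        return "continue";
      case "text":
        this.ensureAssistant();
        this.history.push(`text:${event.text}`);
        return "continue";
      case "context-overflow":
        // Before durable output: no failed assistant is persisted,
        // so the runner may still recover invisibly.
        if (!this.assistantStarted) return "recoverable";
        // After durable output: persist the failure and do not retry.
        this.history.push("failure:context-overflow");
        return "terminal";
    }
  }

  // Creates the durable assistant record on first real output only.
  private ensureAssistant(): void {
    if (this.assistantStarted) return;
    this.assistantStarted = true;
    this.history.push("assistant-start");
  }
}
```

The key design point is that `step-start` alone leaves `history` empty, so an early overflow leaves nothing to clean up or duplicate.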

Verification

118 focused LLM protocol/executor tests
88 focused Core Session runner tests
LLM typecheck
Core typecheck
OpenCode typecheck
JavaScript SDK typecheck
Workspace pre-push typecheck: 22/22 packages
File-scoped oxlint: no errors; existing warnings remain
git diff --check

Scope

Included:

Typed context-overflow classification
Shared provider overflow classifier
One forced compaction and rebuilt retry
Lazy assistant start for invisible pre-output recovery
Terminal failure on second overflow or partial output

Not included:

General provider retry policy
Rate-limit recovery
Manual compaction API
Metrics or observability
Alternate summary-model fallback

fix(core): recover v2 context overflow

f8f648e

github-actions Bot added the contributor label Jun 5, 2026

kitlangton closed this Jun 5, 2026

kitlangton reopened this Jun 5, 2026

kitlangton wants to merge 1 commit into dev from feat/core-v2-overflow-recovery
