fix(core): recover v2 context overflow #31003

Open
kitlangton wants to merge 1 commit into dev from feat/core-v2-overflow-recovery

Why

V2 automatic compaction estimates request size before provider dispatch, but estimates and configured model limits can be wrong. A provider may still reject the real request after local preflight says it fits.

This PR recognizes provider context-overflow failures, performs one forced compaction, rebuilds the request from the completed checkpoint, and retries exactly once.

Invariant: one logical provider turn gets at most one overflow recovery.

Stage 1: Normalize provider overflow

Provider protocols report the same condition differently. The LLM package now adds an optional provider-neutral classification to existing error surfaces:

classification?: "context-overflow"

sequenceDiagram
  participant Provider
  participant Protocol
  participant LLM
  participant Runner
  Provider-->>Protocol: native error code or message
  Protocol-->>LLM: provider-error + context-overflow
  LLM-->>Runner: typed terminal failure

The shared classifier covers the existing V1 provider patterns and is reused by:

  • HTTP request failures through InvalidRequestReason
  • OpenAI Responses stream failures
  • Anthropic Messages stream failures
  • Bedrock Converse validation exceptions
  • Legacy V1 API error parsing

Ordinary invalid requests retain their existing behavior and are not compacted.

Compression line: provider-specific detection lives below Session orchestration.
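
As a rough illustration of Stage 1, a shared classifier could look something like the sketch below. Every name here (`ProviderError`, `classifyProviderError`, the code set, and the message patterns) is hypothetical and chosen for illustration; the actual V1 pattern list and error types are not shown in this description.

```typescript
// Hypothetical shapes: none of these names are taken from the PR's code.
type Classification = "context-overflow";

interface ProviderError {
  status?: number;
  code?: string;
  message: string;
  classification?: Classification;
}

// Illustrative patterns only; the real list covers the existing V1 provider patterns.
const OVERFLOW_CODES = new Set<string>(["context_length_exceeded"]);
const OVERFLOW_MESSAGES: RegExp[] = [
  /context length/i,
  /prompt is too long/i,
  /input is too long/i,
];

function classifyProviderError(error: ProviderError): ProviderError {
  const byCode = error.code !== undefined && OVERFLOW_CODES.has(error.code);
  const byMessage = OVERFLOW_MESSAGES.some((pattern) => pattern.test(error.message));
  // Ordinary invalid requests pass through without a classification,
  // so callers keep their existing behavior for them.
  return byCode || byMessage ? { ...error, classification: "context-overflow" } : error;
}
```

Because the classification is optional, a caller that ignores it sees the same error it saw before.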

Stage 2: Recover before publishing a failed assistant

The runner defers only the first context-overflow event, and only when it arrives before any durable assistant content. It then asks compaction to bypass the local estimate while preserving every real safety check.

sequenceDiagram
  participant Runner
  participant Provider
  participant Compactor
  participant History
  Runner->>Provider: original request
  Provider-->>Runner: context overflow
  Runner->>Compactor: compact(trigger = overflow)
  Compactor->>History: publish completed checkpoint
  Runner->>History: reload active context
  Runner->>Provider: rebuilt request
  Provider-->>Runner: answer

Overflow-triggered compaction bypasses:

  • The local pressure estimate, because the provider has already disproved it.
  • compaction.auto: false, because this path recovers an otherwise terminal provider failure.

It still requires:

  • A valid model context limit
  • Summarizable history
  • A summary request that fits the model window
  • Non-empty textual summary output

If compaction cannot complete, the original overflow is published normally and the identical request is not retried.
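
The bypass and requirement lists above can be read as a single gate. The following is a minimal sketch under assumed names (`CompactionRequest`, `decideCompaction`, `isUsableSummary` are all hypothetical), and the automatic-trigger threshold of "estimate at or above the limit" is an assumption, not something this description states.

```typescript
// Hypothetical names throughout; only the rules come from the lists above.
type Trigger = "auto" | "overflow";

interface CompactionRequest {
  trigger: Trigger;
  autoEnabled: boolean; // stands in for the compaction.auto setting
  estimatedTokens: number; // local pressure estimate for the pending request
  contextLimit?: number; // model context limit, if one is known
  summarizableMessages: number;
  summaryRequestTokens: number;
}

type Decision = { compact: true } | { compact: false; reason: string };

function decideCompaction(req: CompactionRequest): Decision {
  const limit = req.contextLimit;
  // Safety checks that apply to every trigger, including overflow.
  if (limit === undefined || !Number.isFinite(limit) || limit <= 0) {
    return { compact: false, reason: "no valid context limit" };
  }
  if (req.summarizableMessages <= 0) {
    return { compact: false, reason: "no summarizable history" };
  }
  if (req.summaryRequestTokens > limit) {
    return { compact: false, reason: "summary request does not fit" };
  }
  // The provider already rejected the request, so skip the estimate and the auto flag.
  if (req.trigger === "overflow") return { compact: true };
  if (!req.autoEnabled) return { compact: false, reason: "auto compaction disabled" };
  // Assumed threshold for the automatic path.
  if (req.estimatedTokens < limit) return { compact: false, reason: "estimate fits" };
  return { compact: true };
}

// Applied after the summary is generated: empty or whitespace-only output is rejected.
function isUsableSummary(text: string): boolean {
  return text.trim().length > 0;
}
```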

Stage 3: Bound the retry

The existing safe-boundary retry reloads projected history and reconstructs the request after compaction. An internal retry marker consumes the overflow budget while unrelated epoch or model-selection retries preserve it.

sequenceDiagram
  participant Attempt1
  participant Checkpoint
  participant Attempt2
  Attempt1->>Checkpoint: complete overflow compaction
  Checkpoint-->>Attempt2: rebuild from durable context
  alt second attempt succeeds
    Attempt2-->>Attempt1: completed answer
  else second attempt overflows
    Attempt2-->>Attempt1: durable terminal failure
  end

The second overflow is never compacted again. This prevents loops caused by undocumented provider limits, incorrect catalog limits, or a checkpoint that still cannot fit.

Invariant: successful forced compaction consumes the only overflow retry.
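
The retry bound can be sketched as a loop with a budget of one. This is an illustration only: `runTurn`, `TurnDeps`, and `Outcome` are hypothetical names, the sketch is synchronous for brevity where a real runner would presumably be asynchronous, and it omits the unrelated epoch and model-selection retries that leave the budget untouched.

```typescript
// Hypothetical types; the real runner's interfaces are not shown in this description.
type Outcome = { kind: "answer"; text: string } | { kind: "overflow" };

interface TurnDeps {
  send: (request: string) => Outcome; // dispatch to the provider
  compact: () => boolean; // true only when a checkpoint was completed
  rebuild: () => string; // reload active context and rebuild the request
}

function runTurn(initial: string, deps: TurnDeps): Outcome {
  let request = initial;
  let overflowRetries = 1; // one logical turn gets at most one overflow recovery
  while (true) {
    const outcome = deps.send(request);
    if (outcome.kind !== "overflow") return outcome;
    // Budget spent, or compaction could not complete: surface the overflow
    // and never resend the identical request.
    if (overflowRetries === 0 || !deps.compact()) return outcome;
    overflowRetries -= 1; // successful forced compaction consumes the only retry
    request = deps.rebuild();
  }
}
```

With this shape, a provider that keeps overflowing is called at most twice per turn and compaction runs at most once.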

Stage 4: Preserve partial-output safety

Provider step-start is now lazy: it does not create a durable assistant until text, reasoning, tool activity, or terminal completion requires one.

sequenceDiagram
  participant Provider
  participant Publisher
  participant History
  Provider->>Publisher: step-start
  Note over Publisher: no durable assistant yet
  alt overflow before output
    Provider-->>Publisher: context overflow
    Publisher-->>History: no failed assistant persisted
  else text or tool activity begins
    Publisher->>History: start durable assistant
    Provider-->>Publisher: context overflow
    Publisher->>History: persist terminal failure, do not retry
  end

Once assistant output or tool activity is durable, recovery is disabled to avoid duplicate content or side effects.
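
A minimal sketch of the lazy start, assuming a much smaller event set than the real stream: the `LazyPublisher` class, its event and result types, and the string entries written to `history` are all hypothetical stand-ins for whatever the publisher actually persists.

```typescript
// Hypothetical event and result shapes, reduced to what the diagram above needs.
type ProviderEvent =
  | { type: "step-start" }
  | { type: "text"; delta: string }
  | { type: "context-overflow" };

type HandleResult = "ok" | "recoverable-overflow" | "terminal-failure";

class LazyPublisher {
  readonly history: string[] = [];
  private started = false;

  // Create the durable assistant only when some output requires one.
  private ensureAssistant(): void {
    if (!this.started) {
      this.started = true;
      this.history.push("assistant:start");
    }
  }

  handle(event: ProviderEvent): HandleResult {
    switch (event.type) {
      case "step-start":
        return "ok"; // nothing durable yet
      case "text":
        this.ensureAssistant();
        this.history.push(`text:${event.delta}`);
        return "ok";
      case "context-overflow":
        // Before any output, nothing was persisted, so recovery is still safe.
        if (!this.started) return "recoverable-overflow";
        this.history.push("assistant:failed");
        return "terminal-failure";
    }
  }
}
```

The design point is that the retry decision depends on durable state alone: an overflow is recoverable only while `history` is still empty for the turn.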

Verification

  • 118 focused LLM protocol/executor tests
  • 88 focused Core Session runner tests
  • LLM typecheck
  • Core typecheck
  • OpenCode typecheck
  • JavaScript SDK typecheck
  • Workspace pre-push typecheck: 22/22 packages
  • File-scoped oxlint: no errors; existing warnings remain
  • git diff --check

Scope

Included:

  • Typed context-overflow classification
  • Shared provider overflow classifier
  • One forced compaction and rebuilt retry
  • Lazy assistant start for invisible pre-output recovery
  • Terminal failure on second overflow or partial output

Not included:

  • General provider retry policy
  • Rate-limit recovery
  • Manual compaction API
  • Metrics or observability
  • Alternate summary-model fallback
