fix: include reasoning tokens in overflow detection #28585
The `isOverflow()` function computes context usage by summing `tokens.input + tokens.output + tokens.cache.read + tokens.cache.write`, but `tokens.output` already has reasoning tokens subtracted by `getUsage()` (session.ts). The `tokens.reasoning` field is never added back, causing a systematic under-count proportional to the model's reasoning output.
For reasoning-heavy models (GLM-5, Claude with thinking, o1/o3, Gemini), this under-counting is severe enough to prevent auto-compaction from ever triggering, leading to `context_length_exceeded` errors.
Fixes anomalyco#15556
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
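The under-count described above can be illustrated with a minimal sketch. The `TokenUsage` shape, both function names, the sample numbers, and the `limit` value are assumptions made for illustration; only the field names (`input`, `output`, `reasoning`, `cache.read`, `cache.write`, `total`) come from the PR text, and the real code in overflow.ts may be structured differently.

```typescript
// Hypothetical shape: field names mirror those quoted in the PR,
// but the actual type definitions in the repository may differ.
interface TokenUsage {
  input: number
  output: number // reasoning tokens already subtracted, per getUsage()
  reasoning: number
  cache: { read: number; write: number }
  total?: number
}

// Fallback sum before the fix: reasoning tokens are left out.
function countWithoutReasoning(t: TokenUsage): number {
  return t.total || t.input + t.output + t.cache.read + t.cache.write
}

// Fallback sum after the fix: reasoning tokens are added back.
// A non-zero total still short-circuits the fallback via `||`.
function countWithReasoning(t: TokenUsage): number {
  return t.total || t.input + t.output + t.reasoning + t.cache.read + t.cache.write
}

// Made-up numbers for a reasoning-heavy turn.
const usage: TokenUsage = {
  input: 90_000,
  output: 2_000,
  reasoning: 30_000,
  cache: { read: 0, write: 0 },
}
const limit = 100_000 // assumed usable context budget, illustration only

console.log(countWithoutReasoning(usage) >= limit) // false: overflow missed
console.log(countWithReasoning(usage) >= limit) // true: overflow detected
```

With `reasoning: 0` the two functions return the same value, which matches the PR's claim that non-reasoning usage is unaffected.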
The following comment was made by an LLM, it may be inaccurate: Based on my search, I found one potentially related PR that may be addressing similar issues: Related PR:
However, these PRs appear to address different aspects of the overflow/compaction system. The current PR (#28585) specifically fixes the missing
Conclusion: No duplicate PRs found. PR #28585 appears to be addressing a specific, previously unhandled bug in the overflow detection logic that the related PRs don't directly cover.
Thanks for updating your PR! It now meets our contributing guidelines. 👍
Reasoning models (GLM-5, Claude thinking, o1/o3, Gemini) can get stuck
in repetitive loops when asked to generate compaction summaries. The
thinking/reasoning output interferes with the structured summary
template, causing the model to produce repeated content instead of a
proper summary.
Disable thinking via agent options override:
- `thinking: { type: "disabled" }` for zhipuai/openai-compatible providers
- `thinkingConfig: { includeThoughts: false }` for Google providers
These options are deep-merged after the base provider options in
request.ts, so they properly override the thinking config set in
ProviderTransform.options() without affecting normal agent requests.
Related anomalyco#15556, anomalyco#16903
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
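The override behaviour described in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions: `deepMerge`, `isPlainObject`, the `Options` type, and the sample base options are hypothetical stand-ins, not the actual helper or values used in request.ts or `ProviderTransform.options()`.

```typescript
type Options = { [key: string]: unknown }

function isPlainObject(value: unknown): value is Options {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Hypothetical helper: recursively merges `override` onto `base`.
// On conflicting keys the override wins; neither input is mutated.
function deepMerge(base: Options, override: Options): Options {
  const result: Options = { ...base }
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key]
    result[key] =
      isPlainObject(existing) && isPlainObject(value) ? deepMerge(existing, value) : value
  }
  return result
}

// Assumed base provider options with thinking enabled (illustrative values).
const baseOptions: Options = {
  thinking: { type: "enabled", budgetTokens: 8000 },
  temperature: 0.2,
}

// Agent-level override applied only to the compaction request.
const compactionOverride: Options = { thinking: { type: "disabled" } }

const merged = deepMerge(baseOptions, compactionOverride)
console.log(JSON.stringify(merged))
```

Because the merge builds a new object, `baseOptions` keeps `type: "enabled"`, which is the property the commit message relies on when it says normal agent requests are unaffected. Note that a deep merge keeps sibling keys such as the assumed `budgetTokens`; whether a provider accepts that alongside a disabled thinking mode is not something this sketch settles.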
Issue for this PR
Closes #15556
Type of change
What does this PR do?
`isOverflow()` in `overflow.ts` computes context usage as `tokens.input + tokens.output + tokens.cache.read + tokens.cache.write`. However, `tokens.output` already has reasoning tokens subtracted by `getUsage()` (session.ts:414, `output: outputTokens - reasoningTokens`). The reasoning tokens are stored separately in `tokens.reasoning` but never added back into the overflow count.
This causes a systematic under-count proportional to reasoning output. For reasoning-heavy models (GLM-5, Claude thinking, o1/o3, Gemini), the under-count is severe enough that auto-compaction never triggers, leading to `context_length_exceeded` errors.
The fix adds `input.tokens.reasoning` to the sum on line 30. When `tokens.total` is available (non-zero), the expression short-circuits via `||`, so this only affects the fallback path.
How did you verify your code works?
- `getUsage()` subtracts reasoning from output at session.ts:414-415
- `isOverflow()` was missing reasoning in the count at overflow.ts:30
- `tokens.reasoning = 0`, so the addition is a no-op: no behavior change
- `pnpm build` in packages/opencode)
Checklist