fix(opencode): normalize MessageV2 info/part shapes to stop GC death spiral #29029
Open
Nowaker wants to merge 1 commit into
…spiral
session.prompt.runLoop calls MessageV2.filterCompactedEffect every iteration,
which paginates the session history and rebuilds a WithParts object per row
via hydrate() -> info() / part(). The pre-existing spread form
`{...row.data, id, sessionID}` produced output objects whose hidden-class
(JSC Structure) varied with whichever optional keys row.data contained.
With ~6 optionals on assistant info, ~4 on user info, and 2-5 per part
type, real workloads accumulated 50k+ distinct Structures in the
JSStructureHeap. JSC inline caches went megamorphic; every `.role`,
`.cost`, `.parts` access fell to slow-path lookups; GC ran constantly
to walk the shape-diverse graph. JSC::Heap::runEndPhase consumed 35-78%
of CPU on the main event-loop thread. opencode-serve pegged one core,
API stopped responding, SSE delivery stalled.
Fix: build each info / part output with an EXPLICIT, FIXED key list
per discriminator (role for Info, type for Part). Every Assistant
info has the same Structure regardless of optional-field presence;
likewise for User; likewise for each of the 12 part types. Total
Structure count drops from 50k+ to ~15.
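
A minimal sketch of the fixed-key-list idea, not the PR's actual code. The
`AssistantInfo` and `AssistantRowData` types and the `normalizeAssistantInfo`
name are hypothetical and trimmed down to a few of the fields mentioned above.
Filling absent optionals with `undefined` is an assumption made here for
illustration.

```typescript
// Hypothetical, trimmed-down output type; the real MessageV2 info has more fields.
interface AssistantInfo {
  id: string
  sessionID: string
  role: "assistant"
  cost: number
  error: unknown
  summary: unknown
  finish: string | undefined
}

// Hypothetical stored row data: optional keys may be missing entirely.
interface AssistantRowData {
  role: "assistant"
  cost: number
  error?: unknown
  summary?: unknown
  finish?: string
}

// Every key is written, in the same order, on every call. Absent optionals
// become `undefined` values instead of missing keys, so the key list is fixed.
function normalizeAssistantInfo(
  data: AssistantRowData,
  id: string,
  sessionID: string,
): AssistantInfo {
  return {
    id,
    sessionID,
    role: "assistant",
    cost: data.cost,
    error: data.error,
    summary: data.summary,
    finish: data.finish,
  }
}

const bare = normalizeAssistantInfo({ role: "assistant", cost: 0 }, "msg_1", "ses_1")
const full = normalizeAssistantInfo(
  { role: "assistant", cost: 0.02, finish: "stop", error: null, summary: true },
  "msg_2",
  "ses_1",
)

// Both objects list the same keys in the same order.
console.log(Object.keys(bare).join(",") === Object.keys(full).join(","))
```

One consequence of this style is that a key holding `undefined` is still
present for `in` checks, while `JSON.stringify` omits it, so callers that test
for key presence would need a look.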
A row-level cache layered on top saves the object allocation itself
when (id, time_updated) is unchanged. LRU-capped at 200K info and
400K part entries.
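
A rough sketch of what a row-level cache keyed on `(id, time_updated)` could
look like. The `RowCache` class and `Row` shape are hypothetical, and using a
`Map` in insertion order as the LRU is an assumption; the PR does not show how
its cache is built.

```typescript
// Hypothetical row shape; only `id` and `time_updated` matter to the cache.
interface Row {
  id: string
  time_updated: number
}

class RowCache<R extends Row, V> {
  private readonly capacity: number
  private readonly entries = new Map<string, { timeUpdated: number; value: V }>()

  constructor(capacity: number) {
    this.capacity = capacity
  }

  // Return the cached value when the row is unchanged, otherwise rebuild it.
  get(row: R, build: (row: R) => V): V {
    const hit = this.entries.get(row.id)
    if (hit !== undefined && hit.timeUpdated === row.time_updated) {
      // Re-insert so this entry becomes the most recently used.
      this.entries.delete(row.id)
      this.entries.set(row.id, hit)
      return hit.value
    }
    const value = build(row)
    this.entries.delete(row.id) // drop a stale entry, if any
    this.entries.set(row.id, { timeUpdated: row.time_updated, value })
    if (this.entries.size > this.capacity) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
    return value
  }

  get size(): number {
    return this.entries.size
  }
}

let builds = 0
const cache = new RowCache<Row, { id: string }>(2)
const buildInfo = (row: Row) => {
  builds++
  return { id: row.id }
}

const first = cache.get({ id: "a", time_updated: 1 }, buildInfo)
const again = cache.get({ id: "a", time_updated: 1 }, buildInfo) // unchanged row: cached object
const newer = cache.get({ id: "a", time_updated: 2 }, buildInfo) // updated row: rebuilt
```

Cached values are shared between callers, so a cache like this is only safe
for objects that nobody mutates after construction.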
Verified on production opencode-serve: same workload that previously
sustained 99-115% CPU now averages ~13% (one transient spike to ~47%
during filterCompactedEffect, otherwise 3-10%). Anon memory dropped
4 GB -> 796 MB.
New perf test (test/session/message-v2-shape-stability-perf.test.ts)
generates 5,000 messages with deterministic varied optional-key
subsets and shows:
- info() Structure count: spread=47 -> normalized=2
- part() Structure count: spread=16 -> normalized=5
- Property access on 5k objects, 200 iters x 7 properties:
  spread=31.7ms normalized=17.0ms (1.9x speedup in tight loop;
  cumulative GC win in production is far larger)
138 / 138 existing message-v2 + pagination + compaction tests pass.
Issue for this PR
Closes #20285
Type of change
What does this PR do?
Problem
`session.prompt.runLoop` calls `MessageV2.filterCompactedEffect(sessionID)` on every iteration of its `while (true)` loop (prompt.ts:1252), which paginates the session's entire message history and rebuilds a `WithParts` per row via `hydrate()` → `info()` / `part()`.

The pre-existing helpers used `{...row.data, id, sessionID}` spread. The output object's hidden-class (JSC `Structure`) depended on which optional keys `row.data` happened to contain. With ~6 optionals on Assistant info (`error`, `summary`, `structured`, `variant`, `finish`, `time.completed`), ~4 on User, and 2-5 per part type, real workloads accumulated 50,000+ distinct Structures in the `JSStructureHeap`. JSC inline caches went megamorphic; every `.role`, `.cost`, `.parts` access fell to the slow path; GC then walked the shape-diverse graph and `JSC::Heap::runEndPhase` consumed 35-78% of CPU on the main event-loop thread.

User-visible symptom: opencode-serve pegs one core, the HTTP API stops responding, SSE event delivery stalls, and sessions still notionally progress but each runLoop iteration takes seconds. Reproduces after ~5-8 prompts on sessions with thousands of messages.
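
To make the shape explosion concrete, here is a small sketch that enumerates every subset of five top-level optional keys and counts how many distinct key layouts the spread form produces. The ordered key list is only a proxy for a JSC `Structure`, and the helper names (`keySignature`, `makeRowData`, `spreadInfo`, `countSpreadSignatures`) are hypothetical.

```typescript
// Top-level optional keys from the Assistant-info list above.
const OPTIONAL_KEYS = ["error", "summary", "structured", "variant", "finish"] as const

// Proxy for an engine hidden class: the ordered list of own keys.
function keySignature(obj: object): string {
  return Object.keys(obj).join(",")
}

// Build stored row data that contains only the optionals selected by `mask`.
function makeRowData(mask: number): Record<string, unknown> {
  const data: Record<string, unknown> = { role: "assistant", cost: 0 }
  OPTIONAL_KEYS.forEach((key, bit) => {
    if (mask & (1 << bit)) data[key] = null
  })
  return data
}

// The spread form: the output's keys follow whatever the row happens to hold.
function spreadInfo(data: Record<string, unknown>, id: string, sessionID: string) {
  return { ...data, id, sessionID }
}

// Walk all 2^5 subsets of optionals and count distinct key layouts.
function countSpreadSignatures(): number {
  const seen = new Set<string>()
  const subsets = 1 << OPTIONAL_KEYS.length
  for (let mask = 0; mask < subsets; mask++) {
    seen.add(keySignature(spreadInfo(makeRowData(mask), "msg_" + mask, "ses_1")))
  }
  return seen.size
}

console.log(countSpreadSignatures()) // 32: one layout per subset of optionals
```

Five optionals on a single object kind already give 32 layouts. Differing insertion order, nested objects, and the other message and part types multiply that further.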
Diagnosis
`perf record` with `bun-profile`-build symbols against the affected process:

Heap snapshot at peak load: 51,935 distinct `object shape:Structure` entries (one to two orders of magnitude beyond what JSC's IC system handles monomorphically), 1.2M plain `Object` allocations, 590k `Function` closures, 380k `JSLexicalEnvironment` scopes. A 4-way bisect of concurrent sessions (each one isolated, others archived) showed every session sustains 69-90% GC overhead alone; the pathology is algorithmic, not session-specific.

Fix
Build each `info` / `part` output with an explicit, fixed key list per discriminator (`role` for Info, `type` for Part). Every Assistant info has the same Structure regardless of which optional fields are present; likewise for User; likewise for each of the 12 part types. Total Structure count drops from 50k+ to ~15.

A row-level cache layered on top saves the object allocation itself when `(id, time_updated)` is unchanged, since most messages and parts are immutable once committed. LRU-capped at 200K info / 400K part entries.

Correctness invariants:

`hydrate()` and the `parts` array inside each `WithParts` are still allocated fresh per call. `reminders.ts` mutates the latest user's parts array to inject synthetic parts, and that mutation must not contaminate the cache. Verified by reading every caller.

How did you verify your code works?
Verified on production opencode-serve (one heavily-used long-running tailscale-bound instance with multiple large sessions in flight):

`/project` timing out at 30 s

New perf test `packages/opencode/test/session/message-v2-shape-stability-perf.test.ts`. Self-contained, no DB, deterministic LCG-seeded synthetic data with realistic optional-key variety. 5,000 messages, 200 property-access iterations × 7 properties:

The wall-clock speedup is modest in a tight loop because JSC's IC degradation accumulates with heap size, and the synthetic test only generates 5k objects in a fresh process. The production benefit compounds with heap growth: same workload that previously pegged the event-loop thread at 99-115% now stays under 15%.
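
The sketch below shows one way a deterministic, LCG-seeded comparison like the one described above could be set up. It is not the contents of the actual test file: the generator constants, the seed, the key list, and every name here (`makeLcg`, `syntheticData`, `viaSpread`, `viaFixedKeys`, `countSignatures`) are assumptions, and ordered key lists stand in for real Structure counts.

```typescript
type RowData = Record<string, unknown>
type Builder = (data: RowData, id: string, sessionID: string) => object

// Hypothetical optional keys; the real test's key list is not shown in the PR.
const OPTIONALS = ["error", "summary", "structured", "variant", "finish"] as const

// 32-bit linear congruential generator (Numerical Recipes constants).
// Same seed, same sequence, so every run sees identical synthetic rows.
function makeLcg(seed: number): () => number {
  let state = seed >>> 0
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0
    return state
  }
}

// Pick a pseudo-random subset of optionals for one synthetic row.
function syntheticData(next: () => number): RowData {
  const data: RowData = { role: "assistant", cost: 0 }
  const mask = next() >>> 16 // use high bits; an LCG's low bits have short periods
  OPTIONALS.forEach((key, bit) => {
    if (mask & (1 << bit)) data[key] = null
  })
  return data
}

// Spread form: key layout depends on the row.
const viaSpread: Builder = (data, id, sessionID) => ({ ...data, id, sessionID })

// Fixed-key form: every key is always written, in the same order.
const viaFixedKeys: Builder = (data, id, sessionID) => ({
  id,
  sessionID,
  role: data.role,
  cost: data.cost,
  error: data.error,
  summary: data.summary,
  structured: data.structured,
  variant: data.variant,
  finish: data.finish,
})

// Count distinct ordered key lists, a rough stand-in for distinct Structures.
function countSignatures(build: Builder, rows: number, seed: number): number {
  const next = makeLcg(seed)
  const seen = new Set<string>()
  for (let i = 0; i < rows; i++) {
    seen.add(Object.keys(build(syntheticData(next), "msg_" + i, "ses_1")).join(","))
  }
  return seen.size
}

const spreadCount = countSignatures(viaSpread, 5000, 1)
const fixedCount = countSignatures(viaFixedKeys, 5000, 1)
console.log({ spreadCount, fixedCount })
```

In this sketch the fixed-key builder always yields a single layout because only one role is generated. The PR's own counts come from its data, which spans several roles and part types.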
Existing tests: 138 / 138 pass (`message-v2.test.ts`, `messages-pagination.test.ts`, `compaction.test.ts`).

Screenshots / recordings
N/A.
Checklist