fix(opencode): normalize MessageV2 info/part shapes to stop GC death spiral#29029

Open
Nowaker wants to merge 1 commit into anomalyco:dev from Nowaker:submit/perf-memoize-info-part

@Nowaker Nowaker commented May 23, 2026

Issue for this PR

Closes #20285

Type of change

  • Refactor / code improvement

What does this PR do?

Problem

session.prompt.runLoop calls MessageV2.filterCompactedEffect(sessionID) on every iteration of its while (true) loop (prompt.ts:1252), which paginates the session's entire message history and rebuilds a WithParts per row via hydrate() -> info() / part().

The pre-existing helpers used {...row.data, id, sessionID} spread. The output object's hidden-class (JSC Structure) depended on which optional keys row.data happened to contain. With ~6 optionals on Assistant info (error, summary, structured, variant, finish, time.completed), ~4 on User, and 2-5 per part type, real workloads accumulated 50,000+ distinct Structures in the JSStructureHeap. JSC inline caches went megamorphic; every .role, .cost, .parts access fell to the slow path; GC then walked the shape-diverse graph and JSC::Heap::runEndPhase consumed 35-78% of CPU on the main event-loop thread.
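
To illustrate how the spread form multiplies object layouts, here is a minimal sketch. JavaScript cannot observe JSC Structures directly, so the ordered list of own keys is used as a proxy; the `Row` type, `shapeSignature`, `spreadInfo`, and `rowsWithEverySubset` are hypothetical names for this illustration, not code from the PR.

```typescript
// Hypothetical row: `data` holds only the optional keys that happened to be set.
type Row = { id: string; sessionID: string; data: Record<string, unknown> }

// Proxy for a hidden class: the ordered list of own keys.
function shapeSignature(obj: object): string {
  return Object.keys(obj).join(",")
}

// The spread form described above: the output's keys depend on row.data.
function spreadInfo(row: Row): Record<string, unknown> {
  return { ...row.data, id: row.id, sessionID: row.sessionID }
}

// Build one row for every subset of the given optional keys.
function rowsWithEverySubset(optionals: string[]): Row[] {
  const rows: Row[] = []
  for (let mask = 0; mask < 1 << optionals.length; mask++) {
    const data: Record<string, unknown> = { role: "assistant" }
    optionals.forEach((key, bit) => {
      if (mask & (1 << bit)) data[key] = true
    })
    rows.push({ id: `msg_${mask}`, sessionID: "ses_1", data })
  }
  return rows
}

const optionals = ["error", "summary", "structured", "variant", "finish"]
const spreadShapes = new Set(
  rowsWithEverySubset(optionals).map((row) => shapeSignature(spreadInfo(row))),
)
console.log(spreadShapes.size) // 32, i.e. 2 ** optionals.length key layouts
```

Five optional keys already give 32 layouts for a single role in this proxy; nested optionals and differing insertion orders would push the number higher.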

User-visible symptom: opencode-serve pegs one core, the HTTP API stops responding, SSE event delivery stalls, sessions still notionally progress but each runLoop iteration takes seconds. Reproduces after ~5-8 prompts on sessions with thousands of messages.

Diagnosis

perf record with bun-profile-build symbols against the affected process:

35.75 – 71.84%  JSC::Heap::runEndPhase(JSC::GCConductor)        <- GC end-phase
 2.92 –  4.39%  JSC::MarkedBlock::Handle::isLive                <- GC mark
 0.50 –  0.76%  JSC::SlotVisitor::drain                         <- GC mark
 0.32 –  0.56%  JSC::MarkedBlock::Handle::sweep                 <- GC sweep
 ─────────────  ─────
 ~40 - 78% total CPU spent in GC

Heap snapshot at peak load: 51,935 distinct object shape:Structure entries (one to two orders of magnitude beyond what JSC's IC system handles monomorphically), 1.2M plain Object allocations, 590k Function closures, 380k JSLexicalEnvironment scopes. A 4-way bisect of concurrent sessions (each one isolated, others archived) showed every session sustains 69-90% GC overhead alone — the pathology is algorithmic, not session-specific.

Fix

Build each info / part output with an explicit, fixed key list per discriminator (role for Info, type for Part). Every Assistant info has the same Structure regardless of which optional fields are present; likewise for User; likewise for each of the 12 part types. Total Structure count drops from 50k+ to ~15.
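
A fixed key list can be sketched as follows. This assumes absent optionals are written as `undefined`, which is one way to keep the layout constant; the field subset and the `normalizeAssistant` name are hypothetical, and the PR's actual builders may differ.

```typescript
// Hypothetical, simplified subset of the Assistant info fields named above.
type AssistantData = {
  cost: number
  error?: string
  summary?: boolean
  finish?: string
}

type AssistantInfo = {
  id: string
  sessionID: string
  role: "assistant"
  cost: number
  error: string | undefined
  summary: boolean | undefined
  finish: string | undefined
}

// Every key is written on every call, always in the same order,
// so the key layout no longer depends on which optionals are present.
function normalizeAssistant(id: string, sessionID: string, data: AssistantData): AssistantInfo {
  return {
    id,
    sessionID,
    role: "assistant",
    cost: data.cost,
    error: data.error,
    summary: data.summary,
    finish: data.finish,
  }
}

const sparse = normalizeAssistant("msg_1", "ses_1", { cost: 0.01 })
const full = normalizeAssistant("msg_2", "ses_1", { cost: 0.02, error: "aborted", finish: "stop" })
console.log(Object.keys(sparse).join(",") === Object.keys(full).join(",")) // true
```

One trade-off of this approach: absent fields become present with value `undefined`, so checks such as `"error" in info` behave differently, while `info.error === undefined` and `JSON.stringify` output are unaffected.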

A row-level cache layered on top saves the object allocation itself when (id, time_updated) is unchanged — most messages and parts are immutable once committed. LRU-capped at 200K info / 400K part entries.
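
A row-level cache of this kind might look like the sketch below. The `RowCache` class and its `get(id, timeUpdated, build)` signature are hypothetical; it relies only on the fact that a JavaScript `Map` iterates in insertion order.

```typescript
// Hypothetical LRU keyed by row id; an entry is valid only while time_updated matches.
class RowCache<V> {
  private entries = new Map<string, { timeUpdated: number; value: V }>()
  private capacity: number

  constructor(capacity: number) {
    this.capacity = capacity
  }

  get(id: string, timeUpdated: number, build: () => V): V {
    const hit = this.entries.get(id)
    if (hit && hit.timeUpdated === timeUpdated) {
      // Re-insert so this entry becomes the most recently used.
      this.entries.delete(id)
      this.entries.set(id, hit)
      return hit.value
    }
    const value = build()
    this.entries.delete(id) // drop a stale version of the row, if any
    this.entries.set(id, { timeUpdated, value })
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
    return value
  }

  get size(): number {
    return this.entries.size
  }
}

const infoCache = new RowCache<{ id: string }>(200_000)
const cachedOnce = infoCache.get("msg_1", 1000, () => ({ id: "msg_1" }))
const cachedTwice = infoCache.get("msg_1", 1000, () => ({ id: "msg_1" }))
console.log(cachedOnce === cachedTwice) // true: same (id, time_updated) reuses the object
```

Keying on the id and validating against `time_updated` means an edited row replaces its old entry instead of leaving a stale one behind.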

Correctness invariants:

  • The output array in hydrate() and the parts array inside each WithParts are still allocated fresh per call. reminders.ts mutates the latest user's parts array to inject synthetic parts, and that mutation must not contaminate the cache. Verified by reading every caller.
  • Cached values themselves (info object, individual Part objects) are not mutated by any caller — confirmed via grep.
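
The first invariant can be sketched as follows: cached elements are shared, but every array handed to a caller is new. The types, the pre-populated maps, and this simplified `hydrate` are hypothetical stand-ins, not the PR's implementation.

```typescript
// Hypothetical minimal types; the real WithParts carries many more fields.
type Info = { id: string; role: string }
type Part = { id: string; type: string }
type WithParts = { info: Info; parts: Part[] }

// Stand-ins for the caches: values in here are shared and must not be mutated.
const infoById = new Map<string, Info>([["msg_1", { id: "msg_1", role: "user" }]])
const partsById = new Map<string, Part[]>([["msg_1", [{ id: "prt_1", type: "text" }]]])

function hydrate(ids: string[]): WithParts[] {
  const out: WithParts[] = [] // fresh outer array per call
  for (const id of ids) {
    const info = infoById.get(id)
    if (!info) continue
    // Fresh parts array per call; the Part objects inside are shared.
    out.push({ info, parts: [...(partsById.get(id) ?? [])] })
  }
  return out
}

const firstCall = hydrate(["msg_1"])
firstCall[0].parts.push({ id: "prt_synthetic", type: "text" }) // caller-side injection
const secondCall = hydrate(["msg_1"])
console.log(secondCall[0].parts.length) // 1: the injected part never reached the cache
console.log(firstCall[0].info === secondCall[0].info) // true: info objects are shared
```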

How did you verify your code works?

Verified on production opencode-serve (one heavily-used long-running tailscale-bound instance with multiple large sessions in flight):

| Metric | Before | After |
| --- | --- | --- |
| Main thread CPU | 99-115% sustained | average 13.2% (3-10% with one transient ~47% spike during filterCompactedEffect) |
| Cgroup anon memory | ~4 GB | ~800 MB |
| API responsiveness | /project timing out at 30 s | sub-second |

New perf test packages/opencode/test/session/message-v2-shape-stability-perf.test.ts. Self-contained, no DB, deterministic LCG-seeded synthetic data with realistic optional-key variety. 5,000 messages, 200 property-access iterations × 7 properties:

[perf] info() unique Structures:  spread=47  normalized=2
[perf] part() unique Structures:  spread=16  normalized=5
[perf] property-access time:      spread=31.7ms  normalized=17.0ms  (1.9x in tight loop;
                                                                     cumulative GC win in
                                                                     production is far larger)

The wall-clock speedup is modest in a tight loop because JSC's IC degradation accumulates with heap size — the synthetic test only generates 5k objects in a fresh process. The production benefit compounds with heap growth: same workload that previously pegged the event-loop thread at 99-115% now stays under 15%.
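
For readers unfamiliar with LCG-seeded test data, the sketch below shows the general idea. The test's actual generator is not shown here, so the constants (the widely used Numerical Recipes pair), the `lcg` and `syntheticData` names, and the 50% presence probability are all assumptions.

```typescript
// Linear congruential generator: state = (a * state + c) mod 2^32.
// Constants are the Numerical Recipes pair, assumed for illustration.
function lcg(seed: number): () => number {
  let state = seed >>> 0
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0
    return state / 0x100000000 // uniform in [0, 1)
  }
}

// Hypothetical synthetic row: each optional key is present with probability 1/2.
function syntheticData(rand: () => number, optionalKeys: string[]): Record<string, unknown> {
  const data: Record<string, unknown> = { role: "assistant", cost: rand() }
  for (const key of optionalKeys) {
    if (rand() < 0.5) data[key] = true
  }
  return data
}

const optionalKeys = ["error", "summary", "structured", "variant", "finish"]
const rand = lcg(42)
const rows = Array.from({ length: 5000 }, () => syntheticData(rand, optionalKeys))
const layouts = new Set(rows.map((row) => Object.keys(row).join(",")))
console.log(layouts.size) // at most 2 ** optionalKeys.length
```

Because the seed fixes the whole sequence, every run produces the same rows, which keeps Structure counts and timings comparable between runs.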

Existing tests: 138 / 138 pass (message-v2.test.ts, messages-pagination.test.ts, compaction.test.ts).

Screenshots / recordings

N/A.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

@Nowaker Nowaker requested a review from adamdotdevin as a code owner May 23, 2026 21:09
@github-actions
Contributor

Hey! Your PR title perf(session): memoize MessageV2.info/part to reduce GC pressure in runLoop doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@github-actions github-actions Bot added needs:title needs:compliance labels May 23, 2026
@Nowaker Nowaker changed the title perf(session): memoize MessageV2.info/part to reduce GC pressure in runLoop fix(opencode): memoize MessageV2.info/part to stop GC death spiral in runLoop May 23, 2026
@github-actions github-actions Bot removed needs:title needs:compliance labels May 23, 2026
@github-actions
Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@Nowaker Nowaker force-pushed the submit/perf-memoize-info-part branch from a8f1ad3 to 46e9666 Compare May 23, 2026 22:27
@Nowaker Nowaker changed the title fix(opencode): memoize MessageV2.info/part to stop GC death spiral in runLoop fix(opencode): normalize MessageV2 info/part shapes to stop GC death spiral May 23, 2026
…spiral

session.prompt.runLoop calls MessageV2.filterCompactedEffect every iteration,
which paginates the session history and rebuilds a WithParts object per row
via hydrate() -> info() / part(). The pre-existing spread form
`{...row.data, id, sessionID}` produced output objects whose hidden-class
(JSC Structure) varied with whichever optional keys row.data contained.

With ~6 optionals on assistant info, ~4 on user info, and 2-5 per part
type, real workloads accumulated 50k+ distinct Structures in the
JSStructureHeap. JSC inline caches went megamorphic; every `.role`,
`.cost`, `.parts` access fell to slow-path lookups; GC ran constantly
to walk the shape-diverse graph. JSC::Heap::runEndPhase consumed 35-78%
of CPU on the main event-loop thread. opencode-serve pegged one core,
API stopped responding, SSE delivery stalled.

Fix: build each info / part output with an EXPLICIT, FIXED key list
per discriminator (role for Info, type for Part). Every Assistant
info has the same Structure regardless of optional-field presence;
likewise for User; likewise for each of the 12 part types. Total
Structure count drops from 50k+ to ~15.

A row-level cache layered on top saves the object allocation itself
when (id, time_updated) is unchanged. LRU-capped at 200K info and
400K part entries.

Verified on production opencode-serve: same workload that previously
sustained 99-115% CPU now averages ~13% (one transient spike to ~47%
during filterCompactedEffect, otherwise 3-10%). Anon memory dropped
4 GB -> 796 MB.

New perf test (test/session/message-v2-shape-stability-perf.test.ts)
generates 5,000 messages with deterministic varied optional-key
subsets and shows:

  - info() Structure count: spread=47 -> normalized=2
  - part() Structure count: spread=16 -> normalized=5
  - Property access on 5k objects, 200 iters x 7 properties:
      spread=31.7ms  normalized=17.0ms  (1.9x speedup in tight loop;
      cumulative GC win in production is far larger)

138 / 138 existing message-v2 + pagination + compaction tests pass.
@Nowaker Nowaker force-pushed the submit/perf-memoize-info-part branch from 46e9666 to 7ee8268 Compare May 23, 2026 22:32

Development

Successfully merging this pull request may close these issues.

[FEATURE]: Session loop performance — message cache, tool cache, doom loop optimization, summary debounce, parallel plugin events
