fix(sdk,core): head-start handover correctness and continuation boot latency #3907
…s is registered The turn-0 handover splice only ran on the default accumulation path, so agents registering hydrateMessages lost the warm route's step-1 response: pure-text turns fired onTurnComplete with no assistant message, tool-call turns re-ran step 1 from scratch under a fresh messageId, and the head-start user message never reached the hydrate hook. The first-turn history now reaches hydrateMessages as incoming messages, and the splice runs after both accumulation branches, deduped by the handover messageId.
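The dedupe step can be pictured with a small sketch. This is not the SDK's code: `Msg` and `spliceHandover` are hypothetical names, and the choice to leave the list unchanged when the handover message is already present is an assumption made for illustration.

```typescript
// Hypothetical message shape, much simpler than the SDK's real types.
type Msg = { id: string; role: "user" | "assistant"; text: string };

// Splice the handover partial into the accumulated messages.
// Assumption: if a message with the same id is already present (for example,
// a hydrate hook returned it), the list is left unchanged rather than appended to.
function spliceHandover(accumulated: Msg[], handover: Msg | undefined): Msg[] {
  if (!handover) return accumulated;
  const alreadyPresent = accumulated.some((m) => m.id === handover.id);
  return alreadyPresent ? accumulated : [...accumulated, handover];
}
```

Because the check is keyed on the message id, the same function can run after either accumulation branch without producing a duplicate assistant message.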
synthesizeHandoverUIMessage only mapped text and tool-call parts, so an extended-thinking model's step-1 reasoning streamed to the browser but never reached the durable session history: onTurnComplete, chat.history, and reloads all lost it. Reasoning parts now map through with provider metadata so Anthropic thinking signatures survive the UIMessage round trip on hydrate replays.
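A minimal sketch of the mapping idea, assuming simplified part shapes. `StepPart`, `UIPart`, and `toUIParts` are hypothetical and are not the types or signature of `synthesizeHandoverUIMessage`; the only point shown is that reasoning parts pass through with their provider metadata intact.

```typescript
// Hypothetical, simplified shapes for illustration only.
type ProviderMetadata = Record<string, Record<string, unknown>>;

type StepPart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string; providerMetadata?: ProviderMetadata }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown };

type UIPart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string; providerMetadata?: ProviderMetadata }
  | { type: "tool"; toolCallId: string; toolName: string; input: unknown };

// Map every step part to a UI part. Reasoning keeps its provider metadata,
// which is where a provider-issued signature would travel.
function toUIParts(parts: StepPart[]): UIPart[] {
  return parts.map((part): UIPart => {
    switch (part.type) {
      case "text":
        return { type: "text", text: part.text };
      case "reasoning":
        return part.providerMetadata
          ? { type: "reasoning", text: part.text, providerMetadata: part.providerMetadata }
          : { type: "reasoning", text: part.text };
      case "tool-call":
        return {
          type: "tool",
          toolCallId: part.toolCallId,
          toolName: part.toolName,
          input: part.input,
        };
    }
  });
}
```

Dropping the `reasoning` case from a mapper like this is enough to reproduce the described symptom: the part streams live but is missing from anything built from the mapped message.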
…on cursor scan The .in resume cursor was found by draining an SSE subscription that only closes after its 5-second inactivity window, and the scan ran twice per continuation boot (once for the replay cursor, once for the subscribe cursor), stalling every continuation for around 10 seconds before the first turn. The scan is now a non-blocking records read of the latest turn-complete header and runs at most once per boot. The snapshot and replay reads run concurrently, and chat snapshots carry the cursor so subsequent boots skip the scan entirely.
Walkthrough: This PR optimizes chat agent boot performance by replacing SSE long-poll cursor discovery with non-blocking record reads and concurrent snapshot/replay operations.
…nsiently A scan that threw was treated the same as one that found no cursor, so the resume-cursor block skipped its retry and the live subscription could replay from the start. Only a successful lookup (including a genuine no-cursor-yet answer) is shared now; a throw leaves the retry available.
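The distinction between "the scan failed" and "the scan found nothing" can be sketched as follows. `ScanResult` and `createSharedCursorLookup` are hypothetical names, not the SDK's API; the sketch only shows that a thrown scan stores nothing, so a later caller can still retry.

```typescript
type ScanResult = { found: true; cursor: string } | { found: false };

// Wrap a scan so that only a successful lookup is shared between callers.
// A genuine "no cursor yet" answer counts as success and is shared;
// a throw propagates to the caller and leaves nothing stored.
function createSharedCursorLookup(scan: () => Promise<ScanResult>) {
  let shared: ScanResult | undefined;
  return async function lookup(): Promise<ScanResult> {
    if (shared) return shared;
    const result = await scan(); // if this throws, `shared` stays undefined
    shared = result;
    return result;
  };
}
```

A caller that catches the error from the first `lookup()` can call it again later and trigger a fresh scan, whereas a stored `{ found: false }` is returned without scanning again.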
🦋 Changeset detected. Latest commit: 9042df6. The changes in this PR will be included in the next version bump. This PR includes changesets to release 25 packages:
@trigger.dev/build
trigger.dev
@trigger.dev/core
@trigger.dev/plugins
@trigger.dev/python
@trigger.dev/react-hooks
@trigger.dev/redis-worker
@trigger.dev/rsc
@trigger.dev/schema-to-json
@trigger.dev/sdk
Summary
Three related fixes for `chat.headStart` and continuation boots, found while investigating customer reports.

1. `chat.headStart` now works with `hydrateMessages`. The turn-0 handover splice only ran on the default accumulation path, so agents registering `hydrateMessages` silently lost the warm route's step-1 response: pure-text turns fired `onTurnComplete` with no assistant message (and an empty durable write), tool-call turns re-ran step 1 from scratch under a fresh `messageId`, and the head-start user message never reached the hydrate hook at all. The first-turn history now reaches `hydrateMessages` as `incomingMessages`, and the splice runs after both accumulation branches, deduplicated by the handover `messageId`.
2. Reasoning parts survive the handover. The synthesized partial only mapped text and tool-call parts, so an extended-thinking model's step-1 reasoning streamed to the browser but never reached durable history. Reasoning parts now map through with provider metadata, so Anthropic thinking signatures survive a UIMessage round trip on hydrate replays.
3. Continuation boots no longer stall for ~10 seconds. The `.in` resume cursor was found by draining an SSE subscription that only closes after its 5-second inactivity window, and the scan ran twice per boot. It is now a non-blocking records read of the latest turn-complete header and runs at most once per boot. The boot reads run concurrently, and chat snapshots carry the cursor so subsequent boots skip the scan entirely. Measured locally on a cancel-then-continue repro: pre-turn continuation latency dropped from ~11s to ~0.5s.

Every fix was verified red-green: new unit tests reproduced each failure before the fix, and end-to-end smoke tests against a live local stack covered both handover legs, reasoning persistence with extended thinking (including a follow-up turn that round-trips the persisted signed reasoning back to the provider), and the boot timing comparison.
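The boot ordering described in the third fix can be sketched roughly as below. The interface, field, and function names (`BootReads`, `resumeCursor`, `bootContinuation`) are hypothetical stand-ins, not the SDK's internals; the sketch assumes the snapshot exposes the cursor as an optional field.

```typescript
// Hypothetical shapes for illustration only.
type Snapshot = { messages: string[]; resumeCursor?: string };

interface BootReads {
  readSnapshot(): Promise<Snapshot | undefined>;
  readReplay(): Promise<string[]>;
  scanLatestTurnComplete(): Promise<string | undefined>;
}

async function bootContinuation(reads: BootReads) {
  // Snapshot and replay reads do not depend on each other, so start both at once.
  const [snapshot, replay] = await Promise.all([
    reads.readSnapshot(),
    reads.readReplay(),
  ]);
  // Prefer the cursor carried by the snapshot; scan at most once otherwise.
  const cursor =
    snapshot?.resumeCursor ?? (await reads.scanLatestTurnComplete());
  return { snapshot, replay, cursor };
}
```

With this shape, a snapshot written before the cursor field existed simply has `resumeCursor` undefined, and the boot falls through to a single scan.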
Rollout
SDK-only; no server change required. A new SDK against a server that does not serialize record headers degrades to the existing no-cursor fallback. Old SDKs ignore the new snapshot field, and new SDKs fall back to the records scan on snapshots written before it existed.