Skip to content

fix(coderd/x/chatd): wake after auto-promoting queued message#24714

Merged
ibetitsmike merged 3 commits into
mainfrom
mike/chatd-auto-promote-wake
Apr 26, 2026
Merged

fix(coderd/x/chatd): wake after auto-promoting queued message#24714
ibetitsmike merged 3 commits into
mainfrom
mike/chatd-auto-promote-wake

Conversation

@ibetitsmike
Copy link
Copy Markdown
Collaborator

tryAutoPromoteQueuedMessage in processChat's deferred cleanup could set a chat back to pending without waking the processor. The processor only noticed on the next 10ms poll, so under load tests like TestAutoPromoteQueuedMessageFallsBackForInvalidQueuedModelConfigID could time out waiting for the second streaming request (#1500).

Call p.signalWake() after the promoted-message publishes when promotedMessage != nil, matching the pattern used by CreateChat, SendMessage, EditMessage, PromoteQueued, and InterruptChat. Make the regression helper testAutoPromoteQueuedMessageFallback deterministic by setting PendingChatAcquireInterval = time.Hour and synchronizing on a secondRunStarted channel instead of polling requestCount, so the test fails without the wake instead of relying on the 10ms ticker.

Closes coder/internal#1500

Mux is acting on Mike's behalf.

@ibetitsmike
Copy link
Copy Markdown
Collaborator Author

/coder-agents-review

Copy link
Copy Markdown
Contributor

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean, well-scoped fix. The production change completes the pending → signalWake contract that all five other entry points already follow. The test redesign is sharp: disabling the ticker and synchronizing on a channel proves the wake is both necessary and sufficient. Netero verified empirically that reverting the fix causes the test to fail with context timeout.

1 P3, 3 Nits. All minor.

"I tried to build a case against this PR and couldn't. The problem is correctly diagnosed, the solution is proportional, the fix is at the right causal level, and the test proves the fix is both necessary and sufficient." — Pariston


coderd/x/chatd/chatd.go:190

Nit [DEREM-1] This comment enumerates four callers (SendMessage, EditMessage, CreateChat, PromoteQueued), but signalWake is now called from six sites: the four listed, SubmitToolResults (line 2178, pre-existing omission), and the auto-promote path added by this PR (line 5213).

Since this PR adds a new caller, the comment is directly in scope to update. Consider replacing the enumeration with intent:

// wakeCh is signaled whenever a chat transitions to
// pending so the run loop calls processOnce immediately
// instead of waiting for the next ticker.

This avoids the comment going stale each time a new call site appears.

(Hisoka, Mafuuu, Ryosuke, Meruem, Chopper, Knov)

🤖

🤖 This review was automatically generated with Coder Agents.

Comment thread coderd/x/chatd/chatd.go Outdated
Comment thread coderd/x/chatd/chatd_test.go
Comment thread coderd/x/chatd/chatd_test.go Outdated
@ibetitsmike
Copy link
Copy Markdown
Collaborator Author

/coder-agents-review

Copy link
Copy Markdown
Contributor

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All four R1 findings addressed in b2d9c63. The DEREM-1 fix is notably good: rather than updating the caller count from 4 to 6, the comment was generalized to describe the invariant ("signaled whenever a chat transitions to pending"), eliminating the staleness class entirely.

12-reviewer panel, zero new findings. The production fix completes the pending → signalWake contract. The test proves the fix is necessary by construction. Clean.

"I tried to build a case against this and couldn't. The problem is correctly understood: a missing wake signal in the one path that transitions to pending without waking." — Pariston

🤖 This review was automatically generated with Coder Agents.

@ibetitsmike
Copy link
Copy Markdown
Collaborator Author

/coder-agents-review

Copy link
Copy Markdown
Contributor

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R3 moved signalWake after shouldPublishFinishedChatState, aligning with PromoteQueued's publish-then-wake ordering. Applied the deterministic test pattern to the two remaining sibling tests (TestAutoPromoteQueuedMessagesPreservesPerTurnModelOrder and TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease), replacing clock.Advance and require.Eventually with channel-gated synchronization. All three auto-promote chain tests now prove signalWake works by construction.

14-reviewer panel, zero code findings across 3 rounds. The production fix is 6 lines, pattern-consistent, and at the right causal level. Tests are deterministic.

"I traced all six signalWake sites. Each is preceded by a transition to ChatStatusPending. No signalWake call exists without a preceding pending transition." — Razor

🤖 This review was automatically generated with Coder Agents.

@ibetitsmike ibetitsmike force-pushed the mike/chatd-auto-promote-wake branch from 775a5e7 to 6eca6cc Compare April 25, 2026 20:18
@ibetitsmike
Copy link
Copy Markdown
Collaborator Author

/coder-agents-review

Copy link
Copy Markdown
Contributor

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase-only round (diff unchanged from R3, new base 056203f). 5-reviewer panel confirmed all prior fixes intact, zero new findings. CI pending.

4 rounds, 0 open findings. The production fix completes the pending-to-signalWake contract. Tests are deterministic by construction.

🤖 This review was automatically generated with Coder Agents.

@ibetitsmike ibetitsmike marked this pull request as ready for review April 26, 2026 01:29
@ibetitsmike ibetitsmike enabled auto-merge (squash) April 26, 2026 01:29
@ibetitsmike ibetitsmike merged commit ed33e28 into main Apr 26, 2026
50 of 52 checks passed
@ibetitsmike ibetitsmike deleted the mike/chatd-auto-promote-wake branch April 26, 2026 09:08
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 26, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

flake: TestAutoPromoteQueuedMessageFallsBackForInvalidQueuedModelConfigID

2 participants