fix: unblock manual chat title generation for unowned chats#26963
fix: unblock manual chat title generation for unowned chats#26963ibetitsmike wants to merge 7 commits into
Conversation
The manual title lock rejected any pending chat and any running chat without a worker with a 409. Legacy pending rows are never acquired by workers, so those chats could never regenerate a title, and running chats are unowned in the normal window between message submission and worker acquisition (indefinitely when runners are down), producing spurious 409s from the Rename dialog's Generate button. - Allow all running chats to generate titles without writing the synthetic lock marker; title persistence already resolves concurrent writes by last write wins. - Give legacy pending chats the synthetic marker path instead of a permanent 409; workers never acquire pending rows, so the marker cannot interfere with chat processing. - Return an acquired flag from acquireManualTitleLock and release the marker only when this request wrote it, so a marker-less request can no longer clear a concurrent request's fresh marker. - Add ErrNoDefaultChatModelConfig and map it to a 400 with a clear message in both title endpoints instead of a generic 500.
The Generate button's error alert showed only the generic response message, hiding the actionable cause carried in the API error detail (for example a misconfigured chat model). Render the detail under the message for real API errors, skipping the generic developer-console fallback for plain errors.
The Rename dialog propose flow (onProposeTitle / requestProposeTitle) is the only live title-generation UX. The regenerate pipeline (outlet wiring, spinner state, and story/test args) had no live caller. The regenerateChatTitle API and query layer are kept for API consumers.
Running chats never take the manual title lock marker, so the release transaction no longer runs for them. Update the internal regenerate and rename tests to assert the unlock transaction is absent.
|
/coder-agents-review |
|
Chat: Review posted | View chat Review historydeep-review v0.9.0 | Round 3 | Last posted: Round 3, 6 findings (1 P2, 2 P3, 3 Nit), APPROVE. Review Finding inventoryFindings
Contested and acknowledged(none) Round logRound 1Panel (18 reviewers). 1 P2, 2 P3, 3 Nit new. 1 Note dropped, 1 P4 dropped (scope creep). Reviewed against cda7a9d..2784fc1. Round 2Churn guard: PROCEED. 6/6 open findings addressed in 7bbc66b. Panel (8 reviewers). 2 Nit new (both dropped, de minimis). All R1 fixes verified. Reviewed against cda7a9d..7bbc66b. Round 3Churn guard: PROCEED. 0 open findings. PR significantly restructured: manual title lock removed entirely, replaced by existing CAS guard in About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
There was a problem hiding this comment.
The lock simplification is well-designed. The old acquireManualTitleLock mixed status-gating with marker-gating in a compound conditional that was hard to audit. The new version separates concerns into three sequential checks, and the acquired return flag eliminates the class of race where a marker-less request could clear a concurrent request's fresh marker. The strict-mock test that proves the absence of unlock transactions for running chats is exactly the right testing shape.
87.6% test density. TestAcquireManualTitleLock covers all decision branches and TestRegenerateChatTitle_RunningChatSkipsMarkerAndRelease uses strict mocks for negative verification. The sentinel error propagation chain is clean through xerrors.Errorf wrapping and errors.Is matching.
1 P2, 2 P3, 3 Nit.
"Proving a negative is harder than proving a positive." (Mafu-san)
coderd/x/chatd/chatd.go:2362
Nit [CRF-7] Doc comment runs ~90 characters on one line. Surrounding doc comments in this file wrap near 80. (Leorio)
🤖
🤖 This review was automatically generated with Coder Agents.
- Use Server.clock instead of time.Now in acquireManualTitleLock and trim its doc comment to the earned invariants (CRF-1, CRF-3). - Read the API error detail directly in RenameChatDialog so responses without a detail field do not surface the generic developer-console hint, with a story covering that case (CRF-2). - Trim change-narrating test comments (CRF-4, CRF-5) and wrap the ProposeChatTitle doc comment (CRF-7).
|
/coder-agents-review |
There was a problem hiding this comment.
All R1 findings addressed in 7bbc66b. Verified each fix against the code:
CRF-1 (P3): time.Now() replaced with p.clock.Now(). All five test Server structs updated with clock: quartz.NewReal(). Fixed the class, not just the instance.
CRF-2 (P2): getErrorDetail replaced with direct error.response.data.detail || undefined read, bypassing the fallback entirely. Three stories now cover all branches (with detail, without detail, plain error). The RenameChatGenerateApiErrorWithoutDetailHidesHint story asserts the developer-console hint is absent.
CRF-3 (P3): Doc comment rewritten to four lines. No status enumeration, no mechanism narration.
CRF-4, CRF-5, CRF-7 (Nit): Test comments trimmed and doc comment wrapped per suggestions.
"Generalized from correction. CRF-1 reported
time.Now()in one location. The fix addedclock: quartz.NewReal()to all five testServerstructs that were missing it." (Mafu-san)
🤖 This review was automatically generated with Coder Agents.
Multiple title generation requests racing each other is not a problem: recordManualTitleUsage re-reads the chat under GetChatByIDForUpdate and only persists the generated title when it is unchanged since the request snapshot, so concurrent regenerates and renames settle by last write wins. The request-level lock only suppressed duplicate model calls, the dialog already disables the button in flight, and the synthetic worker_id marker was the source of the spurious 409s this PR set out to fix. Remove acquire/releaseManualTitleLock, the marker and staleness handling, the 409 mappings, and the now-unused UpdateChatStatusPreserveUpdatedAt query. Add an internal test pinning the compare-and-swap guard as the remaining race protection.
|
/coder-agents-review |
There was a problem hiding this comment.
PR significantly restructured since R2: the manual title lock was removed entirely, replaced by the existing compare-and-swap guard in recordManualTitleUsage. This is a stronger fix than the R1/R2 simplification. The lock tried to prevent duplicate model calls, but the CAS guard already prevents stale titles from clobbering concurrent changes, the UI disables the button in flight, and usage limits bound spend. Removing 150+ lines of lock code, a SQL query, and test complexity for a mechanism whose benefit was already covered elsewhere is the right call.
Verified the CAS guard: GetChatByIDForUpdate acquires a row lock, then lockedChat.Title == chat.Title compares against the request snapshot. TestRegenerateChatTitle_SkipsPersistWhenTitleChangedConcurrently pins this with a strict mock that has no UpdateChatByID expectation, so any persist attempt panics. Concurrent regenerates settle correctly (first writes, second skips). Regenerate concurrent with rename settles correctly (rename wins unconditionally).
All R1 findings remain addressed. The code that had CRF-1 (time.Now()) and CRF-3 (doc comment) was deleted entirely. CRF-2 (getErrorDetail hint) fix is retained with three Storybook stories covering the error rendering matrix. Zero dangling references to any deleted symbol.
89.7% test density. No new findings above Nit across 9 reviewers.
"The lock tried to prevent duplicate model calls, but the CAS guard already prevents stale titles from clobbering concurrent changes, the UI disables the button in flight, and usage limits bound spend." (Pariston)
🤖 This review was automatically generated with Coder Agents.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eb77802c99
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| chatdCtx := dbauthz.AsChatd(ctx) | ||
| if err := p.acquireManualTitleLock(ctx, chat.ID); err != nil { | ||
| return database.Chat{}, err | ||
| } | ||
| defer p.releaseManualTitleLock(chatdCtx, chat.ID) | ||
|
|
||
| updatedChat, err := p.regenerateChatTitleWithStore( |
There was a problem hiding this comment.
Suppress stale title events after skipped regenerations
Now that RegenerateChatTitle no longer serializes with RenameChatTitle, a rename can land while the model call is in flight. recordManualTitleUsage then correctly skips the DB update when it sees the title changed, but regenerateChatTitleWithStore still publishes a title_change whenever the returned title differs from the original snapshot; if the user renames again before that publish is delivered, the stale skipped-regeneration payload can overwrite the fresher title in the UI because mergeWatchedChatSummary applies title events regardless of updated_at. Please only publish when this regenerate actually wrote the generated title.
Useful? React with 👍 / 👎.
Problem
The Generate button in the chat Rename dialog (POST
/api/experimental/chats/{chat}/title/propose) could fail in ways unrelated to actual concurrent title generation:pendingchat and anyrunningchat without a worker. Legacypendingrows are never acquired by workers, so those chats 409'd forever. Running chats are unowned in the normal window between message submission and worker acquisition (indefinitely when runners are down), producing spurious 409s.Fix
Backend (
coderd/x/chatd,coderd,coderd/database):recordManualTitleUsage, which re-reads the chat underGetChatByIDForUpdateand only persists the generated title when it is unchanged since the request snapshot, so concurrent regenerates and renames settle by last write wins. The lock only suppressed duplicate model calls (the dialog already disables the button in flight, and usage limits bound spend), and its syntheticworker_idmarker was the source of the spurious 409s. The 409 responses, the marker and staleness handling, and the now-unusedUpdateChatStatusPreserveUpdatedAtquery are gone.ErrNoDefaultChatModelConfigsentinel mapped to 400 "No default chat model config is configured." in both title endpoints, matching the POST/chatsprecedent.Frontend (
site):error.response.data.detaildirectly so detail-less API errors do not show the generic developer-console hint.onRegenerateTitleoutlet wiring and theregeneratingTitleChatIdsspinner pipeline). The Rename dialog propose flow is the only live title-generation UX; the endpoint, codersdk methods, and theapi.ts/queries/chats.tslayer are kept for API consumers.Tests
PendingWithoutWorkerexpects 200 for both endpoints,NoDefaultModelConfig(400) subtests, a stopped-workspace propose regression, and anUnauthenticatedpropose subtest.