Skip to content

fix: unblock manual chat title generation for unowned chats#26963

Draft
ibetitsmike wants to merge 7 commits into
mainfrom
mike/fix-chat-title-generation
Draft

fix: unblock manual chat title generation for unowned chats#26963
ibetitsmike wants to merge 7 commits into
mainfrom
mike/fix-chat-title-generation

Conversation

@ibetitsmike

@ibetitsmike ibetitsmike commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

Problem

The Generate button in the chat Rename dialog (POST /api/experimental/chats/{chat}/title/propose) could fail in ways unrelated to actual concurrent title generation:

  • The manual title lock returned 409 for any pending chat and any running chat without a worker. Legacy pending rows are never acquired by workers, so those chats 409'd forever. Running chats are unowned in the normal window between message submission and worker acquisition (indefinitely when runners are down), producing spurious 409s.
  • A missing default chat model config surfaced as a generic 500, and the dialog hid the actionable cause carried in the error detail.

Fix

Backend (coderd/x/chatd, coderd, coderd/database):

  • Remove the manual title lock entirely. Races between title writers are already resolved by recordManualTitleUsage, which re-reads the chat under GetChatByIDForUpdate and only persists the generated title when it is unchanged since the request snapshot, so concurrent regenerates and renames settle by last write wins. The lock only suppressed duplicate model calls (the dialog already disables the button in flight, and usage limits bound spend), and its synthetic worker_id marker was the source of the spurious 409s. The 409 responses, the marker and staleness handling, and the now-unused UpdateChatStatusPreserveUpdatedAt query are gone.
  • New ErrNoDefaultChatModelConfig sentinel mapped to 400 "No default chat model config is configured." in both title endpoints, matching the POST /chats precedent.

Frontend (site):

  • The Rename dialog error alert now renders the API error detail under the message, reading error.response.data.detail directly so detail-less API errors do not show the generic developer-console hint.
  • Removed the dead regenerate-title UI plumbing (onRegenerateTitle outlet wiring and the regeneratingTitleChatIds spinner pipeline). The Rename dialog propose flow is the only live title-generation UX; the endpoint, codersdk methods, and the api.ts/queries/chats.ts layer are kept for API consumers.

Tests

  • chatd internal: a strict-mock test pinning the compare-and-swap guard (a concurrently changed title must not be clobbered by a generated one), plus the existing persist-and-broadcast coverage without lock transactions.
  • HTTP: PendingWithoutWorker expects 200 for both endpoints, NoDefaultModelConfig (400) subtests, a stopped-workspace propose regression, and an Unauthenticated propose subtest.
  • Storybook: stories asserting the API error detail renders in the dialog alert, and that detail-less API errors and plain errors do not leak the developer-console hint.

Authored by Mux on Mike's behalf.

The manual title lock rejected any pending chat and any running chat
without a worker with a 409. Legacy pending rows are never acquired by
workers, so those chats could never regenerate a title, and running
chats are unowned in the normal window between message submission and
worker acquisition (indefinitely when runners are down), producing
spurious 409s from the Rename dialog's Generate button.

- Allow all running chats to generate titles without writing the
  synthetic lock marker; title persistence already resolves concurrent
  writes by last write wins.
- Give legacy pending chats the synthetic marker path instead of a
  permanent 409; workers never acquire pending rows, so the marker
  cannot interfere with chat processing.
- Return an acquired flag from acquireManualTitleLock and release the
  marker only when this request wrote it, so a marker-less request can
  no longer clear a concurrent request's fresh marker.
- Add ErrNoDefaultChatModelConfig and map it to a 400 with a clear
  message in both title endpoints instead of a generic 500.
The Generate button's error alert showed only the generic response
message, hiding the actionable cause carried in the API error detail
(for example a misconfigured chat model). Render the detail under the
message for real API errors, skipping the generic developer-console
fallback for plain errors.
The Rename dialog propose flow (onProposeTitle / requestProposeTitle) is
the only live title-generation UX. The regenerate pipeline (outlet
wiring, spinner state, and story/test args) had no live caller. The
regenerateChatTitle API and query layer are kept for API consumers.
Running chats never take the manual title lock marker, so the release
transaction no longer runs for them. Update the internal regenerate and
rename tests to assert the unlock transaction is absent.
@ibetitsmike

Copy link
Copy Markdown
Collaborator Author

/coder-agents-review

@coder-agents-review

coder-agents-review Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Chat: Review posted | View chat
Requested: 2026-07-02 16:03 UTC by @ibetitsmike
Spend: $83.30 / $100.00

Review history
  • R1 (2026-07-02): 18 reviewers, 3 Nit, 1 P2, 2 P3, COMMENT. Review
  • R2 (2026-07-02): 8 reviewers, 3 Nit, 1 P2, 2 P3, APPROVE. Review
  • R3 (2026-07-02): 9 reviewers, 3 Nit, 1 P2, 2 P3, APPROVE. Review

deep-review v0.9.0 | Round 3 | cda7a9d..eb77802

Last posted: Round 3, 6 findings (1 P2, 2 P3, 3 Nit), APPROVE. Review

Finding inventory

Findings

# Sev Status Location Summary Round Reviewer Posted
CRF-1 P3 Author fixed (7bbc66b) chatd.go:2228 time.Now() bypasses Server.clock in reworked acquireManualTitleLock R1 Netero Yes
CRF-2 P2 Author fixed (7bbc66b) RenameChatDialog.tsx:204 getErrorDetail shows developer-console hint for API errors without detail field R1 Mafuuu P3, Nami P2 Yes
CRF-3 P3 Author fixed (7bbc66b) chatd.go:2214 Doc comment enumerates statuses incompletely (missing paused, completed) and carries mechanism narration R1 Gon P2, Mafuuu Nit Yes
CRF-4 Nit Author fixed (7bbc66b) chatd_internal_test.go:1169 Test doc comment restates function name and repeats marker invariant R1 Gon Yes
CRF-5 Nit Author fixed (7bbc66b) chatd_internal_test.go:908 Comment restates what mock expectations show R1 Gon Yes
CRF-6 Note Dropped by orchestrator (test names already enumerate table) chatd_internal_test.go:1315 Test doc comment enumerates decision table R1 Gon No
CRF-7 Nit Author fixed (7bbc66b) chatd.go:2362 Doc comment line length exceeds project convention R1 Leorio Yes
CRF-8 P4 Dropped by orchestrator (follows established file pattern, scope creep) exp_chats.go:3847 Duplicated error handler chain in regenerate and propose handlers R1 Robin No
CRF-9 Nit Dropped by orchestrator (de minimis, same class as CRF-6) chatd_internal_test.go:1305 Test doc comment enumerates decision table names already cover R2 Gon No
CRF-10 Nit Dropped by orchestrator (de minimis, test helper doc comment) exp_chats_test.go:11031 createTitleGenerationModelConfig doc has mechanism padding R2 Gon No
CRF-11 Nit Dropped by orchestrator (de minimis, both are quartz.NewReal()) chatd_internal_test.go:836 Dual clock instances in test server setup R3 Bisky No

Contested and acknowledged

(none)

Round log

Round 1

Panel (18 reviewers). 1 P2, 2 P3, 3 Nit new. 1 Note dropped, 1 P4 dropped (scope creep). Reviewed against cda7a9d..2784fc1.

Round 2

Churn guard: PROCEED. 6/6 open findings addressed in 7bbc66b. Panel (8 reviewers). 2 Nit new (both dropped, de minimis). All R1 fixes verified. Reviewed against cda7a9d..7bbc66b.

Round 3

Churn guard: PROCEED. 0 open findings. PR significantly restructured: manual title lock removed entirely, replaced by existing CAS guard in recordManualTitleUsage. Panel (9 reviewers). 1 Nit new (dropped, de minimis). Lock removal verified safe by Meruem, Takumi, and Pariston. Reviewed against cda7a9d..eb77802.

About deep-review

CRF = Coder Review Finding (P0-P4, Nit, Note)

Reviewer Focus
Bisky tests
Chopper ops/errors
Churn-guard change verification
Ging language modernization
Gon naming
Hisoka edge cases
Killua perf
Kite change integrity
Knov contracts
Knuckle SQL
Komugi flake/determinism
Kurapika security
Law decomposition
Leorio docs
Luffy product
Mafu-san process
Mafuuu contracts
Melody dispatch/pairing
Meruem structural
Nami frontend
Netero mechanical checks
Pariston premise testing
Pen-botter product gaps
Razor verification
Robin duplication
Ryosuke Go arch
Takumi concurrency
Zoro shape

🤖 Managed by Coder Agents.

@coder-agents-review coder-agents-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock simplification is well-designed. The old acquireManualTitleLock mixed status-gating with marker-gating in a compound conditional that was hard to audit. The new version separates concerns into three sequential checks, and the acquired return flag eliminates the class of race where a marker-less request could clear a concurrent request's fresh marker. The strict-mock test that proves the absence of unlock transactions for running chats is exactly the right testing shape.

87.6% test density. TestAcquireManualTitleLock covers all decision branches and TestRegenerateChatTitle_RunningChatSkipsMarkerAndRelease uses strict mocks for negative verification. The sentinel error propagation chain is clean through xerrors.Errorf wrapping and errors.Is matching.

1 P2, 2 P3, 3 Nit.

"Proving a negative is harder than proving a positive." (Mafu-san)


coderd/x/chatd/chatd.go:2362

Nit [CRF-7] Doc comment runs ~90 characters on one line. Surrounding doc comments in this file wrap near 80. (Leorio)

🤖

🤖 This review was automatically generated with Coder Agents.

Comment thread site/src/pages/AgentsPage/components/ChatsSidebar/dialogs/RenameChatDialog.tsx Outdated
Comment thread coderd/x/chatd/chatd.go Outdated
Comment thread coderd/x/chatd/chatd.go Outdated
Comment thread coderd/x/chatd/chatd_internal_test.go Outdated
Comment thread coderd/x/chatd/chatd_internal_test.go Outdated
- Use Server.clock instead of time.Now in acquireManualTitleLock and
  trim its doc comment to the earned invariants (CRF-1, CRF-3).
- Read the API error detail directly in RenameChatDialog so responses
  without a detail field do not surface the generic developer-console
  hint, with a story covering that case (CRF-2).
- Trim change-narrating test comments (CRF-4, CRF-5) and wrap the
  ProposeChatTitle doc comment (CRF-7).
@ibetitsmike

Copy link
Copy Markdown
Collaborator Author

/coder-agents-review

@coder-agents-review coder-agents-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All R1 findings addressed in 7bbc66b. Verified each fix against the code:

CRF-1 (P3): time.Now() replaced with p.clock.Now(). All five test Server structs updated with clock: quartz.NewReal(). Fixed the class, not just the instance.

CRF-2 (P2): getErrorDetail replaced with direct error.response.data.detail || undefined read, bypassing the fallback entirely. Three stories now cover all branches (with detail, without detail, plain error). The RenameChatGenerateApiErrorWithoutDetailHidesHint story asserts the developer-console hint is absent.

CRF-3 (P3): Doc comment rewritten to four lines. No status enumeration, no mechanism narration.

CRF-4, CRF-5, CRF-7 (Nit): Test comments trimmed and doc comment wrapped per suggestions.

"Generalized from correction. CRF-1 reported time.Now() in one location. The fix added clock: quartz.NewReal() to all five test Server structs that were missing it." (Mafu-san)

🤖 This review was automatically generated with Coder Agents.

Multiple title generation requests racing each other is not a problem:
recordManualTitleUsage re-reads the chat under GetChatByIDForUpdate and
only persists the generated title when it is unchanged since the
request snapshot, so concurrent regenerates and renames settle by last
write wins. The request-level lock only suppressed duplicate model
calls, the dialog already disables the button in flight, and the
synthetic worker_id marker was the source of the spurious 409s this PR
set out to fix.

Remove acquire/releaseManualTitleLock, the marker and staleness
handling, the 409 mappings, and the now-unused
UpdateChatStatusPreserveUpdatedAt query. Add an internal test pinning
the compare-and-swap guard as the remaining race protection.
@ibetitsmike

Copy link
Copy Markdown
Collaborator Author

/coder-agents-review

@coder-agents-review coder-agents-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR significantly restructured since R2: the manual title lock was removed entirely, replaced by the existing compare-and-swap guard in recordManualTitleUsage. This is a stronger fix than the R1/R2 simplification. The lock tried to prevent duplicate model calls, but the CAS guard already prevents stale titles from clobbering concurrent changes, the UI disables the button in flight, and usage limits bound spend. Removing 150+ lines of lock code, a SQL query, and test complexity for a mechanism whose benefit was already covered elsewhere is the right call.

Verified the CAS guard: GetChatByIDForUpdate acquires a row lock, then lockedChat.Title == chat.Title compares against the request snapshot. TestRegenerateChatTitle_SkipsPersistWhenTitleChangedConcurrently pins this with a strict mock that has no UpdateChatByID expectation, so any persist attempt panics. Concurrent regenerates settle correctly (first writes, second skips). Regenerate concurrent with rename settles correctly (rename wins unconditionally).

All R1 findings remain addressed. The code that had CRF-1 (time.Now()) and CRF-3 (doc comment) was deleted entirely. CRF-2 (getErrorDetail hint) fix is retained with three Storybook stories covering the error rendering matrix. Zero dangling references to any deleted symbol.

89.7% test density. No new findings above Nit across 9 reviewers.

"The lock tried to prevent duplicate model calls, but the CAS guard already prevents stale titles from clobbering concurrent changes, the UI disables the button in flight, and usage limits bound spend." (Pariston)

🤖 This review was automatically generated with Coder Agents.

@ibetitsmike

Copy link
Copy Markdown
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb77802c99

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread coderd/x/chatd/chatd.go
Comment on lines 2178 to 2179
chatdCtx := dbauthz.AsChatd(ctx)
if err := p.acquireManualTitleLock(ctx, chat.ID); err != nil {
return database.Chat{}, err
}
defer p.releaseManualTitleLock(chatdCtx, chat.ID)

updatedChat, err := p.regenerateChatTitleWithStore(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Suppress stale title events after skipped regenerations

Now that RegenerateChatTitle no longer serializes with RenameChatTitle, a rename can land while the model call is in flight. recordManualTitleUsage then correctly skips the DB update when it sees the title changed, but regenerateChatTitleWithStore still publishes a title_change whenever the returned title differs from the original snapshot; if the user renames again before that publish is delivered, the stale skipped-regeneration payload can overwrite the fresher title in the UI because mergeWatchedChatSummary applies title events regardless of updated_at. Please only publish when this regenerate actually wrote the generated title.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant