feat: remove chat chain mode#26980
Conversation
Docs preview📖 View docs preview for |
…repareMessages hook
|
/coder-agents-review |
|
Chat: Review posted | View chat Review history
deep-review v0.9.0 | Round 1 | Last posted: Round 1, 7 findings (2 P3, 2 Nit, 3 Note), COMMENT. Review Finding inventoryFindings
Contested and acknowledged(none) Round logRound 1Panel (15 reviewers). 0 P0, 0 P1, 0 P2, 2 P3, 2 Nit, 3 Note posted. 5 dropped (comment verbosity subsumed by pattern note). Reviewed against 121107f..e38f363. About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
There was a problem hiding this comment.
Clean, thorough removal of chain mode. The deletion scope is well-chosen: every PreviousResponseID, ChainBroken, chain_broken, ChainModeDisabled, ReloadMessages, DisableChainMode, and PrepareMessages reference is gone from production code. The three dead chatloop hooks were correctly identified as never-wired. Tests pass. The follow-ups (column drop migration, Store option product decision, fantasy fork simplification) are concrete and properly scoped.
Severity count: 2 P3, 2 Nit, 3 Note. No blockers.
Note for operators: the chain_broken label removal from coderd_chatd_stream_retries_total is a metric contract change. Any Prometheus queries or Grafana dashboards referencing that label will need updating at deploy time.
Note: the PR description says "Executed in three phases" but the branch has four commits; the 4th commit removes the dead PrepareMessages hook and inlines single-use test helpers, which the Changes section doesn't mention. Minor description inconsistency.
"I tried to build a case against this and could not." (Pariston)
🤖 This review was automatically generated with Coder Agents.
| OrganizationID: orgID, | ||
| OwnerID: userID, | ||
| APIKeyID: testAPIKeyID(t, db, userID), | ||
| Title: "chain mode test", |
There was a problem hiding this comment.
Nit [CRF-1] Stale hardcoded chat title "chain mode test" in a general-purpose helper.
createChatThroughServer is now used by ~20 tests across retry, compaction, provider-switch, and sanitization flows. The hardcoded title is a leftover from when this function lived in chatd_chainmode_test.go. Change to a generic title like "test chat" or accept a title parameter.
(Netero, Chopper)
🤖
| } | ||
|
|
||
| // TestProviderSwitchSanitizesAndRestoresPEToolHistory verifies the A→B→A | ||
| // provider-switch contract: |
There was a problem hiding this comment.
P3 [CRF-2] Doc comment first line truncated during adjacent test deletion.
The deletion of TestAdvisorChainMode_SnapshotKeepsFullHistory above this comment cut the opening line:
// TestProviderSwitchSanitizesAndRestoresPEToolHistory verifies the A→B→AThe comment now starts mid-sentence with // provider-switch contract:, violating Go doc convention (doc comments on test functions should lead with the function name). Prepend the missing line.
(Hisoka)
🤖
| // buildSystemPrompt applies system-level prompt injections in the | ||
| // canonical order. It is used by both the initial prompt assembly | ||
| // and the ReloadMessages callback to keep them in sync. | ||
| // canonical order for the initial prompt assembly. |
There was a problem hiding this comment.
P3 [CRF-3] Comment says "for the initial prompt assembly" but buildSystemPrompt runs on every turn.
buildSystemPrompt is called from generation_preparer.go:344 inside prepareGeneration, which executes every turn, not only the first. The old comment ("used by both the initial prompt assembly and the ReloadMessages callback") was inaccurate in a different way; this replacement trades one inaccuracy for another.
Suggested fix: drop the scope qualifier.
// buildSystemPrompt applies system-level prompt injections in
// canonical order.(Razor, Gon)
🤖
There was a problem hiding this comment.
| require.Len(t, generationRequests, 2) | ||
| recovered := generationRequests[1] | ||
| require.True(t, anthropicSystemHasEphemeralCacheControl(t, recovered)) | ||
| require.True(t, strings.Contains(string(recovered.System), `"cache_control":{"type":"ephemeral"}`)) |
There was a problem hiding this comment.
Nit [CRF-4] require.True(t, strings.Contains(...)) gives a useless failure message.
On failure, testify prints "Should be true" with no context about what the system prompt actually contained. Compare with line 6907 which correctly uses require.Contains. Use:
require.Contains(t, string(recovered.System), `"cache_control":{"type":"ephemeral"}`)(Leorio)
🤖
| Name: "stream_retries_total", | ||
| Help: "Total LLM stream retries.", | ||
| }, []string{"provider", "model", "kind", "chain_broken"}), | ||
| }, []string{"provider", "model", "kind"}), |
There was a problem hiding this comment.
Note [CRF-6] Removing the chain_broken label from coderd_chatd_stream_retries_total is a metric contract change. Any Prometheus queries, Grafana dashboards, or alerting rules referencing this label will break after deploy. Since chain mode is fully removed and the label would only ever read "false" going forward, the removal is correct.
(Chopper)
🤖
| total_cost_micros, | ||
| runtime_ms, | ||
| provider_response_id | ||
| runtime_ms |
There was a problem hiding this comment.
Note [CRF-5] The orphaned chat_messages.provider_response_id column has no drop-migration ticket. The PR description lists it as a follow-up, but no linked ticket exists. The column is nullable TEXT with no index, no constraints, no active writers (after this PR), and no active readers in chatd. Cost of carrying it is near zero (one bit per row in the null bitmap), but worth a ticket so it doesn't get forgotten.
(Knuckle)
🤖
There was a problem hiding this comment.
Filed CODAGT-748 for the column drop and linked it in the PR description. 🤖 Coder Agents on behalf of @johnstcn
| // cloneProviderOptions returns a copy of opts with pointer entries for known, | ||
| // in-place mutated provider option types replaced by a shallow struct copy. | ||
| // chatloop mutates the OpenAI Responses entry (PreviousResponseID) on | ||
| // chain-mode exit, so sharing the pointer with the parent run would let an | ||
| // advisor call corrupt the parent's chain state. Value fields such as | ||
| // Metadata and Include are still shared with the parent; nothing in this | ||
| // package mutates them, but callers that need true deep-copy semantics must | ||
| // handle those fields explicitly. | ||
| // resetProviderOptionsForNestedCall mutates the OpenAI Responses entry | ||
| // (Store), so sharing the pointer with the parent run would let an advisor | ||
| // call corrupt the parent's options. Value fields such as Metadata and | ||
| // Include are still shared with the parent; nothing in this package mutates | ||
| // them, but callers that need true deep-copy semantics must handle those | ||
| // fields explicitly. |
There was a problem hiding this comment.
Note [CRF-7] Several rewritten comments in this PR lead with mechanism narration the code already shows, burying the substantive content (the trap, the invariant, the why) deeper in the paragraph. This comment is a representative example: the first two lines restate what the function body does; the useful parts are the Store mutation risk and the shared-value-fields caveat.
The pattern also appears in resetProviderOptionsForNestedCall (runtime.go:93), runner.go:46, context_prompt.go:20, and chatloop.go:65. All comments are accurate (and the technical writing is noted as good), but leading with the category-bearing content would make them more concise per project style.
(Gon)
🤖
Removes OpenAI Responses "chain mode" from chatd. Chain mode sent
previous_response_idon follow-up turns instead of replaying full history. The provider periodically loses chain state (ZDR, key rotation, item GC), each new error message format broke the chain-broken classifier, and the recovery path added complexity while still dead-ending in some scenarios. Every turn now replays full input.Closes CODAGT-445.
Changes
chatopenai/responses.go(chain detection, activation, prompt filtering, response ID extraction) and its tests.ChainBrokenclassification inchaterrorand the chatloop retry bookkeeping that disabled chain mode mid-generation.chain_brokenlabel from thecoderd_chatd_stream_retries_totalmetric.chat_messages.provider_response_id; delete the deadClearChatMessageProviderResponseIDsByChatIDquery. The column stays on disk for rollback safety; dropping it is a follow-up migration.ReloadMessagesandDisableChainModechatloop hooks and stale chain-mode comments.Not touched: the aibridge
provider_response_idcolumns (unrelated feature), theStoreprovider option (still a valid OpenAI dashboard-logging knob), and thego.modfantasy fork (carries 13 other changes; only its replay-safety item related to chain mode, and that logic still protects full replay).Follow-ups
chat_messages.provider_response_idafter a release cycle: CODAGT-748.Storeoption now that it has no in-product consumer.coder/fantasynow that no consumer sendsprevious_response_id.Implementation plan
Executed in three phases on this branch:
test(coderd/x/chatd): drop chain mode test coverage): deleted chain-mode tests:chatopenai/responses_test.go, sixTestActiveServer_ChainBroken*tests,TestClassify_ChainBroken*,TestAdvisorChainMode_SnapshotKeepsFullHistory,TestOpenAIResponsesChainMode*.chatd_chainmode_test.gowas renamed tochatd_helpers_test.gobecause it housed shared helpers used by ~20 other test sites.feat(coderd/x/chatd): remove chain mode): deletedchatopenai/responses.goentirely; the chain activation block ingeneration_preparer.go;ChainModeDisabledplumbing and thechain_brokenlog field ingeneration.go;ChainBrokenandchainBrokenClassificationinchaterror; thechain_brokenmetric label;PersistedStep.ProviderResponseIDin chatloop; thePreviousResponseIDreset in chatadvisor (Store=false behavior kept);ProviderResponseIDfrom internal step structs, chatstate params, and theCreateChatMessagesINSERT; the deadClearChatMessageProviderResponseIDsByChatIDquery with its dbauthz wrapper; regenerated DB and metrics artifacts viamake gen.chore(coderd/x/chatd): drop dead ReloadMessages hook and stale chain comments): removed the orphanedReloadMessageschatloop hook and reworded three stale chain-mode comments.Deliberately skipped: shrinking the
RecordStreamRetrysignature (theClassifiedErrorparam encodes a real invariant), merging the chatadvisor clone/reset helpers (the defensive clone is cheap insurance), and any change togeneration_preparer.go'spromptRowsfetch (verified still multi-purpose).Verification:
make build,make gen(no diff on re-run),make fmt,make lint, andgo test ./coderd/x/chatd/... ./coderd/database/dbauthz/...all pass locally with postgres.🤖 Generated by Coder Agents on behalf of @johnstcn.