fix(coderd/x/chatd): drop foreign provider-executed tools on model switch#26555
fix(coderd/x/chatd): drop foreign provider-executed tools on model switch#26555johnstcn wants to merge 6 commits into
Conversation
|
/coder-agents-review |
|
Chat: Review posted | View chat Review historydeep-review v0.8.0 | Round 4 | Last posted: Round 4, 11 findings (6 P3, 1 P4, 3 Nit, 1 Note), APPROVE. Review Finding inventoryFindings
Contested and acknowledgedCRF-2 (P3, provider_switch_sanitize.go:85) - Consecutive same-role messages after row drop
CRF-6 (Nit, commit) - Commit subject over 72 chars
CRF-7 (Nit, commit) - Commit body empty
CRF-10 (Note, provider_switch_sanitize.go:135) - Log missing foreign origin providers
CRF-12 (P3, provider_switch_sanitize.go:136) - Disabled config false-positive stripping
Round logRound 1Panel. 3 P3, 1 P4, 3 Nit, 1 Note posted (8 inline). 1 Note, 1 Note dropped. Reviewed against d5ec26b..ea1d6c9. Round 2BLOCKED. All 8 open findings silent. Reviewed against d5ec26b..80a6a93. Round 3Panel. CRF-1,3,4,8 fixed (verified). CRF-2,6,7 contested and closed by panel. CRF-10 acknowledged. 3 P3 new. 3 Nit dropped. Reviewed against d5ec26b..e339f11. Round 4Panel. CRF-11,13 fixed (verified). CRF-12 contested and closed by panel (7/7 accept). No new findings. All findings resolved. Reviewed against d5ec26b..c46d4ad. About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
This comment has been minimized.
This comment has been minimized.
…ry on provider switch
…ry on provider switch
|
/coder-agents-review |
…ry on provider switch
|
/coder-agents-review |
There was a problem hiding this comment.
Clean, well-scoped fix. The two-layer sanitization design (row-level before conversion, message-level after), provenance via ModelConfigID -> AIProvider.Type, and the fail-closed/fail-open asymmetry are all defensible. Test suite is honest: 7 subtests with real marshal/parse round-tripping, no mocking, and each proves a distinct behavior.
3 P3, 1 P4, 3 Nit, 1 Note.
"I tried to build a case against this and could not." (Pariston)
🤖 This review was automatically generated with Coder Agents.
There was a problem hiding this comment.
Round 2 blocked. 8 open findings from round 1 have no author response and no code fix.
Three fixup commits landed (import formatting, spelling correction, test file rename to _internal_test.go, helper simplification) but none address review findings.
Unaddressed findings:
- CRF-1 (P3) Silent return when target resolution fails; no log
- CRF-2 (P3) Dropped PE-only rows can produce consecutive same-role messages; implicit coupling to downstream sanitizer
- CRF-3 (P3) Wrapper name drops "Executed" from inner function name
- CRF-4 (P4) Comment claims PE blocks only on assistant rows; legacy tool-role PE rows exist
- CRF-6 (Nit) Commit subject 83 chars, over 72-char convention
- CRF-7 (Nit) Commit body empty
- CRF-8 (Nit)
cfgByProvidername inverts mapping direction - CRF-10 (Note) Log missing foreign origin provider names
Further review is blocked until the author responds to existing threads (fix, acknowledge, or contest) or pushes fixes.
🤖 This review was automatically generated with Coder Agents.
…ry on provider switch
|
/coder-agents-review |
There was a problem hiding this comment.
Round 3. CRF-1, CRF-3, CRF-4, CRF-8 verified fixed. CRF-2, CRF-6, CRF-7 contested and closed by panel (defenses hold). CRF-10 acknowledged by author without ticket.
3 new P3.
"Every sentence in the comment restates identifiers visible from the method signature and body." (Gon, on the wrapper doc comment, which 4 other reviewers praised)
CRF-10 (log missing foreign origin providers): author accepted the gap but filed no ticket. This needs a human decision: file a ticket or explicitly accept the gap as permanent.
🤖 This review was automatically generated with Coder Agents.
…ry on provider switch
|
/coder-agents-review |
There was a problem hiding this comment.
Round 4. CRF-11 and CRF-13 verified fixed. CRF-12 contested and closed by panel (7/7 accept: fail-closed with observability is the correct engineering choice for this edge case).
All 16 findings across 4 rounds resolved: 8 fixed, 4 contested and panel-closed, 1 acknowledged, 3 dropped.
No new findings. Clean to merge.
🤖 This review was automatically generated with Coder Agents.
Switching model configs mid-chat must be safe, but starting on Anthropic (which provider-executes
web_search) and switching to Bedrock currently fails: Bedrock shares the Anthropic wire format yet rejectsweb_search_tool_resultblocks with HTTP 400 (CODAGT-471). The same class of failure applies to any provider-executed tool replayed to a provider that did not produce it.This drops provider-executed tool history (calls and results) from assistant rows whose producing model-config provider differs from the target turn's provider, before the prompt is built. Same-provider history is left untouched, so normal
web_searchreplay is unaffected.Approach
ModelConfigID -> AIProvider.TypeviaresolveModelConfigAndNormalizedProvider), not onmodel.Provider()or any fantasy wire signal. This is deliberate: once requests route through aibridged,fantasyConfigForAIBridgeserializes both Anthropic- and Bedrock-typed providers as theanthropicfantasy provider, somodel.Provider()and theProviderOptionswire key both collapse the Anthropic-vs-Bedrock distinction this bug depends on. The model-config provider (derived fromAIProvider.Type) keeps them distinct and survives removal of the direct route.database.ChatMessagerow level inprepareGeneration, beforeConvertMessagesWithFiles, whereModelConfigIDis unambiguous (no fragilepromptRows <-> promptindex alignment).ModelConfigID) fails closed (strip).Testing
stripForeignProviderExecutedToolRows: same-provider no-op, anthropic->bedrock drop, foreign-only row dropped, multi-provider keep-native/strip-foreign, non-provider-executed untouched, empty target no-op, unknown-origin fail-closed.go test ./coderd/x/chatd/ -run TestStripForeignProviderExecutedToolRows,go build ./coderd/x/chatd/...,go vet,gofmtall clean.Non-goals / follow-ups
bedrockprovider hosting Anthropic vs OpenAI underneath) is not distinguished; provider-executedweb_searchoriginates from direct Anthropic/OpenAI today.resolveChatModel; this uses the primary turn'smodelConfig.ID.previous_response_idchain-mode invalidation on provider switch is a separate concern.Implementation plan and decision log
The design was grilled through several iterations before implementation:
target != "anthropic"strip, wired into now-stalechatd.gosites; thereload_messagessite no longer exists onmain).ProviderOptionswire-key sniffing (Option A). Rejected once the aibridge direction surfaced: the wire key collapses Anthropic and Bedrock toanthropic, so it cannot distinguish them.ModelConfigID -> AIProvider.Typeprovenance (authoritative, aibridge-safe) and row-level stripping before conversion (Option C, no index-alignment fragility).Single call site (
generation_preparer.go);chatlooppre-request and reload paths untouched.This pull request was created by Coder Agents on behalf of @johnstcn.