Skip to content

fix: backfill legacy Bedrock AI provider rows and stale model config strings#26155

Merged
johnstcn merged 25 commits into
mainfrom
cj/codagt-548-backfill
Jun 11, 2026
Merged

fix: backfill legacy Bedrock AI provider rows and stale model config strings#26155
johnstcn merged 25 commits into
mainfrom
cj/codagt-548-backfill

Conversation

@johnstcn

@johnstcn johnstcn commented Jun 9, 2026

Copy link
Copy Markdown
Member

Bedrock AI providers could disappear from the Agents page unless an Anthropic provider was also configured, because legacy deployments stored Bedrock providers as type=anthropic with Bedrock settings in the settings blob. The frontend also showed Anthropic styling and icons for Bedrock-backed model configs because the stored provider string was "anthropic".

A numbered database migration cannot safely fix this: release/2.34 is at migration 509 while main is at 516, so inserting a migration on both branches creates a hole that breaks the upgrade path from 2.34 to 2.35.

Instead, two idempotent startup backfills run after newAPI() (which applies the dbcrypt wrapper so encrypted settings are readable):

  • BackfillBedrockProviderType: promotes ai_providers rows from type=anthropic with Bedrock settings to type=bedrock.
  • BackfillChatModelConfigProviderStrings: fixes stale chat_model_configs.provider = "anthropic" strings on rows whose linked provider was just promoted.

The UpdateAIProvider query now also writes the type column, so the fix persists on any subsequent PATCH.

Refs CODAGT-548

🤖 Generated by Claude with oversight from a human.

@johnstcn johnstcn self-assigned this Jun 9, 2026
@linear-code

linear-code Bot commented Jun 9, 2026

Copy link
Copy Markdown

CODAGT-548

@johnstcn johnstcn changed the title fix: migrate anthropic+bedrock ai_providers to bedrock fix: backfill legacy Bedrock AI provider rows and stale model config strings Jun 9, 2026
Comment thread cli/server.go Outdated
Comment on lines +1055 to +1061
// Promote legacy type=anthropic rows that carry Bedrock settings.
// Must run after newAPI so options.Database is dbcrypt-wrapped.
coderd.BackfillBedrockProviderType(aibridgeInitCtx, options.Database, logger.Named("aibridge.backfill"))
// Fix stale provider strings on model configs after provider types are corrected.
// Both backfills share aibridgeInitCtx; if the first times out the second
// will be skipped and retried on the next startup.
coderd.BackfillChatModelConfigProviderStrings(aibridgeInitCtx, options.Database, logger.Named("aibridge.backfill"))

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self-review: these should both be fairly quick, but I'm happy to bump aibridgeInitCtx if needed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we should be fine 👍

Comment thread coderd/ai_providers_backfill.go Outdated
Comment on lines +25 to +26
//nolint:gocritic // Startup-only backfill; no user actor is present.
sysCtx := dbauthz.AsSystemRestricted(ctx)

@johnstcn johnstcn Jun 9, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self-review: I'm happy to change this to a different actor but this seems like one of those rare cases where AsSystemRestricted is actually defensible.

@johnstcn johnstcn marked this pull request as ready for review June 10, 2026 08:40
@johnstcn johnstcn requested review from Copilot and dannykopping June 10, 2026 08:40
Comment thread coderd/ai_providers_backfill.go Outdated
@johnstcn johnstcn marked this pull request as draft June 10, 2026 08:42
@johnstcn

Copy link
Copy Markdown
Member Author

/coder-agents-review

@coder-agents-review

coder-agents-review Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Chat: Review in progress | View chat
Requested: 2026-06-11 08:40 UTC by @johnstcn
Spend: $52.70 / $100.00

deep-review v0.7.1 | Round 3 | 75dbc57..1f727d5

Last posted: Round 3, 16 findings (1 P0, 3 P2, 4 P3, 8 Nit), APPROVE. Review

Finding inventory

Finding Inventory

Findings

# Sev Status Location Summary Round Reviewer Posted
CRF-1 P0 Author fixed (7ef3ebe) ai_providers_backfill.go:85 Backfill promotes type=bedrock but seed still assigns type=anthropic; server crashes on next restart R1 Pariston Yes
CRF-2 P2 Author fixed (2b53b54) ai_providers_backfill_test.go:285 SkipsDeletedModelConfig soft-deletes wrong config (last by sort order, not the one just created) R1 Bisky Yes
CRF-3 P2 Author fixed (bfaf643, 8c37323) ai_providers_backfill.go:41 BackfillBedrockProviderType doc comment is 20 lines; half narrates chatd internals irrelevant to callers R1 Gon Yes
CRF-4 P3 Author fixed (a5dbf27) ai_providers_backfill.go:72 BackfillBedrockProviderType produces no log output on successful promotion R1 Leorio, Chopper, Meruem, Knov Yes
CRF-5 P3 Author accepted R2 (trade-off acknowledged; finding said "acceptable as-is") ai_providers.sql:69 UpdateAIProvider now writes type column, removing SQL-level immutability; also enables TOCTOU race with concurrent PATCH R1 Kite P3, Meruem P3, Knov P3, Ryosuke P3 Yes
CRF-6 P3 Author fixed (f403b84) ai_providers_backfill.go:96 No test for sql.ErrNoRows concurrent-deletion branch (Debug log path) R1 Bisky Yes
CRF-7 Nit Author fixed (2c1919f) ai_providers_backfill_test.go:152 Test comment claims {} settings parse cleanly; they trigger "missing _type discriminator" error instead R1 Hisoka Yes
CRF-8 Nit Author fixed (f1f456b) enterprise/coderd/ai_providers_backfill_test.go:27 crypto/rand.Read error check is dead code since Go 1.24 R1 Ging-Go Yes
CRF-9 Nit Author fixed (d8a4938) cli/server.go:1055 Call-site comments restate what the function name already says R1 Gon Yes
CRF-10 Nit Author fixed (27ce22a) ai_providers_backfill.go:17 Doc comment narrates frontend icon selection mechanism irrelevant to callers R1 Gon Yes
CRF-11 Nit Author fixed (60f712b) chatmodelconfigs.sql:148 SQL doc comment restates the WHERE clause; keep only the cast-asymmetry note R1 Gon Yes
CRF-12 Nit Author fixed (3f39a28) ai_providers_backfill_test.go:221 ModelConfigIdempotent asserts not-anthropic but should assert equal-bedrock R1 Bisky Yes
CRF-13 Nit Dropped by orchestrator (subjective naming preference for one-time backfill) ai_providers_backfill.go:24 "Strings" suffix on function name misleading R1 Gon No
CRF-14 Nit Dropped by orchestrator (marginal value for single-use one-time backfill code) ai_providers_backfill.go:28 Hardcoded string literals instead of typed constants R1 Ryosuke No
CRF-15 P3 Dropped by orchestrator (subsumed by CRF-17) ai_providers_migrate_test.go:621 Test comment claims seed creates type=anthropic but seed now creates type=bedrock directly R2 Netero No
CRF-16 Nit Author fixed (3bd9d27) cli/server.go:1057 Blank line between comment and the function call it describes R2 Netero Yes
CRF-17 P2 Author fixed (3bd9d27) ai_providers_migrate_test.go:621 SeedIsIdempotentAfterBedrockBackfill does not exercise the canonical-type normalization that prevents startup crash on upgrade R2 Mafuuu P2, Pariston P2, Knov P2, Meruem P3 Yes
CRF-18 P3 Author accepted R3 (NULL prevented by check constraint; guard is defensive redundancy) ai_providers_backfill_test.go:285 No test for model config with NULL ai_provider_id R2 Bisky Yes
CRF-19 Nit Author fixed (b06bda8) ai_providers_backfill_test.go:345 Test comment claims no error-level logging but testLogger uses IgnoreErrors:true R2 Bisky Yes

Round log

Round 2

Panel (re-review). CRF-1 through CRF-12 all addressed. CRF-5 accepted. 1 P2, 1 P3, 2 Nit new. 1 dropped (CRF-15 subsumed by CRF-17). Reviewed against 75dbc57..8c37323.

Round 3

Netero-only (all findings resolved). CRF-16, CRF-17, CRF-19 fixed. CRF-18 accepted. Netero clean. Reviewed against 75dbc57..1f727d5.

About deep-review

CRF = Coder Review Finding (P0-P4, Nit, Note)

Reviewer Focus
Bisky tests
Chopper ops/errors
Churn-guard change verification
Ging language modernization
Gon naming
Hisoka edge cases
Killua perf
Kite change integrity
Knov contracts
Knuckle SQL
Kurapika security
Law decomposition
Leorio docs
Luffy product
Mafu-san process
Mafuuu contracts
Melody dispatch/pairing
Meruem structural
Nami frontend
Netero mechanical checks
Pariston premise testing
Pen-botter product gaps
Razor verification
Robin duplication
Ryosuke Go arch
Takumi concurrency
Zoro shape

🤖 Managed by Coder Agents.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds idempotent startup-time database backfills to repair legacy “Bedrock stored as Anthropic” AI provider rows and to update stale chat_model_configs.provider strings so the UI and routing reflect the canonical provider type, without introducing a numbered migration that would break cross-branch upgrade paths.

Changes:

  • Add startup backfills to (1) promote legacy ai_providers.type=anthropic rows that contain Bedrock settings to type=bedrock, and (2) update chat_model_configs.provider from "anthropic""bedrock" when linked providers were promoted.
  • Update UpdateAIProvider SQL/query bindings so the type column is written on update, and adjust call sites/tests accordingly.
  • Add DB/sqlc plumbing (query, interface, authz wrapper, metrics wrapper, mocks) and comprehensive tests (including encrypted-settings coverage).

Reviewed changes

Copilot reviewed 13 out of 16 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
enterprise/dbcrypt/dbcrypt_internal_test.go Updates AI provider update tests for new Type parameter.
enterprise/coderd/x/chatd/chatd_test.go Updates test helper to pass provider Type in updates.
enterprise/coderd/ai_providers_backfill_test.go Adds enterprise test ensuring backfill works with encrypted settings.
coderd/x/chatd/chatd_test.go Updates test helper to pass provider Type in updates.
coderd/database/queries/chatmodelconfigs.sql Adds BackfillChatModelConfigProvider query to fix stale provider strings.
coderd/database/queries/ai_providers.sql Updates UpdateAIProvider to also write the type column.
coderd/database/queries.sql.go Regenerates sqlc output for new query and updated UpdateAIProvider params.
coderd/database/querier.go Extends sqlc querier interface to include the backfill query (returns sql.Result).
coderd/database/dbmock/dbmock.go Adds gomock method for BackfillChatModelConfigProvider.
coderd/database/dbmetrics/querymetrics.go Adds query metrics wrapper for the new backfill query.
coderd/database/dbauthz/dbauthz.go Adds authz wrapper for the new backfill query.
coderd/database/dbauthz/dbauthz_test.go Adds method accounting test coverage for the new authz wrapper.
coderd/ai_providers.go Ensures API update path supplies Type to UpdateAIProvider.
coderd/ai_providers_backfill.go Implements the two startup backfill functions.
coderd/ai_providers_backfill_test.go Adds DB-backed and failure-path tests for both backfills.
cli/server.go Wires both backfills into server startup after newAPI() / dbcrypt wrapping.
Files not reviewed (3)
  • coderd/database/dbmetrics/querymetrics.go: Language not supported
  • coderd/database/dbmock/dbmock.go: Language not supported
  • coderd/database/querier.go: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread coderd/ai_providers_backfill.go Outdated
Comment thread coderd/ai_providers_backfill.go

@dannykopping dannykopping left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo a few minor comments

Comment thread cli/server.go Outdated
Comment on lines +1055 to +1061
// Promote legacy type=anthropic rows that carry Bedrock settings.
// Must run after newAPI so options.Database is dbcrypt-wrapped.
coderd.BackfillBedrockProviderType(aibridgeInitCtx, options.Database, logger.Named("aibridge.backfill"))
// Fix stale provider strings on model configs after provider types are corrected.
// Both backfills share aibridgeInitCtx; if the first times out the second
// will be skipped and retried on the next startup.
coderd.BackfillChatModelConfigProviderStrings(aibridgeInitCtx, options.Database, logger.Named("aibridge.backfill"))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we should be fine 👍

UPDATE
ai_providers
SET
type = @type::ai_provider_type,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does present an interesting/awkward semantic whereby types could be changed imperatively by operators later, but this would need API/UI changes so for now I think it's fine.

Comment thread coderd/ai_providers_backfill.go Outdated
Comment thread coderd/ai_providers_backfill.go Outdated
Comment thread coderd/ai_providers_backfill.go Outdated
Comment thread coderd/ai_providers_backfill_test.go
Comment thread coderd/ai_providers_backfill_test.go
Comment thread coderd/ai_providers_backfill_test.go
Comment thread enterprise/coderd/ai_providers_backfill_test.go Outdated

@coder-agents-review coder-agents-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backfill logic is well-structured: idempotent, resilient to partial failure, correctly ordered, and tested at high density (73.9%). The SQL query for model config backfill is atomic and properly constrained. The enterprise dbcrypt test proves the encrypted round-trip works. Test coverage of edge cases (deleted rows, disabled rows, unparsable settings, non-Bedrock Anthropic) is thorough.

Severity breakdown: 1 P0, 1 P2, 3 P3, 6 Nit.

The P0 blocks merge: SeedAIProvidersFromEnv still assigns type=anthropic for legacy Bedrock configs (line 330 of ai_providers_migrate.go). After the backfill promotes the row to type=bedrock, the next startup's seed sees a hash mismatch and the server refuses to start. The seed and the backfill fight over the same row's type, and the seed's drift detection treats the backfill's fix as corruption.

"A 20-line doc comment that narrates chatd's routing internals and the frontend's icon selection is a love letter that arrived in the wrong mailbox." (Gon)

🤖 This review was automatically generated with Coder Agents.

Comment thread coderd/ai_providers_backfill.go
Comment thread coderd/ai_providers_backfill_test.go
Comment thread coderd/ai_providers_backfill.go Outdated
Comment thread coderd/ai_providers_backfill.go
Comment thread coderd/database/queries/ai_providers.sql
Comment thread enterprise/coderd/ai_providers_backfill_test.go Outdated
Comment thread cli/server.go Outdated
Comment thread coderd/ai_providers_backfill.go Outdated
Comment thread coderd/database/queries/chatmodelconfigs.sql Outdated
Comment thread coderd/ai_providers_backfill_test.go Outdated
@johnstcn

Copy link
Copy Markdown
Member Author

/coder-agents-review

@coder-agents-review coder-agents-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 12 R1 findings addressed: 11 fixed, 1 accepted (CRF-5). The P0 fix is solid; the seed now writes type=bedrock directly for legacy Bedrock configs and normalizes existing rows before hash comparison. The test fixes, comment trimming, and success logging are all clean.

Severity breakdown (R2): 1 P2, 1 P3, 2 Nit.

The P2 is a test coverage gap in the CRF-1 fix itself: SeedIsIdempotentAfterBedrockBackfill does not exercise the canonical-type normalization code that prevents a startup crash on upgrade. The test starts from a fresh DB where the seed creates type=bedrock directly, so the normalization branch (ai_providers_migrate.go:129-135) is never reached. Four reviewers independently found this.

Pariston notes a simplification opportunity: if the backfill ran before the seed (instead of after), the normalization code at lines 129-135 would be unnecessary. On a fresh deployment the backfill would be a no-op; on a legacy deployment it would promote before the seed checks hashes. Not a defect, but worth considering.

"Coverage data confirms: the normalization branch at ai_providers_migrate.go:133-135 is executed 0 times across all seed tests." (Pariston)

🤖 This review was automatically generated with Coder Agents.

Comment thread coderd/ai_providers_migrate_test.go Outdated
Comment thread coderd/ai_providers_backfill_test.go
Comment thread cli/server.go
Comment thread coderd/ai_providers_backfill_test.go Outdated

@dannykopping dannykopping left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a mil!

@johnstcn

Copy link
Copy Markdown
Member Author

Pariston notes a simplification opportunity: if the backfill ran before the seed (instead of after), the normalization code at lines 129-135 would be unnecessary.

Discussed this with @dannykopping. While the idea is genuinely interesting, it mixes concerns and may not handle the .0 -> .N upgrade path correctly.

@johnstcn

Copy link
Copy Markdown
Member Author

/coder-agents-review

@coder-agents-review coder-agents-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All findings resolved across three rounds (19 findings total: 14 fixed, 2 accepted, 3 dropped). Netero R3 and panel R3 are clean.

R1 caught a P0 (server crash on restart when seed and backfill fight over the type column) that was fixed promptly. R2 caught a P2 (the regression test for the P0 fix didn't exercise the normalization code it existed to protect) that was also fixed correctly: the test now inserts a pre-upgrade-state row directly rather than going through the updated seed. The author's responses throughout were substantive and thorough.

The backfill logic is solid: idempotent, resilient to partial failure, correctly ordered, and tested at high density (77.5%). The seed normalization handles the upgrade path. CI is green.

🤖 This review was automatically generated with Coder Agents.

@johnstcn johnstcn marked this pull request as ready for review June 11, 2026 09:05
@johnstcn

Copy link
Copy Markdown
Member Author

Pending smoke-test before merge

@johnstcn

Copy link
Copy Markdown
Member Author

@dannykopping smoke-tested and no magic smoke escaped ✅

@johnstcn johnstcn merged commit a4c867f into main Jun 11, 2026
28 checks passed
@johnstcn johnstcn deleted the cj/codagt-548-backfill branch June 11, 2026 14:31
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 11, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants