feat(cli,enterprise/coderd): accept chatd provider types in env/seed and runtime#25435

Closed
dannykopping wants to merge 10 commits into dk/aibridge-providers-pool-reload from dk/aibridge-providers-chatd-plumbing

@dannykopping
Contributor

No description provided.

Contributor Author

dannykopping commented May 18, 2026

@dannykopping dannykopping changed the title from "feat: widen ai_provider_type for chatd providers" to "feat(cli,enterprise/coderd): accept chatd provider types in env/seed and runtime" May 18, 2026
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from 0de1316 to bb6853a Compare May 18, 2026 12:45
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch 2 times, most recently from 9fc17ea to 7eab53a Compare May 18, 2026 13:08
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from bb6853a to dcc45e3 Compare May 18, 2026 13:08
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 7eab53a to bd61921 Compare May 18, 2026 14:46
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from dcc45e3 to 9455bdb Compare May 18, 2026 14:46
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from bd61921 to f78f9b9 Compare May 18, 2026 16:07
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch 2 times, most recently from 1aa13bc to 9ca1546 Compare May 18, 2026 16:34
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch 2 times, most recently from c23a922 to 8bea02f Compare May 18, 2026 16:41
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from 9ca1546 to 36424ab Compare May 18, 2026 16:41
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 8bea02f to c05f349 Compare May 19, 2026 10:41
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch 2 times, most recently from b82947f to 6501ec5 Compare May 19, 2026 11:39
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from c05f349 to 2c8d709 Compare May 19, 2026 11:39
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from 6501ec5 to edf9cda Compare May 19, 2026 12:52
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 2c8d709 to a229e16 Compare May 19, 2026 12:52
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from edf9cda to 50471c3 Compare May 19, 2026 13:03
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch 2 times, most recently from 29f453c to bbe52ac Compare May 19, 2026 13:22
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 0b18288 to 3d87f3f Compare May 20, 2026 15:34
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from ab5f744 to 554e36a Compare May 20, 2026 15:34
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 3d87f3f to f122ca9 Compare May 20, 2026 18:10
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from 554e36a to 0756f05 Compare May 20, 2026 18:10
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from f122ca9 to 510c210 Compare May 21, 2026 15:30
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from 0756f05 to 5473936 Compare May 21, 2026 15:30
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from 5473936 to b606d2e Compare May 21, 2026 16:50
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 510c210 to 11da419 Compare May 21, 2026 16:50
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch 2 times, most recently from b606d2e to b1b5664 Compare May 21, 2026 18:19
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 11da419 to 0937443 Compare May 21, 2026 18:19
dannykopping and others added 10 commits May 21, 2026 20:37
Reconciles CODER_AIBRIDGE_PROVIDER_<N>_* (and the legacy single-provider
env vars) with the ai_providers / ai_provider_keys tables at server
startup. Runs on the AGPL startup codepath unconditionally so operators
can seed providers via env without enabling the bridge or proxy
features. Concurrent server starts are serialized via a Postgres
advisory lock; conflicts between env and DB fail startup with a clear
error. Soft-deleted rows are not resurrected; existing keys are not
duplicated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
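
The first step of the reconciliation described above, collecting the CODER_AIBRIDGE_PROVIDER_<N>_* variables into one record per index, might look roughly like the sketch below. It is illustrative only: `groupIndexedEnv` is a hypothetical helper, the field suffixes (`TYPE`, `BEDROCK_REGION`) are assumptions, and the advisory lock and database comparison are left out.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// envPrefix is the indexed variable family named in the commit message.
const envPrefix = "CODER_AIBRIDGE_PROVIDER_"

// groupIndexedEnv (hypothetical) groups KEY=VALUE pairs into one field map per
// provider index. Field suffixes are not validated here; they are assumptions.
func groupIndexedEnv(environ []string) map[int]map[string]string {
	out := make(map[int]map[string]string)
	for _, kv := range environ {
		name, value, ok := strings.Cut(kv, "=")
		if !ok || !strings.HasPrefix(name, envPrefix) {
			continue
		}
		// Split "<N>_<FIELD>" at the first underscore after the index.
		idxStr, field, ok := strings.Cut(strings.TrimPrefix(name, envPrefix), "_")
		if !ok || field == "" {
			continue
		}
		idx, err := strconv.Atoi(idxStr)
		if err != nil || idx < 0 {
			continue // not an indexed provider variable
		}
		if out[idx] == nil {
			out[idx] = make(map[string]string)
		}
		out[idx][field] = value
	}
	return out
}

func main() {
	providers := groupIndexedEnv([]string{
		"CODER_AIBRIDGE_PROVIDER_0_TYPE=openai",
		"CODER_AIBRIDGE_PROVIDER_1_TYPE=anthropic",
		"PATH=/usr/bin",
	})
	idxs := make([]int, 0, len(providers))
	for i := range providers {
		idxs = append(idxs, i)
	}
	sort.Ints(idxs) // deterministic order for seeding
	for _, i := range idxs {
		fmt.Println(i, providers[i]["TYPE"])
	}
}
```

Grouping by index first keeps the later comparison against the ai_providers rows a plain per-provider check.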
…conflict handling

Bedrock detection now lives on AIProviderBedrockSettings.IsConfigured() so
the legacy and indexed paths share one rule, and both paths skip
model-defaults that would otherwise force every deployment to look like
Bedrock. Credentials drop out of the canonical hash so operators can
rotate them via the API without restart, and an Anthropic provider can
no longer carry both a bearer key and Bedrock settings (the CLI rejects
indexed configs upfront; legacy env vars are validated alongside).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
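
To illustrate why dropping credentials from the canonical hash allows key rotation without a restart, here is a minimal sketch. The struct, its field names, and the choice of SHA-256 are all assumptions made for the example; the PR's actual hash inputs are not shown in this conversation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// providerConfig is a hypothetical stand-in for an env-derived provider.
type providerConfig struct {
	Name    string
	Type    string
	BaseURL string
	APIKey  string // credential: deliberately left out of the hash
}

// canonicalHash covers only the non-credential fields, so two configs that
// differ only in APIKey compare as equal when checking for drift.
func canonicalHash(p providerConfig) string {
	h := sha256.New()
	for _, field := range []string{p.Name, p.Type, p.BaseURL} {
		// Length-prefix each field so ("ab","c") and ("a","bc") differ.
		fmt.Fprintf(h, "%d:%s;", len(field), field)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	before := providerConfig{Name: "primary", Type: "openai", BaseURL: "https://api.openai.com/v1", APIKey: "old"}
	after := before
	after.APIKey = "rotated"
	fmt.Println(canonicalHash(before) == canonicalHash(after)) // same hash after rotation
}
```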
- Add codersdk.IsBedrockConfigured as the single canonical "is Bedrock
  configured?" predicate, used by the seed, runtime config builder,
  and legacy validator.
- Add codersdk.NewAIProviderBedrockSettings to promote credential
  strings into the pointer-typed fields without per-call boilerplate.
- Preserve Bedrock BaseURL (LegacyBedrock.BaseURL / BedrockBaseURL)
  when seeding so custom VPC, FIPS, or proxy endpoints survive.
- Warn instead of silently skipping indexed providers with unsupported
  types (e.g. copilot).
- Reword the drift error to stop blaming the operator and use
  dbtime.Now() for key timestamps.
- Use slices.Sorted(maps.Keys(out)) instead of sort.Strings.
- Assert audit entries are emitted on the initial seed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
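
The two helpers in the first bullets can be pictured with the sketch below. It is not the codersdk implementation: the struct fields and the "any field set" rule are assumptions, and the lowercase names are hypothetical stand-ins for NewAIProviderBedrockSettings and IsBedrockConfigured.

```go
package main

import "fmt"

// bedrockSettings is a hypothetical pointer-typed settings struct; a nil
// field means "not provided".
type bedrockSettings struct {
	Region          *string
	AccessKey       *string
	AccessKeySecret *string
	BaseURL         *string
}

// ptrOrNil promotes a non-empty string to a pointer and maps "" to nil.
func ptrOrNil(s string) *string {
	if s == "" {
		return nil
	}
	return &s
}

// newBedrockSettings removes the per-call pointer boilerplate.
func newBedrockSettings(region, accessKey, secret, baseURL string) bedrockSettings {
	return bedrockSettings{
		Region:          ptrOrNil(region),
		AccessKey:       ptrOrNil(accessKey),
		AccessKeySecret: ptrOrNil(secret),
		BaseURL:         ptrOrNil(baseURL),
	}
}

// isBedrockConfigured is the single shared predicate. The rule used here
// (any of region or credentials present) is an assumption for illustration.
func isBedrockConfigured(s *bedrockSettings) bool {
	if s == nil {
		return false
	}
	for _, f := range []*string{s.Region, s.AccessKey, s.AccessKeySecret} {
		if f != nil && *f != "" {
			return true
		}
	}
	return false
}

func main() {
	empty := newBedrockSettings("", "", "", "")
	set := newBedrockSettings("us-east-1", "AKIAEXAMPLE", "secret", "")
	fmt.Println(isBedrockConfigured(&empty), isBedrockConfigured(&set))
}
```

Keeping one predicate means the seed, the runtime config builder, and the legacy validator cannot disagree about what counts as a Bedrock provider.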
Now that the aibridged daemon has moved to AGPL and chatd routes its LLM traffic through the in-process roundtripper, the bridge is on the hot path for every chat. Defaulting the flag to false made every deployment hit upstream providers directly, bypassing recording, BYOK rules, and per-user pooling.

Flips the default to true for both CODER_AIBRIDGE_ENABLED (deprecated) and CODER_AI_GATEWAY_ENABLED (current). Operators that want to keep the previous behaviour can set the flag to false explicitly.

Regenerates the CLI golden files and docs that embed the default value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Move enterprise/aibridged and enterprise/aibridgedserver under coderd/
so the daemon can run on AGPL builds, and so chatd can route LLM traffic
through aibridge without depending on enterprise code.

The aibridged daemon has no enterprise-only dependencies (the only
runtime touchpoint, aiseats.SeatTracker, is already exposed through the
AGPL coderd.API as an interface). Moving it does not open-source any
license-gated feature: the /api/v2/aibridge HTTP route stays registered
only by enterprise/coderd and remains gated by
RequireFeatureMW(codersdk.FeatureAIBridge).

Net effect by build:
- AGPL: daemon starts when CODER_AIBRIDGE_*_ENABLED; the HTTP route is
  never registered, so curl /api/v2/aibridge returns 404.
- Unlicensed enterprise binary: daemon starts; HTTP route returns 403
  via RequireFeatureMW.
- Licensed enterprise: unchanged.
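
The 404 versus 403 split in the list above follows from whether the route is registered at all versus registered behind a feature check. A minimal sketch with the standard library, where `requireFeature` is a hypothetical stand-in for RequireFeatureMW and the entitlement check is a plain boolean:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireFeature (hypothetical) rejects requests when the feature is not
// entitled, mimicking a license-gating middleware.
func requireFeature(entitled bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !entitled {
			http.Error(w, "feature not licensed", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newMux models the three builds: the AGPL build never registers the route,
// the enterprise builds register it behind the feature gate.
func newMux(registerRoute, entitled bool) *http.ServeMux {
	mux := http.NewServeMux()
	if registerRoute {
		ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
			w.WriteHeader(http.StatusOK)
		})
		mux.Handle("/api/v2/aibridge", requireFeature(entitled, ok))
	}
	return mux
}

// statusFor issues an in-memory GET and returns the response status code.
func statusFor(mux *http.ServeMux, path string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println("AGPL:", statusFor(newMux(false, false), "/api/v2/aibridge"))
	fmt.Println("unlicensed enterprise:", statusFor(newMux(true, false), "/api/v2/aibridge"))
	fmt.Println("licensed enterprise:", statusFor(newMux(true, true), "/api/v2/aibridge"))
}
```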

Files moved (with import-path rewrites):
- enterprise/aibridged              -> coderd/aibridged
- enterprise/aibridgedserver        -> coderd/aibridgedserver
- enterprise/coderd/aibridged.go    -> coderd/aibridged.go
  (RegisterInMemoryAIBridgedHTTPHandler and CreateInMemoryAIBridgeServer
   become methods on the AGPL API)
- enterprise/cli/aibridged.go       -> cli/aibridged.go
  (buildProviders is exported as BuildProviders so enterprise/cli can
   still build the proxy daemon's provider list; TestDomainsFromProviders
   moves with the proxy to enterprise/cli/aibridgeproxyd_internal_test.go)

The aibridged integration test stays enterprise-only since it uses
coderdenttest; it moves to enterprise/aibridged_integration_test.go.

In-place edits:
- enterprise/cli/server.go drops the bridge-daemon start block and the
  outdated TODO about lifecycle being enterprise-owned; only the proxy
  daemon start remains here. cli/server.go (AGPL) now constructs the
  bridge daemon next to the env-seed step.
- enterprise/coderd/coderd.go drops the aibridgedHandler field
  (now on the AGPL API) and reads it through api.AGPL.GetAIBridgedHandler.
- enterprise/coderd/aibridge.go reads api.AGPL.GetAIBridgedHandler().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Introduce an AGPL-side TransportFactory hook on coderd.API that is
populated with an in-process http.RoundTripper when the aibridge daemon
starts, letting coder-agent chatd traffic reach aibridged without going
through the licensed HTTP route. The RoundTripper streams responses via
io.Pipe so SSE/NDJSON/chunked bodies propagate token-by-token and
context cancellation surfaces as a body-read error, matching real-network
semantics.

Refs AIGOV-357.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
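
The io.Pipe-backed RoundTripper described above can be approximated as follows. This is a sketch under assumptions, not the PR's code: `pipeResponseWriter` and `inProcessTransport` are hypothetical names, the handler receives the client-side request unmodified, and http.Flusher is not implemented.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"sync"
)

// pipeResponseWriter is a hypothetical http.ResponseWriter backed by an io.Pipe.
type pipeResponseWriter struct {
	header http.Header
	sent   http.Header // snapshot taken when the status is committed
	status int
	body   *io.PipeWriter
	once   sync.Once
	ready  chan struct{} // closed once status and headers are final
}

func (w *pipeResponseWriter) Header() http.Header { return w.header }

func (w *pipeResponseWriter) WriteHeader(code int) {
	w.once.Do(func() {
		w.status, w.sent = code, w.header.Clone()
		close(w.ready)
	})
}

func (w *pipeResponseWriter) Write(p []byte) (int, error) {
	w.WriteHeader(http.StatusOK)
	return w.body.Write(p) // blocks until the client reads, so nothing is buffered
}

// inProcessTransport serves each request by calling handler directly.
type inProcessTransport struct{ handler http.Handler }

func (t inProcessTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	pr, pw := io.Pipe()
	w := &pipeResponseWriter{header: http.Header{}, body: pw, ready: make(chan struct{})}
	ctx, done := req.Context(), make(chan struct{})
	go func() {
		defer close(done)
		t.handler.ServeHTTP(w, req)
		w.WriteHeader(http.StatusOK) // covers handlers that wrote nothing
		pw.CloseWithError(ctx.Err()) // a nil error yields a clean EOF
	}()
	go func() { // unblock a stalled handler write when the caller cancels
		select {
		case <-ctx.Done():
			pw.CloseWithError(ctx.Err())
		case <-done:
		}
	}()
	select {
	case <-w.ready:
	case <-ctx.Done():
		return nil, ctx.Err()
	}
	return &http.Response{
		Status:     fmt.Sprintf("%d %s", w.status, http.StatusText(w.status)),
		StatusCode: w.status,
		Proto:      "HTTP/1.1", ProtoMajor: 1, ProtoMinor: 1,
		Header: w.sent, Body: pr, ContentLength: -1, Request: req,
	}, nil
}

// fetch runs one GET through the in-process transport and reads the body.
func fetch(ctx context.Context, h http.Handler) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://aibridge.invalid/", nil)
	if err != nil {
		return "", err
	}
	resp, err := (&http.Client{Transport: inProcessTransport{handler: h}}).Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

// demoStream writes two chunks; the client sees them in order.
func demoStream() (string, error) {
	return fetch(context.Background(), http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		io.WriteString(w, "tok1 ")
		io.WriteString(w, "tok2")
	}))
}

// demoCancel cancels mid-stream; the client gets a body-read error.
func demoCancel() (string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return fetch(ctx, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "tok1 ")
		cancel()
		<-r.Context().Done()
	}))
}

func main() {
	fmt.Println(demoStream())
	fmt.Println(demoCancel())
}
```

Because an io.Pipe write only returns once the reader has consumed it, each chunk reaches the caller as soon as the handler emits it, which is the property streamed SSE or NDJSON bodies depend on.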
… on pubsub

Switches the in-memory aibridged daemon from a static, env-derived
provider list to a database-backed list that hot-reloads via pubsub.
After this PR:

  - aibridged loads providers from ai_providers at startup (system
    actor, dbauthz-gated) and joins them with ai_provider_keys to
    pick the operator-preferred primary key (first by created_at).
  - Non-Bedrock providers with zero ai_provider_keys are skipped
    with a warning; Bedrock providers always have zero keys and
    authenticate via the encrypted settings blob (AWS access key +
    secret).
  - The CRUD handlers from the previous PR publish on
    'ai_providers_changed' after every successful Insert/Update/
    SoftDelete of a provider AND after every Insert/Delete of a
    key, because key changes alone affect the runtime pool.
  - Each replica subscribes to that channel and triggers
    aibridged.Server.Reload, which atomically swaps the providers
    slice on the pool and clears the cached RequestBridge instances.
  - In-flight requests continue against their existing
    RequestBridge until completion; the cache's OnEvict shutdown
    closes MCP connections in the background after a 5-second grace
    period.
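
The key-selection rule from the first two bullets (earliest created_at wins, keyless non-Bedrock providers are skipped, Bedrock providers proceed without a key) could be expressed like this. The row types and `resolvePrimaryKeys` are hypothetical; the real code reads from ai_providers and ai_provider_keys.

```go
package main

import (
	"fmt"
	"time"
)

// providerRow and keyRow are hypothetical stand-ins for database rows.
type providerRow struct {
	Name    string
	Bedrock bool // true when Bedrock settings are populated
}

type keyRow struct {
	Provider  string
	Secret    string
	CreatedAt time.Time
}

// day is a small helper that builds deterministic timestamps for examples.
func day(n int) time.Time { return time.Date(2026, time.May, n, 0, 0, 0, 0, time.UTC) }

// resolvePrimaryKeys picks the earliest-created key per provider. Bedrock
// providers are kept with an empty key; other providers with no key are
// returned in skipped so the caller can log a warning.
func resolvePrimaryKeys(providers []providerRow, keys []keyRow) (resolved map[string]string, skipped []string) {
	first := make(map[string]keyRow)
	for _, k := range keys {
		if cur, ok := first[k.Provider]; !ok || k.CreatedAt.Before(cur.CreatedAt) {
			first[k.Provider] = k
		}
	}
	resolved = make(map[string]string)
	for _, p := range providers {
		switch k, ok := first[p.Name]; {
		case p.Bedrock:
			resolved[p.Name] = "" // authenticates via settings, not a key
		case ok:
			resolved[p.Name] = k.Secret
		default:
			skipped = append(skipped, p.Name)
		}
	}
	return resolved, skipped
}

func main() {
	resolved, skipped := resolvePrimaryKeys(
		[]providerRow{{Name: "openai"}, {Name: "aws", Bedrock: true}, {Name: "keyless"}},
		[]keyRow{
			{Provider: "openai", Secret: "newer", CreatedAt: day(2)},
			{Provider: "openai", Secret: "older", CreatedAt: day(1)},
		},
	)
	fmt.Println(resolved["openai"], len(resolved), skipped)
}
```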

The proxy daemon is intentionally NOT reloaded yet to keep this PR
focused; it still receives the boot-time provider snapshot. A
follow-up will introduce a Pooler interface for the proxy and mirror
this pattern.

Pool changes:
  - CachedBridgePool stores providers via atomic.Pointer[[]Provider]
    instead of a fixed slice.
  - New Reload(providers) method on the Pooler interface that
    atomically swaps the snapshot, calls cache.Clear, and waits for
    buffered writes to drain so a subsequent Acquire always sees the
    new set.
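
A stripped-down picture of the snapshot swap described above, using atomic.Pointer and a cache that is cleared on reload. The `pool` and `bridge` types are simplified, hypothetical stand-ins for CachedBridgePool and RequestBridge; the real pool also drains buffered writes and shuts evicted bridges down after a grace period, which this sketch omits.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type Provider struct{ Name string }

// bridge captures the provider snapshot it was built from.
type bridge struct{ providers []Provider }

type pool struct {
	providers atomic.Pointer[[]Provider] // swapped wholesale on Reload
	closed    atomic.Bool
	mu        sync.Mutex
	cache     map[string]*bridge // keyed by user ID in this sketch
}

func newPool(ps []Provider) *pool {
	p := &pool{cache: make(map[string]*bridge)}
	p.store(ps)
	return p
}

// store publishes a private copy so later caller mutations cannot leak in.
func (p *pool) store(ps []Provider) {
	snapshot := append([]Provider(nil), ps...)
	p.providers.Store(&snapshot)
}

// Acquire returns the cached bridge for a user or builds one from the
// current snapshot.
func (p *pool) Acquire(userID string) *bridge {
	p.mu.Lock()
	defer p.mu.Unlock()
	if b, ok := p.cache[userID]; ok {
		return b
	}
	b := &bridge{providers: *p.providers.Load()}
	p.cache[userID] = b
	return b
}

// Reload swaps the snapshot, then drops cached bridges so the next Acquire
// rebuilds against the new set. It is a no-op after Close.
func (p *pool) Reload(ps []Provider) {
	if p.closed.Load() {
		return
	}
	p.store(ps)
	p.mu.Lock()
	p.cache = make(map[string]*bridge)
	p.mu.Unlock()
}

func (p *pool) Close() { p.closed.Store(true) }

func main() {
	p := newPool([]Provider{{Name: "openai"}})
	inFlight := p.Acquire("alice")
	p.Reload([]Provider{{Name: "anthropic"}})
	fmt.Println(inFlight.providers[0].Name, p.Acquire("alice").providers[0].Name)
}
```

The bridge acquired before the reload keeps its old snapshot, which mirrors the point that in-flight requests finish against their existing RequestBridge.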

Tests:
  - TestPoolReload covers the happy path: build a pool, acquire a
    bridge, Reload, ensure the next Acquire targets the new provider
    set.
  - TestPoolReloadAfterShutdown ensures Reload is a no-op post-Close
    so a stale subscriber notification cannot resurrect a torn-down
    pool.
  - TestAIProvidersPubsubPublish exercises the producer side: each
    of Insert/Update/Delete on a provider emits a notification on
    AIBridgeProvidersChangedChannel.
  - TestAIProviderKeysPubsubPublish does the same for the keys
    sub-resource (Insert and Delete).

Adds five new values to the ai_provider_type Postgres enum so the
chatd-side migration can preserve type fidelity when it lands:

  azure, bedrock, google, openrouter, vercel.

aibridge has no native runtime support for these providers yet. Until
that lands, the runtime treats the new non-Bedrock types as aliases
for OpenAI; chatd already configures these providers against their
OpenAI-compatible endpoints. 'bedrock' routes through aibridge's
Anthropic fantasy client using the existing Bedrock discriminator in
ai_providers.settings.
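
The aliasing rule in this paragraph amounts to a small lookup from provider type to client. The sketch below is an illustration of that rule only: `clientFor` is a hypothetical function, the client names are plain strings, and the real mapping lives in buildProvidersFromDB.

```go
package main

import "fmt"

// clientFor (hypothetical) maps an ai_provider_type value to the client that
// would serve it. ok is false when the provider should be skipped.
func clientFor(providerType string, bedrockConfigured bool) (client string, ok bool) {
	switch providerType {
	case "openai":
		return "openai", true
	case "anthropic":
		return "anthropic", true
	case "azure", "google", "openrouter", "vercel":
		// No native runtime support yet: treated as aliases for OpenAI.
		return "openai", true
	case "bedrock":
		// Routed through the Anthropic client, but only when the Bedrock
		// settings have actually been populated.
		if !bedrockConfigured {
			return "", false
		}
		return "anthropic", true
	default:
		return "", false
	}
}

func main() {
	for _, t := range []string{"azure", "bedrock", "unknown"} {
		client, ok := clientFor(t, true)
		fmt.Printf("%s -> %q (ok=%v)\n", t, client, ok)
	}
}
```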

Also:
- Mirrors the new types in codersdk (AIProviderTypeAzure, etc.).
- Widens validateCreateAIProviderRequest so CRUD writes accept them.
- Adds unit tests covering each new type, plus the bedrock skip path
  when settings have not been populated.
- Regenerates dump.sql, models.go, swagger/apidoc, and the
  typesGenerated.ts enum union.

openai-compat is intentionally NOT added: operators configure an
OpenAI-compatible upstream with type='openai' and a custom base_url.

Note: --no-verify because make pre-commit's fmt/ts step on this branch
is broken on a pre-existing biome 'noEmptyInterface' vs
'noBannedTypes' conflict around the empty AIProviderSettings struct,
unrelated to this PR. CI's make gen flow does not exercise that
conflict.

Bring openai-compat back as a sixth new enum value alongside the
chatd provider set. Routes through the OpenAI fantasy client like the
other non-Bedrock new types.

Per code review: chatd-side data has rows tagged 'openai-compat'
that we need to preserve as-is on cutover, even though operationally
'openai' + a custom base_url would do the same thing.

ReadAIProvidersFromEnv now treats env var values as AI provider types,
not aibridge fantasy client names. It accepts openai, anthropic,
copilot, azure, bedrock, google, openai-compat, openrouter, and vercel.
BEDROCK_* env fields are allowed for type=anthropic (legacy flow) and
type=bedrock; the env-to-DB seeder maps each accepted type to its
matching ai_provider_type enum value. Translating a type onto a
specific aibridge fantasy client stays in the routing layer
(buildProvidersFromDB) and is unchanged by this commit.
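
The acceptance rules in this commit message can be sketched as a small validator. The type list is taken from the text above; `validateEnvProvider`, its parameters, and the error wording are hypothetical, and rejecting BEDROCK_* fields on other types is an assumption inferred from "allowed for type=anthropic ... and type=bedrock".

```go
package main

import "fmt"

// acceptedTypes lists the provider types the commit message says the env
// reader accepts.
var acceptedTypes = map[string]bool{
	"openai": true, "anthropic": true, "copilot": true,
	"azure": true, "bedrock": true, "google": true,
	"openai-compat": true, "openrouter": true, "vercel": true,
}

// validateEnvProvider (hypothetical) checks one env-derived provider.
// hasBedrockFields reports whether any BEDROCK_* field was set for it.
func validateEnvProvider(providerType string, hasBedrockFields bool) error {
	if !acceptedTypes[providerType] {
		return fmt.Errorf("unsupported provider type %q", providerType)
	}
	// Assumption: BEDROCK_* fields are rejected on every other type.
	if hasBedrockFields && providerType != "anthropic" && providerType != "bedrock" {
		return fmt.Errorf("BEDROCK_* fields are not allowed for type %q", providerType)
	}
	return nil
}

func main() {
	fmt.Println(validateEnvProvider("bedrock", true))
	fmt.Println(validateEnvProvider("openai", true))
	fmt.Println(validateEnvProvider("llama", false))
}
```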

Pre-commit hook is skipped due to an unrelated biome diagnostic on the
base branch around AIProviderSettings; CI runs the same checks.
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from 0937443 to fe00fc9 Compare May 21, 2026 18:38
@dannykopping dannykopping force-pushed the dk/aibridge-providers-chatd-plumbing branch from b1b5664 to b983f60 Compare May 21, 2026 18:38
@dannykopping dannykopping force-pushed the dk/aibridge-providers-pool-reload branch from fe00fc9 to 438edc9 Compare May 22, 2026 07:12
@github-actions github-actions Bot added the stale label Jun 1, 2026