feat(coderd): accept delegated API key ID from in-process aibridge callers#25625
feat(coderd): accept delegated API key ID from in-process aibridge callers#25625dannykopping wants to merge 1 commit into
Conversation
…llers aibridged's in-memory transport now requires callers to attach a delegated API key ID via aibridge.WithDelegatedAPIKeyID, enforced at the RoundTrip boundary. The handler uses it to authenticate the request as the named user without the key secret; IsAuthorized validates only liveness for this path. Orthogonal to BYOK: a delegated request still forwards user-supplied LLM credentials and strips the Coder governance token. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
/coder-agents-review |
|
Review posted | Chat Review history
deep-review v0.5.0 | Round 1 | Last posted: Round 1, 10 findings (1 P2, 6 P3, 1 Nit, 2 Note), COMMENT. Review Finding inventoryFindings
Contested and acknowledged(None yet.) Round logRound 1Panel. 1 P2, 6 P3, 1 Nit, 2 Note. Reviewed against ef3f95a..2cda32c. About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
There was a problem hiding this comment.
Clean, well-layered mechanism for delegated API key authentication. The trust boundary design is sound: context values are unexported types unreachable from HTTP, the transport enforces the invariant before the handler runs, the DRPC Authorizer sits behind MemTransportPipe. Test suite covers the security-critical negative cases (secret-hash-mismatch-ignored, empty-identity rejection, both-fields-set, BYOK orthogonality end-to-end).
1 P2, 6 P3, 1 Nit.
The P2 is a factual error in an exported doc comment (claims "revoked" but no revocation check exists). The P3s are observability, test coverage, error disambiguation, and format validation. No functional bugs found.
The full panel confirmed the security model is architecturally correct. Hisoka noted the IsAuthorized RPC has no way to know whether it was called over in-memory or network transport. If the DRPC service is ever exposed over a network, key_id becomes user impersonation with just a 10-character ID. Current architecture makes this safe; a // SECURITY comment on the RPC documenting the transport invariant would protect future refactors.
Proto uses two separate string fields rather than oneof. Discussed by 7 reviewers. Consensus: oneof would be more idiomatic but proto3 oneof scalar quirks exist, and the runtime enforcement with explicit both-set rejection is arguably better than oneof's silent last-wins. Current approach is defensible.
Comment verbosity pattern noted by Gon: "the caller never has the secret" appears 4 times across the diff, and several local comments restate what the code already shows. The minimum-draft suggestions are sound.
"There are no second chances when you do not know the identity of your opponent, but you also cannot win without taking a risk." ~ Pariston
🤖 This review was automatically generated with Coder Agents.
| authReq *proto.IsAuthorizedRequest | ||
| sessionKey string | ||
| ) | ||
| if delegatedID, ok := agplaibridge.DelegatedAPIKeyIDFromContext(ctx); ok { |
There was a problem hiding this comment.
P3 [CRF-1] authMode is set to "centralized" or "byok" at line 54 but never updated when delegation is detected here. The auth_mode slog field on the failure path (line 120) and any downstream log consumers cannot distinguish a delegated (secret-less, key-ID-only) authentication from a normal header-based one.
During an incident involving a compromised or misused key ID, responders cannot filter logs to isolate delegated auth events.
"An operator diagnosing a failed in-process auth attempt sees a mode value that points at the header-based flow, not the context-based delegated flow." ~ Knov
Fix: set authMode to "delegated" (or "delegated+" + authMode) inside this branch, or add a separate slog.F("auth_delegated", true). (Kurapika P3, Kite P3, Knov P3, Mafuuu P3, Meruem Nit)
🤖
| // | ||
| // The caller is responsible for having established that the user owning this | ||
| // key authorized the request: aibridged validates only that the key exists, | ||
| // has not expired, and has not been revoked. It does not verify the key |
There was a problem hiding this comment.
P2 [CRF-2] Doc claims aibridged validates the key "has not been revoked" (line 42), but IsAuthorized never checks revocation. The code path validates: (1) key exists, (2) key not expired, (3) owning user not deleted, (4) owning user not a system user. There is no revocation field on APIKey and no revocation check.
The proto comment on key_id (aibridged.proto:142-144) correctly says "exists, has not expired, and belongs to a non-deleted non-system user." This exported doc should match.
"The Go doc and the proto doc describe the same security boundary with different claims. Someone reading the Go doc will believe key revocation is an explicit validation step." ~ Leorio
Fix: replace "has not expired, and has not been revoked" with "has not expired, and belongs to a non-deleted, non-system user." (Gon P2, Leorio P3)
🤖
| } | ||
|
|
||
| // DelegatedAPIKeyIDFromContext returns the API key ID attached by | ||
| // [WithDelegatedAPIKeyID] and whether one was set. |
There was a problem hiding this comment.
P3 [CRF-3] Doc says "and whether one was set" but DelegatedAPIKeyIDFromContext returns false when an empty string was explicitly passed via WithDelegatedAPIKeyID(ctx, ""). The value was set; the getter says it wasn't. The transport tests exercise this exact edge (TestInMemoryRoundTripper_RequiresDelegatedAPIKeyID/empty).
Fix: say "and whether a non-empty value was set", or have WithDelegatedAPIKeyID reject empty strings with an error/panic. (Gon P3, Leorio Nit)
🤖
| // SessionKey is consumed only by the injected MCP path, which is | ||
| // not available to delegated callers (they have no secret). |
There was a problem hiding this comment.
P3 [CRF-4] When delegated, sessionKey stays "". The MCP proxy factory (mcp.go:83) creates a Coder MCP server proxy with Authorization: Bearer (empty token), which will fail at the upstream MCP server.
"mcpServers.Init(ctx) then attempts a network connection that will fail at the upstream MCP server. Consequence: wasted connection setup and added latency on the first delegated request for any user whose deployment has a Coder MCP config." ~ Meruem
The MCP path is deprecated, so urgency is low. Fix: skip Coder MCP proxy creation when req.SessionKey == "". (Meruem P3, Knov Note, Mafuuu Note)
🤖
| // the secret check and validates only that the key exists, is unexpired, and | ||
| // belongs to a non-deleted non-system user. This is the path used by | ||
| // in-process delegated callers (e.g., chatd) that hold only the key ID. | ||
| func TestAuthorization_Delegated(t *testing.T) { |
There was a problem hiding this comment.
P3 [CRF-5] The test comment (lines 202-205) states the server validates the key "belongs to a non-deleted non-system user," but no TestAuthorization_Delegated subcase exercises that guard on the KeyId path.
The existing TestAuthorization covers deleted-user and system-user rejection for the Key path. If a future change gates the user-status checks behind !delegated (the way the secret check is now gated), no delegated-path test would fail.
"Add cases: mock GetUserByID returning {Deleted: true} expecting ErrDeletedUser, and {IsSystem: true} expecting ErrSystemUser." ~ Bisky
(Bisky P3)
🤖
| delegated bool | ||
| ) | ||
| switch { | ||
| case in.GetKey() != "" && in.GetKeyId() != "": |
There was a problem hiding this comment.
P3 [CRF-6] ErrInvalidKey is returned when both key and key_id are set, but the same sentinel is used for malformed key format and wrong secret. An operator sees "invalid key" and checks the key format; the actual problem is a request-shape error (two mutually exclusive fields set).
"The person at 2 AM sees 'both key and key_id set' and knows exactly what happened." ~ Leorio
Fix: introduce a distinct sentinel, e.g. ErrAmbiguousAuth = xerrors.New("both key and key_id set; exactly one required"). (Leorio P3)
🤖
| case in.GetKeyId() != "": | ||
| keyID = in.GetKeyId() |
There was a problem hiding this comment.
P3 [CRF-7] The standard auth path runs httpmw.SplitAPIToken, which rejects key IDs that are not exactly 10 characters. The delegated path passes the raw string straight to GetAPIKeyByID. A malformed input (UUID instead of key ID, empty-after-trim) produces ErrUnknownKey ("unknown key") instead of ErrInvalidKey ("invalid key").
Functionally safe, but the weaker signal makes debugging harder when an in-process caller passes the wrong identifier type. A one-line length check before the DB call would close the gap. (Hisoka P3)
🤖
| string key = 1; | ||
| // key_id authenticates a request without the secret. Used for delegated | ||
| // calls from in-process callers (e.g., chatd) that have already |
There was a problem hiding this comment.
Nit [CRF-10] PR description says "a trusted in-memory transport (currently: chatd)" but chatd has no reference to WithDelegatedAPIKeyID, TransportFor, or AIBridgeTransportFactory. This is the receiving half of a two-part integration; no production caller exists yet. Change to "(planned: chatd)" or "for use by chatd" to match reality. (Mafu-san Nit)
PS. The panel discussed the two separate string fields vs oneof extensively. The explicit both-set rejection at aibridgedserver.go:577 is arguably better than oneof's silent last-wins semantics, and moving the existing key field into a oneof has backward-compatibility implications. Current approach is defensible.
🤖
|
|
||
| message IsAuthorizedRequest { | ||
| // key is the full "<id>-<secret>" API token presented over HTTP. | ||
| // Mutually exclusive with key_id. |
There was a problem hiding this comment.
Note [CRF-8] The panel discussed two separate string fields vs oneof extensively (7 reviewers). The explicit both-set rejection at aibridgedserver.go:577 is arguably better than oneof's silent last-wins semantics. Proto3 oneof with scalar types has its own quirks (distinguishing unset from default). The runtime enforcement + test (TestAuthorization_Delegated/both fields set) is sufficient. Current approach is defensible. (Meruem P3, Kurapika Note, Kite Nit, Knov Note, Hisoka Note, Pariston Nit)
🤖
| // transport boundary, not in this RPC. | ||
| delegated bool | ||
| ) | ||
| switch { |
There was a problem hiding this comment.
Note [CRF-9] The key_id path skips secret verification entirely. The safety depends on the DRPC Authorizer service being reachable only via MemTransportPipe (aibridged.go:61). The IsAuthorized handler has no way to know whether it was called over in-memory or network transport.
The full panel confirmed the architecture is correct today. If this RPC server is ever exposed over a network boundary, any caller that knows a valid key ID (10 chars, non-secret) can authenticate as the key's owner without the secret. A // SECURITY comment on the RPC documenting the transport invariant would protect future refactors. (Hisoka P2, Meruem Note, Kite Note, Knov Note)
🤖

Allows an
api_key_idto be passed from a trusted in-memory transport (currently:chatd) toaibridgedfor use in authenticating LLM requests.This value can only be passed via context, and all users of the in-memory transport must provide it.
It can be used in conjunction with BYOK headers.