fix(coderd): enforce required external auth on workspace create #26314

Draft
dylanhuff-at-coder wants to merge 1 commit into main from dylan/plat-241-api-workspace-creation-bypasses-required-external-auth
@dylanhuff-at-coder

Problem

Required external auth (coder_external_auth with optional = false) is only enforced by client-side preflight checks in the CLI and dashboard. Creating a workspace directly through the REST API (POST /api/v2/users/{user}/workspaces) succeeds even when the workspace owner has never authenticated with a required provider. Token injection at job acquisition silently skips missing or invalid tokens, so the build "succeeds" with an empty token and produces a broken workspace.

Fix

Workspace creation now validates the workspace owner's external auth server-side:

  • The existing GET /templateversions/{id}/external-auth handler logic is refactored into a reusable templateVersionExternalAuthForUser helper that accepts an explicit user ID. Endpoint behavior is unchanged.
  • createWorkspace resolves the exact template version in use (req.TemplateVersionID, falling back to template.ActiveVersionID) and rejects the request with 403 when any non-optional provider is unauthenticated for the owner. The check runs before any workspace row is inserted and before any prebuilt workspace is claimed, and outside any retrying DB transaction (token refresh makes remote OAuth calls and must not be replayed).
  • The owner is the subject of the check, not the initiator, because build-time token injection uses the owner's links. Reading another user's links requires a system-restricted context (external auth links are personal data under dbauthz).
  • The 403 response carries one validations entry per missing provider (field: "external_auth", detail: <provider-id>), with display names in the human-readable detail (falling back to the provider ID), so API clients can react programmatically and fetch authenticate URLs from the existing preflight endpoint.
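The check described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Provider` type, the `missingRequiredAuth` function, and the wording of the summary string are hypothetical, and the real helper resolves links and refreshes tokens rather than reading a precomputed `Authenticated` flag. Only the `field: "external_auth"` / `detail: <provider-id>` shape and the display-name fallback come from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider is a hypothetical, simplified view of one external auth provider
// referenced by the template version in use.
type Provider struct {
	ID            string
	DisplayName   string
	Optional      bool
	Authenticated bool // assumed: whether the workspace owner holds a valid link
}

// ValidationError mirrors the field/detail shape described in the PR text.
type ValidationError struct {
	Field  string `json:"field"`
	Detail string `json:"detail"`
}

// missingRequiredAuth returns one validation entry per non-optional provider
// the owner has not authenticated with, plus a human-readable summary that
// uses display names and falls back to the provider ID.
func missingRequiredAuth(providers []Provider) ([]ValidationError, string) {
	var validations []ValidationError
	var names []string
	for _, p := range providers {
		if p.Optional || p.Authenticated {
			continue // optional or already connected: never blocks creation
		}
		validations = append(validations, ValidationError{
			Field:  "external_auth",
			Detail: p.ID,
		})
		name := p.DisplayName
		if name == "" {
			name = p.ID
		}
		names = append(names, name)
	}
	if len(validations) == 0 {
		return nil, ""
	}
	summary := fmt.Sprintf("missing required external auth: %s", strings.Join(names, ", "))
	return validations, summary
}
```

A caller in the handler would return 403 when the returned slice is non-empty and proceed to workspace insertion otherwise.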

Customer-visible behavior (release note worthy)

  • Creating a workspace via the API for a user who has not connected required external auth now returns 403 and no workspace is created. This includes admins creating workspaces on behalf of other users and prebuilt workspace claims.
  • If a template intentionally supports pre-provisioning workspaces for users who have not authenticated yet, mark the provider optional = true in the template.
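For API clients, reacting to the new 403 might look like the sketch below. It assumes the error body is JSON with `message`, `detail`, and a `validations` array of `field`/`detail` pairs; the `apiError` type and the `missingProviders` function are hypothetical names introduced only for illustration.

```go
package main

import "encoding/json"

// validationError and apiError are assumed shapes for the 403 body; only the
// field/detail pair per missing provider is stated in the PR description.
type validationError struct {
	Field  string `json:"field"`
	Detail string `json:"detail"`
}

type apiError struct {
	Message     string            `json:"message"`
	Detail      string            `json:"detail"`
	Validations []validationError `json:"validations"`
}

// missingProviders extracts the provider IDs that blocked workspace creation
// from a 403 response body.
func missingProviders(body []byte) ([]string, error) {
	var resp apiError
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	var ids []string
	for _, v := range resp.Validations {
		if v.Field == "external_auth" {
			ids = append(ids, v.Detail)
		}
	}
	return ids, nil
}
```

With the provider IDs in hand, a client can look up the matching authenticate URLs from the existing preflight endpoint and prompt the user before retrying.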

Follow-ups (intentionally not in this PR)

  • An admin-accessible endpoint to query external auth status for an arbitrary user (today the preflight endpoint only reflects the API key user, so admin-on-behalf-of preflights check the wrong user).
  • Enforcement on workspace start/restart/rebuild, which needs re-auth UX support in the CLI and dashboard first.
  • Restricted provisioner-side defense-in-depth (fail required-provider injection for user-initiated start builds) to close the window between the API check and job acquisition.

Tests

  • New TestCreateWorkspaceExternalAuth in coderd/workspaces_test.go: missing required auth returns 403 with structured validations and no workspace row; owner-vs-initiator (authenticated admin creating for an unauthenticated member is rejected, succeeds after the member authenticates); optional providers do not block; invalid/revoked tokens (via a failing ValidateURL) are treated as unauthenticated; display-name fallback to provider ID.
  • TestWorkspaceDeleteSuspendedUser expectations updated: creation now performs one additional token validation.
  • Prebuild claim coverage is not included; claims go through createWorkspace and are gated by the same check, but exercising a real claim requires the enterprise prebuilds harness, which is disproportionate for this PR.
  • Ran: go test ./coderd -run 'TestCreateWorkspaceExternalAuth|TestTemplateVersionsExternalAuth|TestWorkspaceDeleteSuspendedUser|TestPostWorkspacesByOrganization|TestWorkspace$|TestWorkspacesByUser' (also with -race), go test ./cli -run TestCreateWithGitAuth, make lint/go, make lint/emdash, full pre-commit hooks.

Research and implementation decisions (subagent findings)

Three parallel research passes validated the diagnosis and compared three candidate layers before implementation:

  1. Handler-level enforcement in createWorkspace (chosen). Only layer that can return a 4xx with no workspace row. Covers all createWorkspace callers (direct API, org-member endpoint, tasks/chats creation, prebuild claims) while leaving system paths (prebuild provisioning, autostart lifecycle executor) exempt by construction.
  2. Enforcement inside wsbuilder (rejected for this PR). Build() executes inside a RepeatableRead transaction with up to 5 serialization retries, and both HTTP paths wrap it in an outer transaction. Remote OAuth refresh inside a retried transaction can replay refreshes; GitHub rotates refresh tokens on use, so a replay can permanently invalidate the link. A DB-only variant remains viable for future start-transition enforcement.
  3. Failing provisioner job acquisition (rejected as primary). The API has already returned 201 by acquisition time; unconditional enforcement would also break prebuild provisioning (system user has no links), block stop/delete after token expiry (injection runs for all transitions), and fail autostarts. Acquisition-time failures also produce no logs or notifications. A restricted variant is noted as follow-up defense-in-depth.

Implementation details that came out of review:

  • The check validates the owner because provisionerdserver injects the owner's tokens at job time; the initiator's auth state is irrelevant to what Terraform receives.
  • dbauthz.AsSystemRestricted is required to read another user's links (policy.ActionReadPersonal) and carries the mandatory //nolint:gocritic justification enforced by scripts/rules.go.
  • RefreshToken performs the same remote validation the preflight endpoint already does for the dashboard's 1s polling; provider outages fail closed, consistent with job-time behavior.
  • A template version referencing a provider absent from deployment config now returns the same 404 the preflight endpoint returns, matching what the CLI and dashboard already enforce.
  • Legacy template versions imported via ExternalAuthProvidersNames treat all providers as required (Optional=false), consistent with existing client-side preflight semantics.

Fixes PLAT-241.


Note

This PR was generated by Coder Agents on behalf of @dylanhuff-at-coder.

Required external auth was only enforced by client-side preflight checks
in the CLI and UI, so creating a workspace directly through the REST API
succeeded even when the owner had not authenticated with a required
provider. The build then ran with an empty token and produced a broken
workspace.

Workspace creation now validates the workspace owner's external auth
links against the template version's non-optional providers before any
workspace row is inserted or prebuilt workspace is claimed, returning
403 with one validation entry per missing provider. The owner is the
subject of the check, not the initiator, because build-time token
injection uses the owner's links.

@linear-code

linear-code Bot commented Jun 11, 2026

PLAT-241
