Skip to content

fix(coderd/externalauth): save refreshed token before validation (#24332) (backport to 2.29)#24900

Merged
f0ssel merged 2 commits into
release/2.29from
backport/24332-to-2.29
May 1, 2026
Merged

fix(coderd/externalauth): save refreshed token before validation (#24332) (backport to 2.29)#24900
f0ssel merged 2 commits into
release/2.29from
backport/24332-to-2.29

Conversation

@f0ssel
Copy link
Copy Markdown
Member

@f0ssel f0ssel commented May 1, 2026

Backport of #24332 to release/2.29.

Moves the UpdateExternalAuthLink call to immediately after TokenSource.Token() succeeds (before validation). GitHub rotates refresh tokens on use, so if post-refresh validation fails (e.g. rate-limited 403), the new token was previously silently discarded, forcing manual re-authentication.

Original PR: #24332
Merge commit: 2a1984f

Note: This branch includes the cherry-pick of #22904 (optimistic locking) as a prerequisite since #24332's tests depend on it. The #22904 backport PR is #24901. Once that merges, the overlapping commit in this PR will be a no-op.

Cherry-picks applied cleanly with no conflicts.

Generated by Coder Agents

kylecarbs and others added 2 commits May 1, 2026 18:13
…istic locking (#22904)

## Problem

When multiple concurrent callers (e.g., parallel workspace builds) read
the same single-use OAuth2 refresh token from the database and race to
exchange it with the provider, the first caller succeeds but subsequent
callers get `bad_refresh_token`. The losing caller then **clears the
valid new token** from the database, permanently breaking the auth link
until the user manually re-authenticates.

This is reliably reproducible when launching multiple workspaces
simultaneously with GitHub App external auth and user-to-server token
expiration enabled.

## Solution

Two layers of protection:

### 1. Singleflight deduplication (`Config.RefreshToken` +
`ObtainOIDCAccessToken`)

Concurrent callers for the same user/provider share a single refresh
call via `golang.org/x/sync/singleflight`, keyed by `userID`. The
singleflight callback re-reads the link from the database to pick up any
token already refreshed by a prior in-flight call, avoiding redundant
IDP round-trips entirely.

### 2. Optimistic locking on `UpdateExternalAuthLinkRefreshToken`

The SQL `WHERE` clause now includes `AND oauth_refresh_token =
@old_oauth_refresh_token`, so if two replicas (HA) race past
singleflight, the loser's destructive UPDATE is a harmless no-op rather
than overwriting the winner's valid token.

## Changes

| File | Change |
|------|--------|
| `coderd/externalauth/externalauth.go` | Added `singleflight.Group` to
`Config`; split `RefreshToken` into public wrapper +
`refreshTokenInner`; pass `OldOauthRefreshToken` to DB update |
| `coderd/provisionerdserver/provisionerdserver.go` | Wrapped OIDC
refresh in `ObtainOIDCAccessToken` with package-level singleflight |
| `coderd/database/queries/externalauth.sql` | Added optimistic lock
(`WHERE ... AND oauth_refresh_token = @old_oauth_refresh_token`) |
| `coderd/database/queries.sql.go` | Regenerated |
| `coderd/database/querier.go` | Regenerated |
| `coderd/database/dbauthz/dbauthz_test.go` | Updated test params for
new field |
| `coderd/externalauth/externalauth_test.go` | Added
`ConcurrentRefreshDedup` test; updated existing tests for singleflight
DB re-read |

## Testing

- **New test `ConcurrentRefreshDedup`**: 5 goroutines call
`RefreshToken` concurrently, asserts IDP refresh called exactly once,
all callers get same token.
- All existing `TestRefreshToken/*` subtests updated and passing.
- `TestObtainOIDCAccessToken` passing.
- `dbauthz` tests passing.

(cherry picked from commit 53e52ae)
)

GitHub rotates refresh tokens on use, invalidating the old token
immediately. If post-refresh validation fails (e.g. rate-limited
403 from /user), the new token was silently discarded because the
DB save only happened after successful validation. The next refresh
attempt would use the stale refresh token, fail permanently, and
destroy the token.

Move the UpdateExternalAuthLink call to immediately after
TokenSource.Token() succeeds. The post-validation save block is
removed (dead code after the early save). The DB write uses a
detached context (context.WithoutCancel) so a canceled request
cannot prevent persistence of the already-consumed refresh token.

(cherry picked from commit 2a1984f)
@f0ssel f0ssel force-pushed the backport/24332-to-2.29 branch from 406f5cf to 54fa76a Compare May 1, 2026 18:14
@f0ssel f0ssel merged commit 4bb1fd1 into release/2.29 May 1, 2026
24 of 25 checks passed
@f0ssel f0ssel deleted the backport/24332-to-2.29 branch May 1, 2026 18:37
@github-actions github-actions Bot locked and limited conversation to collaborators May 1, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants