Skip to content

Surface keyring access failures from ActiveToken#13318

Open
Iron-Ham wants to merge 1 commit into
cli:trunkfrom
Iron-Ham:Iron-Ham/fix-silent-keyring-token-failures
Open

Surface keyring access failures from ActiveToken#13318
Iron-Ham wants to merge 1 commit into
cli:trunkfrom
Iron-Ham:Iron-Ham/fix-silent-keyring-token-failures

Conversation

@Iron-Ham
Copy link
Copy Markdown

@Iron-Ham Iron-Ham commented Apr 29, 2026

Fixes #13317.

Summary

gh api and gh api graphql were silently sending requests without an Authorization header when gh's keyring lookup failed — typically when the 3-second timeout in internal/keyring/keyring.go fired against a macOS Keychain entry whose ACL had drifted (e.g. an entry written by an older gh that now requires an interactive approval the non-TTY child process can't surface). Users saw only HTTP 401 (REST) or HTTP 403 "API rate limit exceeded for <IP>" (GraphQL) with no signal that the local keyring was the cause.

The bug was a silent error discard at two layers:

  1. (*config.AuthConfig).ActiveToken swallowed keyring errors from TokenFromKeyringForUser / TokenFromKeyring and returned ("", source).
  2. AddAuthTokenHeader.RoundTrip then saw an empty token and proceeded with no header — indistinguishable from a legitimate anonymous request.

What changed

  • Added ActiveTokenWithError(hostname) (token, source, err) to gh.AuthConfig and *config.AuthConfig that surfaces non-ErrNotFound keyring errors.
  • Refactored (*AuthConfig).ActiveToken as a thin wrapper that calls ActiveTokenWithError and discards the error, preserving its existing signature for every current caller.
  • Required ActiveTokenWithError on api/http_client.go's local tokenGetter interface; updated the cfg adapter in internal/authflow/flow.go and the tinyConfig test mock to match.
  • AddAuthTokenHeader.RoundTrip now calls ActiveTokenWithError. On non-nil error it fails the request with that error; on nil error + empty token it proceeds anonymously (preserves anonymous-endpoint behavior); on nil error + non-empty token it attaches the Authorization header as before.
  • keyring.ErrNotFound continues to be the legitimate "no token configured for this host" signal and is not surfaced as an error, so anonymous requests against unconfigured hosts keep working unchanged.

What stays the same

  • The gh.AuthConfig.ActiveToken signature and behavior for every existing caller.
  • Test mocks that don't directly satisfy tokenGetter (e.g. stubAuthConfig in pkg/cmd/attestation/trustedroot/trustedroot_test.go) — stubAuthConfig embeds config.AuthConfig so it inherits the new method automatically.
  • Anonymous-endpoint behavior (gh api zen against a host with no configured user still proceeds without an Authorization header).

Alternatives considered

Opt-in type-assertion shim (avoid changing gh.AuthConfig)

Initially this PR added an unexported errorTokenGetter interface in api/http_client.go that RoundTrip would type-assert against, with *config.AuthConfig opportunistically satisfying it. The advantage was zero churn to gh.AuthConfig — no interface change, no mock updates for downstream consumers.

Rejected because the new error-aware behavior would only fire by accident of static type propagation, the contract would be untestable from the interface seam (any future code constructing HTTPClientOptions from a gh.AuthConfig value rather than the concrete type would silently get the legacy path), and an opt-in shim is brittle for a fix to a silent-failure bug. Adding ActiveTokenWithError to the interface makes the contract explicit and verifiable.

Naming: ActiveTokenE vs. ActiveTokenWithError vs. ResolveActiveToken

ActiveTokenE (single-letter E suffix) is concise and used in some Go projects, but has no precedent in cli/cli — would be inventing a convention.

ResolveActiveToken discriminates well from ActiveToken, but pairs awkwardly with the existing method-on-noun naming throughout gh.AuthConfig (ActiveUser, TokenFromKeyring, etc.).

Chose ActiveTokenWithError because it composes naturally with ActiveToken (clear sibling relationship), is self-documenting in error-checking call sites, and matches the existing explicit-suffix precedent in this codebase (e.g. keyring.MockInitWithError in internal/keyring/keyring.go).

Logging from inside ActiveToken (surface error without signature change)

A smaller-footprint alternative: keep ActiveToken unchanged but write a warning to os.Stderr when both keyring lookups fail with non-ErrNotFound errors. The user would see a "warning: keyring access failed" line followed by the downstream 401/403, which is at least diagnosable.

Rejected because (1) logging from a config-layer getter is the wrong architectural seam — it doesn't compose with structured loggers or with cli/cli's --verbose plumbing, (2) the request still goes out unauthenticated, so scripts and CI consumers still get misleading exit codes, and (3) it leaves the underlying contract bug in place for every future caller. Returning the error is the right fix; logging would be a band-aid.

Marking ActiveToken as // Deprecated: in this PR

Could be added as a one-line annotation on the existing method to steer new code toward ActiveTokenWithError.

Deferred to a follow-up PR because adding the marker now would generate SA1019 warnings on every existing call site (pkg/cmd/auth/status/status.go:238, pkg/cmd/auth/refresh/refresh.go:178,229, pkg/cmd/auth/token/token.go:81, pkg/cmd/auth/shared/writeable.go:10, pkg/cmd/config/get/get.go:58, plus internal usages) without giving them a migration path in the same PR. Better to land this fix first, then a follow-up that migrates the call sites and adds the marker together. Happy to do both in one PR if maintainers prefer.

Follow-ups (not in this PR)

Other call sites of ActiveToken would benefit from migrating to ActiveTokenWithError so the same class of silent failure can't recur in those code paths. Left for a follow-up to keep this PR focused:

  • pkg/cmd/auth/status/status.go:238gh auth status can mis-report keyring failures as "no token"
  • pkg/cmd/auth/refresh/refresh.go:178,229
  • pkg/cmd/auth/token/token.go:81
  • pkg/cmd/auth/shared/writeable.go:10
  • pkg/cmd/config/get/get.go:58

Adding // Deprecated: to ActiveToken is also deferred to that follow-up — adding it now would generate SA1019 warnings on every existing call site without giving users a migration path.

Test plan

  • go vet ./... clean.
  • go test ./... -short passes (full suite).
  • New tests in internal/config/auth_config_test.go:
    • TestActiveTokenWithErrorSurfacesKeyringFailure — keyring error is wrapped and returned.
    • TestActiveTokenSilentlyDiscardsKeyringFailureActiveToken (the wrapper) preserves its no-error contract.
    • TestActiveTokenWithErrorDoesNotSurfaceErrNotFoundErrNotFound returns (token="", err=nil).
    • TestActiveTokenWithErrorReturnsEnvTokenWithoutKeyring — env-var path short-circuits keyring entirely.
    • TestActiveTokenWithErrorReturnsKeyringToken — successful keyring resolution returns (token, "keyring", nil).
  • New table-driven TestAddAuthTokenHeaderResolutionPaths in api/http_client_test.go covering: error fails the request, resolved token is attached, empty token without error proceeds anonymously.
  • Manually verified the patched binary built from this branch continues to authenticate correctly against a healthy macOS Keychain (gh api user returns HTTP/2 200, x-ratelimit-limit: 5000, resource: core; same for GraphQL).

@github-actions github-actions Bot added external pull request originating outside of the CLI core team needs-triage needs to be reviewed labels Apr 29, 2026
@Iron-Ham Iron-Ham marked this pull request as ready for review April 29, 2026 22:01
@Iron-Ham Iron-Ham requested a review from a team as a code owner April 29, 2026 22:01
@Iron-Ham Iron-Ham requested a review from babakks April 29, 2026 22:01
@Iron-Ham Iron-Ham force-pushed the Iron-Ham/fix-silent-keyring-token-failures branch from bfb35af to 9d71383 Compare April 30, 2026 20:49
When ActiveToken's keyring lookup fails with a non-ErrNotFound error
(typically *keyring.TimeoutError on macOS Keychain), the error was
silently discarded and AddAuthTokenHeader.RoundTrip then sent the
request without an Authorization header. Users saw a confusing HTTP
401 (REST) or HTTP 403 "API rate limit exceeded for <IP>" (GraphQL)
with no signal that the local keyring was the cause.

Add ActiveTokenWithError to gh.AuthConfig that surfaces these errors.
Refactor ActiveToken as a thin wrapper that calls ActiveTokenWithError
and discards the error so existing callers keep their signature.
Require the new method on api/http_client.go's tokenGetter interface
and update RoundTrip to fail the request with the resolution error
rather than send unauthenticated.

When the user-scoped keyring lookup fails with a real error and the
legacy unkeyed fallback then returns ErrNotFound, preserve the
original error rather than letting the fallback's "not found" mask
it. Otherwise a timeout or permission failure on the canonical
lookup would be reported as if no token were configured.

ErrNotFound continues to be the legitimate "no token configured for
this host" signal and is not surfaced as an error so genuinely
anonymous requests against unconfigured hosts keep working.

Fixes cli#13317
@Iron-Ham Iron-Ham force-pushed the Iron-Ham/fix-silent-keyring-token-failures branch from 9d71383 to 5213965 Compare April 30, 2026 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

external pull request originating outside of the CLI core team needs-triage needs to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

gh api silently sends unauthenticated requests when keychain access fails

1 participant