feat: Provision minimal TokenReview RBAC for OIDC auth and add SSL error logging in token parser#6240
aniketpalu wants to merge 3 commits into feast-dev:master
Conversation
…ror logging in token parser
Signed-off-by: Aniket Paluskar <apaluska@redhat.com>
jyejare
left a comment
- The Kubernetes auth path sets a status condition (`AuthorizationReadyType`) to indicate RBAC provisioning success/failure. The OIDC path does not. This means operators have no visibility into whether the OIDC RBAC was actually created.
- The existing OIDC test (`featurestore_controller_oidc_auth_test.go`) verifies that the Kubernetes-auth Role/RoleBinding are absent, but it does not verify that the new `feast-oidc-token-review` ClusterRole and per-instance ClusterRoleBinding are created with the correct rules. It also doesn't test the cleanup path (switching from OIDC to no-auth should delete the CRB).
```go
func (authz *FeastAuthorization) getLabels() map[string]string {
	return map[string]string{
		services.NameLabelKey: authz.Handler.FeatureStore.Name,
```
getLabels() stamps the FeatureStore-specific name. But the ClusterRole feast-oidc-token-review is shared across all OIDC FeatureStore instances. The last instance to reconcile overwrites the labels with its own name. This creates misleading audit trails — the ClusterRole appears to belong to one FeatureStore when it actually serves all of them.
Recommendation: Either use instance-independent labels for the shared ClusterRole, or use an aggregated label approach.
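For illustration, an instance-independent labeling of the shared ClusterRole could use the standard `app.kubernetes.io` recommended labels (these keys are assumptions for the sketch, not the operator's current label constants):

```yaml
metadata:
  name: feast-oidc-token-review
  labels:
    # Instance-independent: identifies the operator, not any one FeatureStore
    app.kubernetes.io/part-of: feast
    app.kubernetes.io/managed-by: feast-operator
```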
```diff
 if authz.isOidcAuth() {
-	if err := authz.createFeastClusterRole(); err != nil {
+	if err := authz.createOidcClusterRole(); err != nil {
 		return err
 	}
-	if err := authz.createFeastClusterRoleBinding(); err != nil {
+	if err := authz.createOidcClusterRoleBinding(); err != nil {
 		return err
 	}
 } else {
 	authz.cleanupOidcRbac()
 }
```
🔴 Orphaned Kubernetes auth ClusterRoleBinding when transitioning from Kubernetes auth to OIDC auth
Before this PR, the OIDC auth path in Deploy() called createFeastClusterRoleBinding(), which used the same CRB name as Kubernetes auth (GetFeastName(fs) + "-cluster-binding" at infra/feast-operator/internal/controller/authz/authz.go:444). Switching from Kubernetes to OIDC would therefore update the existing CRB in place via CreateOrUpdate. After this PR, the OIDC path creates a separate CRB with a different name (GetFeastName(fs) + "-oidc-token-review" at line 397).

When transitioning from Kubernetes auth to OIDC, the old Kubernetes CRB (feast-{name}-cluster-binding) is never deleted. It has no owner reference (setFeastClusterRoleBinding at line 202 returns nil instead of calling SetControllerReference), so it won't be garbage collected either. The orphaned CRB continues to bind the service account to the broad Kubernetes auth ClusterRole (which grants access to subjectaccessreviews, rolebindings, namespaces, clusterroles, clusterrolebindings per lines 142-178), even though OIDC only needs tokenreviews:create.

The PR correctly adds cleanupOidcRbac() for the OIDC→Kubernetes transition (line 21), but there is no symmetric cleanup of the Kubernetes CRB when transitioning away from Kubernetes auth.
Prompt for agents
In Deploy(), when the code enters the non-Kubernetes-auth branch (lines 25-41), it should also clean up the Kubernetes auth ClusterRoleBinding before proceeding with OIDC or no-auth setup. Add a cleanup step similar to cleanupOidcRbac() but targeting the Kubernetes auth CRB. For example, add a new method cleanupKubernetesClusterRbac() that deletes the ClusterRoleBinding named getFeastClusterRoleBindingName() (i.e. GetFeastName(fs) + "-cluster-binding"), and call it right after the existing namespace-scoped cleanup (after line 27, before the isOidcAuth check). This mirrors the pattern already established by cleanupOidcRbac() which is called in the Kubernetes auth path at line 21.
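A rough sketch of the suggested cleanup step, using a hypothetical narrow deleter interface in place of the real controller-runtime client. All names here are illustrative, not the operator's actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for apierrors.IsNotFound in the real operator.
var errNotFound = errors.New("clusterrolebinding not found")

// crbDeleter is a hypothetical narrow interface standing in for the
// controller-runtime client.
type crbDeleter interface {
	DeleteClusterRoleBinding(name string) error
}

// cleanupKubernetesClusterRbac deletes the Kubernetes-auth CRB and treats
// "not found" as success, mirroring the pattern of cleanupOidcRbac().
func cleanupKubernetesClusterRbac(c crbDeleter, feastName string) error {
	// i.e. getFeastClusterRoleBindingName(): GetFeastName(fs) + "-cluster-binding"
	name := feastName + "-cluster-binding"
	if err := c.DeleteClusterRoleBinding(name); err != nil && !errors.Is(err, errNotFound) {
		return fmt.Errorf("failed to delete ClusterRoleBinding %s: %w", name, err)
	}
	return nil
}

// fakeDeleter records deletions and returns errNotFound on repeats,
// letting us check the cleanup is idempotent.
type fakeDeleter struct{ deleted []string }

func (f *fakeDeleter) DeleteClusterRoleBinding(name string) error {
	for _, d := range f.deleted {
		if d == name {
			return errNotFound
		}
	}
	f.deleted = append(f.deleted, name)
	return nil
}

func main() {
	f := &fakeDeleter{}
	if err := cleanupKubernetesClusterRbac(f, "feast-demo"); err != nil {
		panic(err)
	}
	if err := cleanupKubernetesClusterRbac(f, "feast-demo"); err != nil {
		panic(err) // "not found" on the second pass is not an error
	}
	fmt.Println(f.deleted) // [feast-demo-cluster-binding]
}
```

In the real operator this would presumably call the controller-runtime client's Delete and check apierrors.IsNotFound, as the existing cleanup paths do.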
What this PR does / why we need it:
When `authz: oidc` is configured, the Feast server delegates Kubernetes service account (SA) tokens to a lightweight TokenReview for validation and namespace extraction. This requires the server SA to have `tokenreviews/create` permission. Previously, this RBAC was not provisioned automatically by the operator for OIDC deployments (only for `authz: kubernetes`), requiring manual ClusterRole creation.

Operator: OIDC TokenReview RBAC

The operator now provisions a dedicated `feast-oidc-token-review` ClusterRole and ClusterRoleBinding when `authz: oidc` is configured. The ClusterRole contains exactly one rule: `authentication.k8s.io/tokenreviews/create`. This is the minimum permission needed for the SA token delegation path. No additional RBAC queries (rolebindings, clusterroles, namespaces) are granted, unlike the `authz: kubernetes` path, which needs broader permissions for `KubernetesTokenParser`. Cleanup is handled automatically when switching auth types.
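Rendered as a manifest, the provisioned ClusterRole would look roughly like this (a sketch based on the single rule described above; exact metadata and labels are set by the operator):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: feast-oidc-token-review
rules:
  # The one and only rule: allow creating TokenReview objects
  - apiGroups: ["authentication.k8s.io"]
    resources: ["tokenreviews"]
    verbs: ["create"]
```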
SDK: SSL Error Logging
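As a stdlib-only illustration of the idea (the helper name and shape are hypothetical, not the actual Feast SDK code), detecting an SSL failure amounts to walking the exception chain for an underlying `ssl.SSLError`:

```python
import ssl
from typing import Optional


def find_ssl_error(exc: BaseException) -> Optional[ssl.SSLError]:
    """Walk __cause__/__context__ looking for an underlying SSLError."""
    seen = set()
    cur: Optional[BaseException] = exc
    while cur is not None and id(cur) not in seen:
        seen.add(id(cur))  # guard against cycles in the chain
        if isinstance(cur, ssl.SSLError):
            return cur
        cur = cur.__cause__ or cur.__context__
    return None
```

A parser that finds an `SSLError` this way can log a targeted hint (e.g. about `verify_ssl` / `ca_cert_path`) instead of a generic "Invalid token".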
When `verify_ssl: true` is set but the OIDC provider uses self-signed certificates without a configured `ca_cert_path`, the server fails to reach the JWKS/discovery endpoints. Previously, this produced a generic "Invalid token" log with no indication of the root cause. The token parser now detects SSL errors in the exception chain and logs a clear, actionable message. This applies to both the discovery endpoint (`_validate_token`) and the JWKS endpoint (`_decode_token`) error paths.

Which issue(s) this PR fixes:
Follow up to #6089
Checks
Commits are signed off (`git commit -s`)

Testing Strategy
Misc