Merged
Merge branch 'master' into registry-docs
tokoko authored Aug 21, 2024
commit f1037653003b09be5b2f961746c762a167fd533e
4 changes: 2 additions & 2 deletions Makefile
@@ -86,14 +86,14 @@ test-python-unit:
python -m pytest -n 8 --color=yes sdk/python/tests

test-python-integration:
- python -m pytest -n 8 --integration --color=yes --durations=10 --timeout=1200 --timeout_method=thread \
+ python -m pytest -n 4 --integration --color=yes --durations=10 --timeout=1200 --timeout_method=thread \
-k "(not snowflake or not test_historical_features_main)" \
sdk/python/tests

test-python-integration-local:
FEAST_IS_LOCAL_TEST=True \
FEAST_LOCAL_ONLINE_CONTAINER=True \
- python -m pytest -n 8 --color=yes --integration --durations=5 --dist loadgroup \
+ python -m pytest -n 4 --color=yes --integration --durations=10 --timeout=1200 --timeout_method=thread --dist loadgroup \
-k "not test_lambda_materialization and not test_snowflake_materialization" \
sdk/python/tests

3 changes: 3 additions & 0 deletions docs/SUMMARY.md
@@ -15,13 +15,15 @@
* [Write Patterns](getting-started/architecture/write-patterns.md)
* [Feature Transformation](getting-started/architecture/feature-transformation.md)
* [Feature Serving and Model Inference](getting-started/architecture/model-inference.md)
* [Role-Based Access Control (RBAC)](getting-started/architecture/rbac.md)
* [Concepts](getting-started/concepts/README.md)
* [Overview](getting-started/concepts/overview.md)
* [Data ingestion](getting-started/concepts/data-ingestion.md)
* [Entity](getting-started/concepts/entity.md)
* [Feature view](getting-started/concepts/feature-view.md)
* [Feature retrieval](getting-started/concepts/feature-retrieval.md)
* [Point-in-time joins](getting-started/concepts/point-in-time-joins.md)
* [Permission](getting-started/concepts/permission.md)
* [\[Alpha\] Saved dataset](getting-started/concepts/dataset.md)
* [Components](getting-started/components/README.md)
* [Overview](getting-started/components/overview.md)
@@ -30,6 +32,7 @@
* [Online store](getting-started/components/online-store.md)
* [Batch Materialization Engine](getting-started/components/batch-materialization-engine.md)
* [Provider](getting-started/components/provider.md)
* [Authorization Manager](getting-started/components/authz_manager.md)
* [Third party integrations](getting-started/third-party-integrations.md)
* [FAQ](getting-started/faq.md)

4 changes: 4 additions & 0 deletions docs/getting-started/architecture/README.md
@@ -23,3 +23,7 @@
{% content-ref url="model-inference.md" %}
[model-inference.md](model-inference.md)
{% endcontent-ref %}

{% content-ref url="rbac.md" %}
[rbac.md](rbac.md)
{% endcontent-ref %}
4 changes: 4 additions & 0 deletions docs/getting-started/architecture/overview.md
@@ -17,3 +17,7 @@ typically your Offline Store). We are exploring adding a default streaming engin
write patterns](write-patterns.md) to your application

* We recommend [using Python](language.md) for your Feature Store microservice. As mentioned in the document, precomputing features is the recommended optimal path to ensure low latency performance. Reducing feature serving to a lightweight database lookup is the ideal pattern, which means the marginal overhead of Python should be tolerable. Because of this we believe the pros of Python outweigh the costs, as reimplementing feature logic is undesirable. Java and Go Clients are also available for online feature retrieval.

* [Role-Based Access Control (RBAC)](rbac.md) is a security mechanism that restricts access to resources based on the roles of individual users within an organization. In the context of Feast, RBAC ensures that only authorized users or groups can access or modify specific resources, thereby maintaining data security and operational integrity.


Binary file added docs/getting-started/architecture/rbac.jpg
56 changes: 56 additions & 0 deletions docs/getting-started/architecture/rbac.md
@@ -0,0 +1,56 @@
# Role-Based Access Control (RBAC) in Feast

## Introduction

Role-Based Access Control (RBAC) is a security mechanism that restricts access to resources based on the roles of individual users within an organization. In the context of Feast, RBAC ensures that only authorized users or groups can access or modify specific resources, thereby maintaining data security and operational integrity.

## Functional Requirements

The RBAC implementation in Feast is designed to:

- **Assign Permissions**: Allow administrators to assign permissions for various operations and resources to users or groups based on their roles.
- **Seamless Integration**: Integrate smoothly with existing business code without requiring significant modifications.
- **Backward Compatibility**: Maintain support for non-authorized models as the default to ensure backward compatibility.

## Business Goals

The primary business goals of implementing RBAC in Feast are:

1. **Feature Sharing**: Enable multiple teams to share the feature store while ensuring controlled access. This allows for collaborative work without compromising data security.
2. **Access Control Management**: Prevent unauthorized access to team-specific resources and spaces, governing the operations that each user or group can perform.

## Reference Architecture

Feast operates as a collection of connected services, each enforcing authorization permissions. The architecture is designed as a distributed microservices system with the following key components:

- **Service Endpoints**: These enforce authorization permissions, ensuring that only authorized requests are processed.
- **Client Integration**: Clients authenticate with feature servers by attaching an authorization token to each request.
- **Service-to-Service Communication**: Communication between Feast services is always granted.

![rbac.jpg](rbac.jpg)

## Permission Model

The RBAC system in Feast uses a permission model that defines the following concepts:

- **Resource**: An object within Feast that needs to be secured against unauthorized access.
- **Action**: A logical operation performed on a resource, such as Create, Describe, Update, Delete, Read, or Write.
- **Policy**: A set of rules that enforce authorization decisions on resources. The default implementation uses role-based policies.



## Authorization Architecture

The authorization architecture in Feast is built with the following components:

- **Token Extractor**: Extracts the authorization token from the request header.
- **Token Parser**: Parses the token to retrieve user details.
- **Policy Enforcer**: Validates the secured endpoint against the retrieved user details.
- **Token Injector**: Adds the authorization token to each secured request header.
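
The four components above can be read as a small pipeline. The following is a minimal sketch of that flow, not Feast's implementation: the function names, the `authorization` header key and the toy token format (`name:role1,role2`, standing in for a real JWT) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str
    roles: frozenset


def inject_token(headers: dict, token: str) -> dict:
    """Client side (token injector): attach the token to the request headers."""
    return {**headers, "authorization": f"Bearer {token}"}


def extract_token(headers: dict) -> str:
    """Server side (token extractor): read the bearer token from the headers."""
    scheme, _, token = headers.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        raise PermissionError("missing or malformed authorization header")
    return token


def parse_token(token: str) -> User:
    """Token parser: a toy 'name:role1,role2' format stands in for a real JWT."""
    username, _, roles = token.partition(":")
    return User(username, frozenset(r for r in roles.split(",") if r))


def enforce(user: User, required_roles: set) -> None:
    """Policy enforcer: reject the call unless the user holds a required role."""
    if not user.roles & required_roles:
        raise PermissionError(f"{user.username} lacks any of {sorted(required_roles)}")


request_headers = inject_token({}, "alice:reader")
current_user = parse_token(extract_token(request_headers))
enforce(current_user, {"reader", "admin"})  # passes without raising
```

Keeping the extractor, parser and enforcer as separate steps is what would let a different token source (OIDC or Kubernetes) replace the parser without touching the enforcement logic.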







4 changes: 4 additions & 0 deletions docs/getting-started/components/README.md
@@ -19,3 +19,7 @@
{% content-ref url="provider.md" %}
[provider.md](provider.md)
{% endcontent-ref %}

{% content-ref url="authz_manager.md" %}
[authz_manager.md](authz_manager.md)
{% endcontent-ref %}
102 changes: 102 additions & 0 deletions docs/getting-started/components/authz_manager.md
@@ -0,0 +1,102 @@
# Authorization Manager
An Authorization Manager is an instance of the `AuthManager` class that is plugged into one of the Feast servers to extract user details from the current request and inject them into the [permissions](../../getting-started/concepts/permission.md) framework.

{% hint style="info" %}
**Note**: Feast does not provide authentication capabilities; it is the client's responsibility to manage the authentication token and pass it to
the Feast server, which then validates the token and extracts user details from the configured authentication server.
{% endhint %}

Two authorization managers are supported out-of-the-box:
* One using a configurable OIDC server to extract the user details.
* One using the Kubernetes RBAC resources to extract the user details.

These instances are created when the Feast servers are initialized, according to the authorization configuration defined in
their own `feature_store.yaml`.

Feast servers and clients must have consistent authorization configuration, so that the client proxies can automatically inject
the authorization tokens that the server can properly identify and use to enforce permission validations.


## Design notes
The server-side implementation of the authorization functionality is defined [here](./../../../sdk/python/feast/permissions/server).
A few of the key models and classes needed to understand the client-side authorization implementation can be found [here](./../../../sdk/python/feast/permissions/client).

## Configuring Authorization
The authorization is configured using a dedicated `auth` section in the `feature_store.yaml` configuration.

**Note**: As a consequence, when deploying the Feast servers with the Helm [charts](../../../infra/charts/feast-feature-server/README.md),
the `feature_store_yaml_base64` value must include the `auth` section to specify the authorization configuration.

### No Authorization
This configuration applies the default `no_auth` authorization:
```yaml
project: my-project
auth:
  type: no_auth
...
```

### OIDC Authorization
With OIDC authorization, the Feast client proxies retrieve the JWT token from an OIDC server (or [Identity Provider](https://openid.net/developers/how-connect-works/))
and attach it to every request to a Feast server, using an [Authorization Bearer Token](https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#bearer).

The server, in turn, uses the same OIDC server to validate the token and extract the user roles from the token itself.
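
To make the token flow concrete, here is a minimal sketch of reading the claims carried by a bearer JWT using only the standard library. It is not how the Feast server does it: the function names and the demo claims are hypothetical, and the sketch deliberately skips signature validation, which a real server must perform against the OIDC server before trusting any claim.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(data: dict) -> str:
    """Helper used only to build a demo token for this example."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def unverified_claims(authorization_header: str) -> dict:
    """Return the JWT payload from a 'Bearer <jwt>' header value.

    No signature check is done here; this only shows where the claims live.
    """
    scheme, _, token = authorization_header.partition(" ")
    if scheme != "Bearer":
        raise ValueError("expected a Bearer token")
    # A JWT is three base64url segments: header, payload, signature.
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Build an unsigned demo token (hypothetical claims, placeholder signature).
demo_token = ".".join(
    [
        b64url_encode({"alg": "none"}),
        b64url_encode({"preferred_username": "alice"}),
        "sig",
    ]
)
claims = unverified_claims(f"Bearer {demo_token}")
```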

Some assumptions are made in the OIDC server configuration:
* The OIDC token refers to a client with roles matching the RBAC roles of the configured `Permission`s (*)
* The roles are exposed in the access token passed to the server

(*) Please note that **the role match is case-sensitive**, e.g. the name of the role in the OIDC server and in the `Permission` configuration
must be exactly the same.

For example, the access token for a client `app` of a user with `reader` role should have the following `resource_access` section:
```json
{
  "resource_access": {
    "app": {
      "roles": [
        "reader"
      ]
    }
  }
}
```
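
Given a decoded token payload shaped like the example above, the role lookup can be sketched as follows. The helper names are hypothetical and this is an illustration of the `resource_access` layout, not the code Feast runs.

```python
def roles_for_client(claims: dict, client_id: str) -> list:
    """Return the roles listed under resource_access.<client_id>.roles."""
    return claims.get("resource_access", {}).get(client_id, {}).get("roles", [])


def has_role(claims: dict, client_id: str, required_role: str) -> bool:
    """Exact, case-sensitive comparison, as the role-match note requires."""
    return required_role in roles_for_client(claims, client_id)


claims = {"resource_access": {"app": {"roles": ["reader"]}}}
```

With this payload, `has_role(claims, "app", "reader")` is true, while `has_role(claims, "app", "Reader")` is false because the match is case-sensitive.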

An example of OIDC authorization configuration is the following:
```yaml
project: my-project
auth:
  type: oidc
  client_id: _CLIENT_ID__
  client_secret: _CLIENT_SECRET__
  realm: _REALM__
  auth_server_url: _OIDC_SERVER_URL_
  auth_discovery_url: _OIDC_SERVER_URL_/realms/master/.well-known/openid-configuration
...
```

For the client configuration, the following settings must also be added to specify the current user:
```yaml
auth:
  ...
  username: _USERNAME_
  password: _PASSWORD_
```

### Kubernetes RBAC Authorization
With Kubernetes RBAC Authorization, the client uses the service account token as the authorization bearer token, and the
server fetches the associated roles from the Kubernetes RBAC resources.

An example of Kubernetes RBAC authorization configuration is the following:
{% hint style="info" %}
**NOTE**: This configuration will only work if you deploy Feast on OpenShift or another Kubernetes platform.
{% endhint %}
```yaml
project: my-project
auth:
  type: kubernetes
...
```

In case the client cannot run on the same cluster as the servers, the client token can be injected using the `LOCAL_K8S_TOKEN`
environment variable on the client side. The value must refer to the token of a service account created on the servers cluster
and linked to the desired RBAC roles.
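
A minimal sketch of how a client could pick its bearer token is shown below. The function name and the lookup order (environment variable first, then the token file mounted into the pod) are assumptions made for this example; only the `LOCAL_K8S_TOKEN` variable comes from the text above, and the file path is the standard Kubernetes service account mount point.

```python
import os
from pathlib import Path

# Standard mount point for a pod's service account token in Kubernetes.
IN_CLUSTER_TOKEN_PATH = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


def resolve_client_token(env=os.environ, token_path=IN_CLUSTER_TOKEN_PATH) -> str:
    """Pick the bearer token: LOCAL_K8S_TOKEN first, then the mounted file.

    The lookup order is an assumption made for this sketch.
    """
    token = env.get("LOCAL_K8S_TOKEN", "").strip()
    if token:
        return token
    if token_path.is_file():
        return token_path.read_text().strip()
    raise RuntimeError("no service account token available")


# Outside the cluster: the token is supplied through the environment.
token = resolve_client_token({"LOCAL_K8S_TOKEN": "demo-token"})
headers = {"Authorization": f"Bearer {token}"}
```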
1 change: 1 addition & 0 deletions docs/getting-started/components/overview.md
@@ -28,3 +28,4 @@ A complete Feast deployment contains the following components:
* **Batch Materialization Engine:** The [Batch Materialization Engine](batch-materialization-engine.md) component launches a process which loads data into the online store from the offline store. By default, Feast uses a local in-process engine implementation to materialize data. However, additional infrastructure can be used for a more scalable materialization process.
* **Online Store:** The online store is a database that stores only the latest feature values for each entity. The online store is either populated through materialization jobs or through [stream ingestion](../../reference/data-sources/push.md).
* **Offline Store:** The offline store persists batch data that has been ingested into Feast. This data is used for producing training datasets. For feature retrieval and materialization, Feast does not manage the offline store directly, but runs queries against it. However, offline stores can be configured to support writes if Feast configures logging functionality of served features.
* **Authorization Manager**: The authorization manager detects authentication tokens from client requests to Feast servers and uses this information to enforce permission policies on the requested services.
4 changes: 4 additions & 0 deletions docs/getting-started/concepts/README.md
@@ -27,3 +27,7 @@
{% content-ref url="dataset.md" %}
[dataset.md](dataset.md)
{% endcontent-ref %}

{% content-ref url="permission.md" %}
[permission.md](permission.md)
{% endcontent-ref %}
112 changes: 112 additions & 0 deletions docs/getting-started/concepts/permission.md
@@ -0,0 +1,112 @@
# Permission

## Overview

The Feast permissions model allows you to configure granular permission policies for all the resources defined in a feature store.

The configured permissions are stored in the Feast registry and accessible through the CLI and the registry APIs.

The permission authorization enforcement is performed when requests are executed through one of the Feast (Python) servers:
- The online feature server (REST)
- The offline feature server (Arrow Flight)
- The registry server (gRPC)

Note that there is no permission enforcement when accessing the Feast API with a local provider.

## Concepts

The permission model is based on the following components:
- A `resource` is a Feast object that we want to secure against unauthorized access.
  - We assume that the resource has a `name` attribute and an optional dictionary of associated key-value `tags`.
- An `action` is a logical operation executed on the secured resource, like:
  - `create`: Create an instance.
  - `describe`: Access the instance state.
  - `update`: Update the instance state.
  - `delete`: Delete an instance.
  - `read`: Read both online and offline stores.
  - `read_online`: Read the online store.
  - `read_offline`: Read the offline store.
  - `write`: Write to any store.
  - `write_online`: Write to the online store.
  - `write_offline`: Write to the offline store.
- A `policy` identifies the rule for enforcing authorization decisions on secured resources, based on the current user.
  - A default implementation is provided for role-based policies, using the user roles to grant or deny access to the requested actions on the secured resources.

The `Permission` class represents a single permission configured on the feature store and is defined by these attributes:
- `name`: The permission name.
- `types`: The list of protected resource types. Defaults to all managed types, i.e. the `ALL_RESOURCE_TYPES` alias. All sub-classes are included in the resource match.
- `name_pattern`: A regex to match the resource name. Defaults to `None`, meaning that no name filtering is applied.
- `required_tags`: Dictionary of key-value pairs that must match the resource tags. Defaults to `None`, meaning that no tags filtering is applied.
- `actions`: The actions authorized by this permission. Defaults to `ALL_ACTIONS`, an alias defined in the `action` module.
- `policy`: The policy to be applied to validate a client request.
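
The way `types`, `name_pattern` and `required_tags` narrow down the protected resources can be sketched with stand-in classes. This is an illustration under stated assumptions, not Feast's matching code: the `Resource`, `FeatureView` and `DataSource` classes here are local stand-ins, the `matches` helper is hypothetical, and the choice to match the pattern against the whole name is an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resource:
    """Stand-in for a Feast object: a name plus optional tags."""
    name: str
    tags: dict = field(default_factory=dict)


class FeatureView(Resource):
    pass


class DataSource(Resource):
    pass


def matches(resource, types, name_pattern: Optional[str] = None,
            required_tags: Optional[dict] = None) -> bool:
    """Apply the three filters; a filter left as None is skipped."""
    if not isinstance(resource, tuple(types)):  # sub-classes match too
        return False
    # Assumption: the pattern must match the whole resource name.
    if name_pattern is not None and not re.fullmatch(name_pattern, resource.name):
        return False
    if required_tags is not None:
        return all(resource.tags.get(k) == v for k, v in required_tags.items())
    return True


risky_view = FeatureView("risky_transactions", {"risk_level": "high"})
```

For instance, `matches(risky_view, [FeatureView], name_pattern=".*risky.*")` holds, while `matches(risky_view, [DataSource])` does not because the type filter fails first.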

To simplify the permissions setup, several constants are defined:
- In module `feast.feast_object`:
  - `ALL_RESOURCE_TYPES` is the list of all the `FeastObject` types.
  - `ALL_FEATURE_VIEW_TYPES` is the list of all the feature view types, including those that do not inherit from the `FeatureView` type, like `OnDemandFeatureView`.
- In module `feast.permissions.action`:
  - `ALL_ACTIONS` is the list of all managed actions.
  - `READ` includes all the read actions for online and offline store.
  - `WRITE` includes all the write actions for online and offline store.
  - `CRUD` includes all the state management actions to create, describe, update or delete a Feast resource.
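
The relationship between single actions and the alias constants can be sketched with a stand-in enum. The `Action` enum below is hypothetical and only mirrors the action names listed on this page; modelling the aggregate `read` and `write` entries as list aliases rather than enum members is an assumption of this sketch.

```python
from enum import Enum


class Action(Enum):
    """Stand-in enum mirroring the action names listed on this page."""
    CREATE = "create"
    DESCRIBE = "describe"
    UPDATE = "update"
    DELETE = "delete"
    READ_ONLINE = "read_online"
    READ_OFFLINE = "read_offline"
    WRITE_ONLINE = "write_online"
    WRITE_OFFLINE = "write_offline"


# Aliases are plain lists of enum members.
ALL_ACTIONS = list(Action)
READ = [Action.READ_ONLINE, Action.READ_OFFLINE]
WRITE = [Action.WRITE_ONLINE, Action.WRITE_OFFLINE]
CRUD = [Action.CREATE, Action.DESCRIBE, Action.UPDATE, Action.DELETE]

# Because an alias is a list, unpack it when combining it with single actions.
describe_and_read = [Action.DESCRIBE, *READ]
```

Writing `[Action.DESCRIBE, READ]` instead would produce a list nested inside a list, which is why the unpacking operator is used.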

Given the above definitions, the feature store can be configured with granular control over each resource, enabling partitioned access by
teams to meet organizational requirements for service and data sharing, and protection of sensitive information.

The `feast` CLI includes a new `permissions` command to list the registered permissions, with options to identify the matching resources for each configured permission and the existing resources that are not covered by any permission.

{% hint style="info" %}
**Note**: Feast resources that do not match any of the configured permissions are not secured by any authorization policy, meaning any user can execute any action on such resources.
{% endhint %}
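
To see which resources fall outside every permission, the coverage check can be sketched as below. This is a toy model, not the `feast permissions` command: each permission is reduced to its `name_pattern` only (real matching also considers types, tags and actions), and the function name and resource names are hypothetical.

```python
import re


def unsecured_resources(resource_names, permission_patterns):
    """Return the names that no permission pattern covers."""
    return [
        name
        for name in resource_names
        if not any(re.fullmatch(p, name) for p in permission_patterns)
    ]


names = ["driver_stats", "risky_payments", "customer_profile"]
open_to_everyone = unsecured_resources(names, [".*risky.*", "driver_.*"])
```

Here only `customer_profile` is left uncovered, so under the rule in the note any user could execute any action on it.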

## Definition examples
This permission definition grants access to the resource state and the ability to read all of the stores for any feature view or
feature service to all users with the role `super-reader`:
```py
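from feast import FeatureService, FeatureView
from feast.permissions.action import READ, AuthzedAction
from feast.permissions.permission import Permission
from feast.permissions.policy import RoleBasedPolicy

Permission(
    name="feature-reader",
    types=[FeatureView, FeatureService],
    policy=RoleBasedPolicy(roles=["super-reader"]),
    # READ is a list of actions, so unpack it instead of nesting it.
    actions=[AuthzedAction.DESCRIBE, *READ],
)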
```

This example grants permission to write on all the data sources with `risk_level` tag set to `high` only to users with role `admin` or `data_team`:
```py
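from feast.data_source import DataSource
from feast.permissions.action import AuthzedAction
from feast.permissions.permission import Permission
from feast.permissions.policy import RoleBasedPolicy

Permission(
    name="ds-writer",
    types=[DataSource],
    required_tags={"risk_level": "high"},
    policy=RoleBasedPolicy(roles=["admin", "data_team"]),
    actions=[AuthzedAction.WRITE],
)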
```

{% hint style="info" %}
**Note**: When using multiple roles in a role-based policy, the user must be granted at least one of the specified roles.
{% endhint %}
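
The "at least one role" rule amounts to a set-intersection test. A minimal sketch, with a hypothetical helper name and not taken from the `RoleBasedPolicy` source:

```python
def user_satisfies_policy(user_roles, policy_roles) -> bool:
    """True when the user holds at least one of the policy's roles."""
    return not set(policy_roles).isdisjoint(user_roles)


policy_roles = ["admin", "data_team"]
```

A user with roles `["data_team", "viewer"]` passes this check, while a user with only `["viewer"]` does not.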


The following permission grants authorization to read the offline store of all the feature views including `risky` in the name, to users with role `trusted`:

```py
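from feast import FeatureView
from feast.permissions.action import AuthzedAction
from feast.permissions.permission import Permission
from feast.permissions.policy import RoleBasedPolicy

Permission(
    name="reader",
    types=[FeatureView],
    name_pattern=".*risky.*",
    policy=RoleBasedPolicy(roles=["trusted"]),
    actions=[AuthzedAction.READ_OFFLINE],
)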
```

## Authorization configuration
In order to leverage the permission functionality, the `auth` section is needed in the `feature_store.yaml` configuration.
Currently, Feast supports OIDC and Kubernetes RBAC authorization protocols.

The default configuration, if you don't specify the `auth` configuration section, is `no_auth`, indicating that no permission
enforcement is applied.

The `auth` section includes a `type` field specifying the actual authorization protocol, and protocol-specific fields that
are specified in [Authorization Manager](../components/authz_manager.md).
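
The defaulting behaviour described above can be sketched on a configuration that has already been loaded into a dictionary. The helper below is hypothetical and is not Feast's configuration loader; the three type names are the ones used on this page and in the Authorization Manager page.

```python
SUPPORTED_AUTH_TYPES = {"no_auth", "oidc", "kubernetes"}


def auth_type(config: dict) -> str:
    """Return the configured auth type, falling back to 'no_auth'."""
    kind = (config.get("auth") or {}).get("type", "no_auth")
    if kind not in SUPPORTED_AUTH_TYPES:
        raise ValueError(f"unsupported auth type: {kind!r}")
    return kind


# A configuration without an `auth` section means no permission enforcement.
default_kind = auth_type({"project": "my-project"})
oidc_kind = auth_type({"project": "my-project", "auth": {"type": "oidc"}})
```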
You are viewing a condensed version of this merge commit. You can view the full changes here.