docs/how-to-guides/adding-a-new-offline-store.md
Lines changed: 89 additions & 9 deletions
@@ -2,7 +2,7 @@

 ## Overview

-Feast makes adding support for a new offline store (database) easy. Developers can simply implement the [OfflineStore](../../sdk/python/feast/infra/offline_stores/offline_store.py#L41) interface to add support for a new store (other than the existing stores like Parquet files, Redshift, and BigQuery).
+Feast makes adding support for a new offline store (database) easy. Developers can simply implement the [OfflineStore](../../sdk/python/feast/infra/offline\_stores/offline\_store.py#L41) interface to add support for a new store (other than the existing stores like Parquet files, Redshift, and BigQuery). 

 In this guide, we will show you how to extend the existing File offline store and use it in a feature repo. While we will be implementing a specific store, this guide should be representative of adding support for any new offline store.

@@ -13,15 +13,16 @@ The process for using a custom offline store consists of 4 steps:
 1. Defining an `OfflineStore` class.
 2. Defining an `OfflineStoreConfig` class.
 3. Defining a `RetrievalJob` class for this offline store.
-4. Referencing the `OfflineStore` in a feature repo's `feature_store.yaml` file.
+4. Defining a `DataSource` class for the offline store.
+5. Referencing the `OfflineStore` in a feature repo's `feature_store.yaml` file.

 ## 1. Defining an OfflineStore class

 {% hint style="info" %}
-OfflineStore class names must end with the OfflineStore suffix!
+ OfflineStore class names must end with the OfflineStore suffix!
 {% endhint %}

-The OfflineStore class contains a couple of methods to read features from the offline store. Unlike with the OnlineStore class, Feast does not manage any infrastructure for the offline store.
+The OfflineStore class contains a couple of methods to read features from the offline store. Unlike with the OnlineStore class, Feast does not manage any infrastructure for the offline store. 

 There are two methods that deal with reading data from the offline store: `get_historical_features` and `pull_latest_from_table_or_query`.

@@ -72,11 +73,11 @@ There are two methods that deal with reading data from the offline stores`get_hi

 Additional configuration may be needed to allow the OfflineStore to talk to the backing store. For example, Redshift needs configuration such as the connection information for the Redshift instance and the credentials for connecting to the database.

-To facilitate configuration, all OfflineStore implementations are **required** to also define a corresponding OfflineStoreConfig class in the same file. This OfflineStoreConfig class should inherit from the `FeastConfigBaseModel` class, which is defined [here](../../sdk/python/feast/repo_config.py#L44).
+To facilitate configuration, all OfflineStore implementations are **required** to also define a corresponding OfflineStoreConfig class in the same file. This OfflineStoreConfig class should inherit from the `FeastConfigBaseModel` class, which is defined [here](../../sdk/python/feast/repo\_config.py#L44). 

 The `FeastConfigBaseModel` is a [pydantic](https://pydantic-docs.helpmanual.io) class, which parses YAML configuration into Python objects. Pydantic also allows the config classes to define validators, which ensure that the configuration is correctly defined.

-This config class **must** contain a `type` field, which contains the fully qualified class name of its corresponding OfflineStore class.
+This config class **must** contain a `type` field, which contains the fully qualified class name of its corresponding OfflineStore class. 

 Additionally, the name of the config class must be the name of the OfflineStore class with the `Config` suffix appended.

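The naming and `type` conventions described above can be illustrated with a small sketch. It uses a plain dataclass as a stand-in for `FeastConfigBaseModel` so that it runs without Feast installed. The module path `feast_custom_offline_store.file` and the checks in `__post_init__` are assumptions made for illustration; they are not Feast's own validation logic.

```python
from dataclasses import dataclass


class CustomFileOfflineStore:
    """Stand-in for an OfflineStore implementation (not the real Feast interface)."""


@dataclass
class CustomFileOfflineStoreConfig:
    """Stand-in config; in Feast this class would inherit from FeastConfigBaseModel."""

    # Fully qualified class name of the matching OfflineStore.
    # The module path "feast_custom_offline_store.file" is hypothetical.
    type: str = "feast_custom_offline_store.file.CustomFileOfflineStore"

    def __post_init__(self) -> None:
        # Illustrative checks that mimic a validator; Feast may enforce these differently.
        store_class_name = self.type.rsplit(".", 1)[-1]
        if not store_class_name.endswith("OfflineStore"):
            raise ValueError("OfflineStore class names must end with 'OfflineStore'")
        if type(self).__name__ != store_class_name + "Config":
            raise ValueError("config class must be named <OfflineStore class name>Config")


config = CustomFileOfflineStoreConfig()
```

Keeping the store class and its config class in the same file, as required above, makes the name pairing easy to verify at a glance.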
@@ -102,7 +103,7 @@ This configuration can be specified in the `feature_store.yaml` as follows:
 ```
 {% endcode %}

-This configuration information is available to the methods of the OfflineStore via the `config: RepoConfig` parameter, which is passed into the methods of the OfflineStore interface. Specifically, it is found in the `config.offline_store` field.
+This configuration information is available to the methods of the OfflineStore via the `config: RepoConfig` parameter, which is passed into the methods of the OfflineStore interface. Specifically, it is found in the `config.offline_store` field. 
@@ -153,9 +154,69 @@ class CustomFileRetrievalJob(RetrievalJob):
 ```
 {% endcode %}

-## 4. Using the custom offline store
+## 4. Defining a DataSource class for the offline store

-After implementing these classes, the custom offline store can be used by referencing it in a feature repo's `feature_store.yaml` file, specifically in the `offline_store` field. The value specified should be the fully qualified class name of the OfflineStore.
+Before this offline store can be used as the batch source for a feature view in a feature repo, a subclass of the `DataSource` [base class](https://rtd.feast.dev/en/master/index.html?highlight=DataSource#feast.data\_source.DataSource) needs to be defined. This class is responsible for holding information needed by specific feature views to support reading historical values from the offline store. For example, a feature view using Redshift as the offline store may need to know which table contains historical feature values.
+
+The data source class should implement two methods: `from_proto` and `to_proto`. 
+
+For custom offline stores that are not being implemented in the main feature repo, the `custom_options` field should be used to store any configuration needed by the data source. In this case, the implementer is responsible for serializing this configuration into bytes in the `to_proto` method and reading the value back from bytes in the `from_proto` method.
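The round trip through `custom_options` described above might look like the following sketch. It does not use Feast's real classes: `FakeDataSourceProto` is a hypothetical stand-in for the protobuf message that models only a bytes field, `CustomFileDataSource` does not subclass the real `DataSource`, and JSON is just one possible serialization format. The `path` and `event_timestamp_column` options are also assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeDataSourceProto:
    """Hypothetical stand-in for the protobuf message; only the bytes field is modeled."""

    custom_options: bytes = b""


class CustomFileDataSource:
    """Sketch only; a real implementation would subclass Feast's DataSource."""

    def __init__(self, path: str, event_timestamp_column: str = ""):
        self.path = path
        self.event_timestamp_column = event_timestamp_column

    def to_proto(self) -> FakeDataSourceProto:
        # Serialize the store-specific configuration into bytes.
        options = {
            "path": self.path,
            "event_timestamp_column": self.event_timestamp_column,
        }
        return FakeDataSourceProto(custom_options=json.dumps(options).encode("utf-8"))

    @staticmethod
    def from_proto(proto: FakeDataSourceProto) -> "CustomFileDataSource":
        # Read the configuration back from bytes.
        options = json.loads(proto.custom_options.decode("utf-8"))
        return CustomFileDataSource(**options)


source = CustomFileDataSource(
    path="data/driver_stats.parquet",
    event_timestamp_column="event_timestamp",
)
restored = CustomFileDataSource.from_proto(source.to_proto())
```

Whatever format is chosen, `to_proto` and `from_proto` need to agree on it, since Feast treats the bytes as opaque.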
+After implementing these classes, the custom offline store can be used by referencing it in a feature repo's `feature_store.yaml` file, specifically in the `offline_store` field. The value specified should be the fully qualified class name of the OfflineStore. 

 As long as your OfflineStore class is available in your Python environment, it will be imported by Feast dynamically at runtime.
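To make the idea of a dynamic import from a fully qualified class name concrete, here is a minimal sketch using only the standard library. The helper `load_class` is hypothetical and is not Feast's actual loading code; it only shows the general technique of splitting the name into a module path and a class name and resolving both at runtime.

```python
import importlib


def load_class(fully_qualified_name: str) -> type:
    """Resolve 'package.module.ClassName' to the class object (illustrative helper)."""
    module_name, class_name = fully_qualified_name.rsplit(".", 1)
    # Raises ModuleNotFoundError if the module is not importable in this environment.
    module = importlib.import_module(module_name)
    # Raises AttributeError if the module does not define the class.
    return getattr(module, class_name)


# A standard-library class is used here so the sketch runs anywhere.
ordered_dict_class = load_class("collections.OrderedDict")
```

This is why the class only needs to be importable in the Python environment: a lookup like this fails with an import error when the package that contains the custom store is not installed.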