
Commit 85ee789

feat: Refactor feature server helm charts to allow passing feature_store.yaml in environment variables (#3113)
* feat: Refactor feature server helm charts to allow passing feature_store.yaml in environment variables
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* lint
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* lint
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* add docs
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* lint
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* revert bad helm docs
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* add to bump files
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* add to bump files
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* add to release
  Signed-off-by: Danny Chiao <danny@tecton.ai>
* fix readme for feast-feature-server
  Signed-off-by: Danny Chiao <danny@tecton.ai>

Signed-off-by: Danny Chiao <danny@tecton.ai>
1 parent fad45ca commit 85ee789

24 files changed, +393 -100 lines changed


.github/workflows/publish.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,7 @@ jobs:
4949
needs: get-version
5050
strategy:
5151
matrix:
52-
component: [feature-server-python-aws, feature-server-java, feature-transformation-server]
52+
component: [feature-server-python, feature-server-python-aws, feature-server-java, feature-transformation-server]
5353
env:
5454
MAVEN_CACHE: gs://feast-templocation-kf-feast/.m2.2020-08-19.tar
5555
REGISTRY: feastdev

CONTRIBUTING.md

Lines changed: 32 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,9 @@
3333
- [(Contrib) Running tests for HBase online store](#contrib-running-tests-for-hbase-online-store)
3434
- [(Experimental) Feast UI](#experimental-feast-ui)
3535
- [Feast Java Serving](#feast-java-serving)
36+
- [Developing the Feast Helm charts](#developing-the-feast-helm-charts)
37+
- [Feast Java Feature Server Helm Chart](#feast-java-feature-server-helm-chart)
38+
- [Feast Python / Go Feature Server Helm Chart](#feast-python--go-feature-server-helm-chart)
3639
- [Feast Go Client](#feast-go-client)
3740
- [Environment Setup](#environment-setup-1)
3841
- [Building](#building)
@@ -197,7 +200,6 @@ To test across clouds, on top of setting up Redis, you also need GCP / AWS / Sno
197200
> and commenting out tests that are added to `DEFAULT_FULL_REPO_CONFIGS`
198201
199202
**GCP**
200-
### Setup your GCP BigQuery Instance
201203
1. You can get free credits [here](https://cloud.google.com/free/docs/free-cloud-features#free-trial).
202204
2. You will need to setup a service account, enable the BigQuery API, and create a staging location for a bucket.
203205
* Setup your service account and project using steps 1-5 [here](https://codelabs.developers.google.com/codelabs/cloud-bigquery-python#0).
@@ -347,6 +349,35 @@ See [Feast contributing guide](ui/CONTRIBUTING.md)
347349
## Feast Java Serving
348350
See [Java contributing guide](java/CONTRIBUTING.md)
349351

352+
See also the helm chart development instructions below, under [Developing the Feast Helm charts](#developing-the-feast-helm-charts)
353+
354+
## Developing the Feast Helm charts
355+
There are 3 helm charts:
356+
- Feast Java feature server
357+
- Feast Python / Go feature server
358+
- (deprecated) Feast Python feature server
359+
360+
Generally, you can override the images in the helm charts with locally built Docker images, and install the local helm
361+
chart.
362+
363+
All READMEs for helm charts are generated using [helm-docs](https://github.com/norwoodj/helm-docs). You can install it
364+
(e.g. with `brew install norwoodj/tap/helm-docs`) and then run `make build-helm-docs`.
365+
366+
### Feast Java Feature Server Helm Chart
367+
See the [Java demo example](examples/java-demo/README.md), which also includes development instructions using minikube.
368+
369+
It will:
370+
- run `make build-java-docker-dev` to build local Java feature server binaries
371+
- configure the included `application-override.yaml` to override the image tag to use the locally built dev images.
372+
- install the local chart with `helm install feast-release ../../../infra/charts/feast --values application-override.yaml`
373+
374+
### Feast Python / Go Feature Server Helm Chart
375+
See the [Python demo example](examples/python-helm-demo/README.md), which also includes development instructions using minikube.
376+
377+
It will:
378+
- run `make build-feature-server-dev` to build a local Python feature server image
379+
- install the local chart with `helm install feast-release ../../../infra/charts/feast-feature-server --set image.tag=dev --set feature_store_yaml_base64=$(base64 feature_store.yaml)`
380+
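The `helm install` command above passes the whole `feature_store.yaml` to the chart as one base64 string. As a rough illustration of that encoding step, here is a minimal Python sketch. The helper names and the sample YAML are hypothetical and not part of the Feast codebase; the sketch only shows the round trip. One reason to consider it: GNU `base64` wraps its output at 76 columns by default, so a single-line encoder can be more convenient when the value is passed through `--set`.

```python
import base64


def encode_feature_store_yaml(yaml_text: str) -> str:
    """Return the YAML text as a single-line base64 string (no wrapping)."""
    return base64.b64encode(yaml_text.encode("utf-8")).decode("ascii")


def decode_feature_store_yaml(encoded: str) -> str:
    """Reverse the encoding, as a consumer of the value would need to."""
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical sample config, used only to demonstrate the round trip.
SAMPLE_YAML = """\
project: feast_python_demo
provider: gcp
online_store:
  type: redis
  connection_string: my-redis-master:6379
"""

encoded = encode_feature_store_yaml(SAMPLE_YAML)
print(encoded)
```

The printed value could then be supplied as `--set feature_store_yaml_base64=<value>`.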
350381
## Feast Go Client
351382
### Environment Setup
352383
Setting up your development environment for Feast Go SDK:

Makefile

Lines changed: 25 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -353,7 +353,7 @@ lint-go: compile-protos-go compile-go-lib
353353

354354
# Docker
355355

356-
build-docker: build-ci-docker build-feature-server-python-aws-docker build-feature-transformation-server-docker build-feature-server-java-docker
356+
build-docker: build-ci-docker build-feature-server-python-docker build-feature-server-python-aws-docker build-feature-transformation-server-docker build-feature-server-java-docker
357357

358358
push-ci-docker:
359359
docker push $(REGISTRY)/feast-ci:$(VERSION)
@@ -362,13 +362,21 @@ push-ci-docker:
362362
build-ci-docker:
363363
docker buildx build -t $(REGISTRY)/feast-ci:$(VERSION) -f infra/docker/ci/Dockerfile --load .
364364

365+
push-feature-server-python-docker:
366+
docker push $(REGISTRY)/feature-server:$$VERSION
367+
368+
build-feature-server-python-docker:
369+
docker buildx build --build-arg VERSION=$$VERSION \
370+
-t $(REGISTRY)/feature-server:$$VERSION \
371+
-f sdk/python/feast/infra/feature_servers/multicloud/Dockerfile --load .
372+
365373
push-feature-server-python-aws-docker:
366-
docker push $(REGISTRY)/feature-server-python-aws:$$VERSION
374+
docker push $(REGISTRY)/feature-server-python-aws:$$VERSION
367375

368376
build-feature-server-python-aws-docker:
369-
docker buildx build --build-arg VERSION=$$VERSION \
370-
-t $(REGISTRY)/feature-server-python-aws:$$VERSION \
371-
-f sdk/python/feast/infra/feature_servers/aws_lambda/Dockerfile --load .
377+
docker buildx build --build-arg VERSION=$$VERSION \
378+
-t $(REGISTRY)/feature-server-python-aws:$$VERSION \
379+
-f sdk/python/feast/infra/feature_servers/aws_lambda/Dockerfile --load .
372380

373381
push-feature-transformation-server-docker:
374382
docker push $(REGISTRY)/feature-transformation-server:$(VERSION)
@@ -386,6 +394,13 @@ build-feature-server-java-docker:
386394
-t $(REGISTRY)/feature-server-java:$(VERSION) \
387395
-f java/infra/docker/feature-server/Dockerfile --load .
388396

397+
# Dev images
398+
399+
build-feature-server-dev:
400+
docker buildx build --build-arg VERSION=dev \
401+
-t feastdev/feature-server:dev \
402+
-f sdk/python/feast/infra/feature_servers/multicloud/Dockerfile.dev --load .
403+
389404
build-java-docker-dev:
390405
make build-java-no-tests REVISION=dev
391406
docker buildx build --build-arg VERSION=dev \
@@ -425,6 +440,11 @@ build-sphinx: compile-protos-python
425440
build-templates:
426441
python infra/scripts/compile-templates.py
427442

443+
build-helm-docs:
444+
cd ${ROOT_DIR}/infra/charts/feast; helm-docs
445+
cd ${ROOT_DIR}/infra/charts/feast-feature-server; helm-docs
446+
cd ${ROOT_DIR}/infra/charts/feast-python-server; helm-docs
447+
428448
# Web UI
429449

430450
# Note: requires node and yarn to be installed

docs/getting-started/concepts/feature-retrieval.md

Lines changed: 40 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,9 @@ Generally, Feast supports several patterns of feature retrieval:
66

77
1. Training data generation (via `feature_store.get_historical_features(...)`)
88
2. Offline feature retrieval for batch scoring (via `feature_store.get_historical_features(...)`)
9-
3. Online feature retrieval for real-time model predictions (via `feature_store.get_online_features(...)`)
9+
3. Online feature retrieval for real-time model predictions
10+
- via the SDK: `feature_store.get_online_features(...)`
11+
- via deployed feature server endpoints: `requests.post('http://localhost:6566/get-online-features', data=json.dumps(online_request))`
1012

1113
Each of these retrieval mechanisms accepts:
1214

@@ -100,7 +102,6 @@ batch_scoring_features = store.get_historical_features(
100102

101103
```python
102104
from feast import FeatureStore
103-
import pandas as pd
104105

105106
store = FeatureStore(repo_path=".")
106107

@@ -124,13 +125,23 @@ batch_scoring_features = store.get_historical_features(
124125

125126
<details>
126127

127-
<summary>How to: retrieve online features for real-time model inference</summary>
128+
<summary>How to: retrieve online features for real-time model inference (Python SDK)</summary>
128129

129130
Feast will ensure the latest feature values for registered features are available. At retrieval time, you need to supply a list of **entities** and the corresponding **features** to be retrieved. Similar to `get_historical_features`, we recommend using feature services as a mechanism for grouping features in a model version.
130131

131132
_Note: unlike `get_historical_features`, the `entity_rows` **do not need timestamps** since you only want one feature value per entity key._
132133

133134
```python
135+
from feast import RepoConfig, FeatureStore
136+
from feast.repo_config import RegistryConfig
137+
138+
repo_config = RepoConfig(
139+
registry=RegistryConfig(path="gs://feast-test-gcs-bucket/registry.pb"),
140+
project="feast_demo_gcp",
141+
provider="gcp",
142+
)
143+
store = FeatureStore(config=repo_config)
144+
134145
features = store.get_online_features(
135146
features=[
136147
"driver_hourly_stats:conv_rate",
@@ -147,6 +158,32 @@ features = store.get_online_features(
147158

148159
</details>
149160

161+
<details>
162+
163+
<summary>How to: retrieve online features for real-time model inference (Feature Server)</summary>
164+
165+
Feast will ensure the latest feature values for registered features are available. At retrieval time, you need to supply a list of **entities** and the corresponding **features** to be retrieved. Similar to `get_historical_features`, we recommend using feature services as a mechanism for grouping features in a model version.
166+
167+
_Note: unlike `get_historical_features`, the `entity_rows` **do not need timestamps** since you only want one feature value per entity key._
168+
169+
This approach requires you to deploy a feature server (see [Python feature server](../../reference/feature-servers/python-feature-server)).
170+
171+
```python
172+
import requests
173+
import json
174+
175+
online_request = {
176+
"features": [
177+
"driver_hourly_stats:conv_rate",
178+
],
179+
"entities": {"driver_id": [1001, 1002]},
180+
}
181+
r = requests.post('http://localhost:6566/get-online-features', data=json.dumps(online_request))
182+
print(json.dumps(r.json(), indent=4, sort_keys=True))
183+
```
184+
185+
</details>
186+
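The request body used with the feature server above is a plain JSON object with a `features` list and a columnar `entities` mapping. The sketch below builds that body without touching the network. The helper name `build_online_request` is hypothetical, and the equal-length check on entity columns is an assumption about what a well-formed request looks like, not documented server behavior.

```python
import json
from typing import Dict, List


def build_online_request(features: List[str], entities: Dict[str, list]) -> str:
    """Serialize a get-online-features request body as a JSON string."""
    # Assumption: each entity column holds one value per requested row,
    # so all columns should have the same length.
    lengths = {len(values) for values in entities.values()}
    if len(lengths) > 1:
        raise ValueError("all entity columns must have the same number of rows")
    return json.dumps({"features": features, "entities": entities})


payload = build_online_request(
    ["driver_hourly_stats:conv_rate"],
    {"driver_id": [1001, 1002]},
)
print(payload)
```

The resulting string can be passed as `data=payload` to `requests.post` against the port-forwarded endpoint.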
150187
## Feature Services
151188

152189
A feature service is an object that represents a logical group of features from one or more [feature views](feature-view.md#feature-view). Feature services allow features from within a feature view to be used as needed by an ML model. Users can expect to create one feature service per model version, allowing for tracking of the features used by models.

examples/java-demo/README.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -30,18 +30,21 @@ For this tutorial, we setup Feast with Redis, using the Feast CLI to register an
3030
2. Make a bucket in GCS (or S3)
3131
3. The feature repo is already setup here, so you just need to swap in your GCS bucket and Redis credentials.
3232
We need to modify the `feature_store.yaml`, which has two fields for you to replace:
33-
```yaml
34-
registry: gs://[YOUR BUCKET]/demo-repo/registry.db
33+
```yaml
34+
registry: gs://[YOUR GCS BUCKET]/demo-repo/registry.db
3535
project: feast_java_demo
3636
provider: gcp
3737
online_store:
3838
type: redis
39+
# Note: this would normally be using instance URLs to access Redis
3940
connection_string: localhost:6379,password=[YOUR PASSWORD]
4041
offline_store:
4142
type: file
43+
entity_key_serialization_version: 2
4244
```
4345
4. Run `feast apply` to apply your local features to the remote registry
44-
5. Materialize features to the online store:
46+
- Note: you may need to authenticate to gcloud first with `gcloud auth login`
47+
5. Materialize features to the online store:
4548
```bash
4649
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
4750
feast materialize-incremental $CURRENT_TIME

examples/java-demo/feature_repo/feature_store.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@ project: feast_java_demo
33
provider: gcp
44
online_store:
55
type: redis
6+
# Note: this would normally be using instance URLs to access Redis
67
connection_string: localhost:6379,password=[YOUR PASSWORD]
78
offline_store:
89
type: file
Lines changed: 89 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,89 @@
1+
2+
# Running Feast Python / Go Feature Server with Redis on Kubernetes
3+
4+
For this tutorial, we set up Feast with Redis.
5+
6+
We use the Feast CLI to register and materialize features, and then retrieve them via a Feast Python feature server deployed in Kubernetes.
7+
8+
## First, let's set up a Redis cluster
9+
1. Start minikube (`minikube start`)
10+
2. Use helm to install a default Redis cluster
11+
```bash
12+
helm repo add bitnami https://charts.bitnami.com/bitnami
13+
helm repo update
14+
helm install my-redis bitnami/redis
15+
```
16+
![](redis-screenshot.png)
17+
3. Port forward Redis so we can materialize features to it
18+
19+
```bash
20+
kubectl port-forward --namespace default svc/my-redis-master 6379:6379
21+
```
22+
4. Get your Redis password using the command (pasted below for convenience). We'll need this to tell Feast how to communicate with the cluster.
23+
24+
```bash
25+
export REDIS_PASSWORD=$(kubectl get secret --namespace default my-redis -o jsonpath="{.data.redis-password}" | base64 --decode)
26+
echo $REDIS_PASSWORD
27+
```
28+
29+
## Next, we set up a local Feast repo
30+
1. Install Feast with Redis dependencies `pip install "feast[redis]"`
31+
2. Make a bucket in GCS (or S3)
32+
3. The feature repo is already set up here, so you just need to swap in your GCS bucket and Redis credentials.
33+
We need to modify the `feature_store.yaml`, which has two fields for you to replace:
34+
```yaml
35+
registry: gs://[YOUR GCS BUCKET]/demo-repo/registry.db
36+
project: feast_python_demo
37+
provider: gcp
38+
online_store:
39+
type: redis
40+
# Note: this would normally be using instance URLs to access Redis
41+
connection_string: localhost:6379,password=[YOUR PASSWORD]
42+
offline_store:
43+
type: file
44+
entity_key_serialization_version: 2
45+
```
46+
4. Run `feast apply` from within the `feature_repo` directory to apply your local features to the remote registry
47+
- Note: you may need to authenticate to gcloud first with `gcloud auth login`
48+
5. Materialize features to the online store:
49+
```bash
50+
CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
51+
feast materialize-incremental $CURRENT_TIME
52+
```
53+
54+
## Now let's set up the Feast Server
55+
1. Add the gcp-auth addon to mount GCP credentials:
56+
```bash
57+
minikube addons enable gcp-auth
58+
```
59+
2. Add Feast's Python/Go feature server chart repo
60+
```bash
61+
helm repo add feast-charts https://feast-helm-charts.storage.googleapis.com
62+
helm repo update
63+
```
64+
3. For this tutorial, because we don't have a direct hosted endpoint into Redis, we need to change `feature_store.yaml` to talk to the Kubernetes Redis service
65+
```bash
66+
sed -i '' 's/localhost:6379/my-redis-master:6379/g' feature_store.yaml
67+
```
68+
4. Install the Feast helm chart: `helm install feast-release feast-charts/feast-feature-server --set feature_store_yaml_base64=$(base64 feature_store.yaml)`
69+
> **Dev instructions**: if you're changing the Python logic or chart, you can do
70+
1. `eval $(minikube docker-env)`
71+
2. `make build-feature-server-dev`
72+
3. `helm install feast-release ../../../infra/charts/feast-feature-server --set image.tag=dev --set feature_store_yaml_base64=$(base64 feature_store.yaml)`
73+
5. (Optional): check logs of the server to make sure it’s working
74+
```bash
75+
kubectl logs svc/feast-feature-server
76+
```
77+
6. Port forward to expose the HTTP endpoint:
78+
```bash
79+
kubectl port-forward svc/feast-feature-server 6566:80
80+
```
81+
7. Run test fetches for online features:
82+
- First: change back the Redis connection string to allow localhost connections to Redis
83+
```bash
84+
sed -i '' 's/my-redis-master:6379/localhost:6379/g' feature_store.yaml
85+
```
86+
- Then run the included fetch script, which fetches via the HTTP endpoint and, for comparison, via the Python SDK
87+
```bash
88+
python test_python_fetch.py
89+
```
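The two `sed -i ''` commands in the steps above use the BSD/macOS spelling of in-place editing; GNU sed expects `sed -i` without the empty argument. As a portable alternative, the same substitution can be sketched in Python. The helper names below are hypothetical, and the sample line mirrors the `connection_string` entry shown earlier in this tutorial.

```python
from pathlib import Path


def swap_redis_host(yaml_text: str, old_host: str, new_host: str) -> str:
    """Replace every occurrence of old_host with new_host, like `sed s/old/new/g`."""
    return yaml_text.replace(old_host, new_host)


def rewrite_feature_store(path: Path, old_host: str, new_host: str) -> None:
    """Apply the substitution to a file in place."""
    path.write_text(swap_redis_host(path.read_text(), old_host, new_host))


# Demonstrate on a single line rather than a real file.
sample = "connection_string: localhost:6379,password=[YOUR PASSWORD]\n"
print(swap_redis_host(sample, "localhost:6379", "my-redis-master:6379"))
```

Calling `rewrite_feature_store(Path("feature_store.yaml"), "localhost:6379", "my-redis-master:6379")` would stand in for the first `sed` command, and swapping the last two arguments would stand in for the second.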

examples/python-helm-demo/feature_repo/__init__.py

Whitespace-only changes.
Binary file not shown.
Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,61 @@
1+
from datetime import timedelta
2+
3+
import pandas as pd
4+
5+
from feast.data_source import RequestSource
6+
from feast.on_demand_feature_view import on_demand_feature_view
7+
from feast.types import Float32, Float64, Int64, String
8+
from feast.field import Field
9+
10+
from feast import Entity, FileSource, FeatureView
11+
12+
driver_hourly_stats = FileSource(
13+
path="data/driver_stats_with_string.parquet",
14+
timestamp_field="event_timestamp",
15+
created_timestamp_column="created",
16+
)
17+
driver = Entity(name="driver_id", description="driver id",)
18+
driver_hourly_stats_view = FeatureView(
19+
name="driver_hourly_stats",
20+
entities=[driver],
21+
ttl=timedelta(days=365),
22+
schema=[
23+
Field(name="conv_rate", dtype=Float32),
24+
Field(name="acc_rate", dtype=Float32),
25+
Field(name="avg_daily_trips", dtype=Int64),
26+
Field(name="string_feature", dtype=String),
27+
],
28+
online=True,
29+
source=driver_hourly_stats,
30+
tags={},
31+
)
32+
33+
# Define a request data source which encodes features / information only
34+
# available at request time (e.g. part of the user initiated HTTP request)
35+
input_request = RequestSource(
36+
name="vals_to_add",
37+
schema=[
38+
Field(name="val_to_add", dtype=Int64),
39+
Field(name="val_to_add_2", dtype=Int64),
40+
],
41+
)
42+
43+
44+
# Define an on demand feature view which can generate new features based on
45+
# existing feature views and RequestSource features
46+
@on_demand_feature_view(
47+
sources=[
48+
driver_hourly_stats_view,
49+
input_request,
50+
],
51+
schema=[
52+
Field(name="conv_rate_plus_val1", dtype=Float64),
53+
Field(name="conv_rate_plus_val2", dtype=Float64),
54+
],
55+
)
56+
def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
57+
df = pd.DataFrame()
58+
df["conv_rate_plus_val1"] = inputs["conv_rate"] + inputs["val_to_add"]
59+
df["conv_rate_plus_val2"] = inputs["conv_rate"] + inputs["val_to_add_2"]
60+
return df
61+
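To make the arithmetic in `transformed_conv_rate` concrete, here is a plain-Python stand-in that performs the same column-wise additions on lists instead of a pandas DataFrame. It does not use Feast or pandas; the function name `transformed_conv_rate_rows` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def transformed_conv_rate_rows(inputs: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Add each request-time value to conv_rate, row by row."""
    return {
        "conv_rate_plus_val1": [
            c + v for c, v in zip(inputs["conv_rate"], inputs["val_to_add"])
        ],
        "conv_rate_plus_val2": [
            c + v for c, v in zip(inputs["conv_rate"], inputs["val_to_add_2"])
        ],
    }


# Two rows: conv_rate comes from the feature view, the other columns from the request.
example = {"conv_rate": [0.5, 0.25], "val_to_add": [1, 2], "val_to_add_2": [10, 20]}
result = transformed_conv_rate_rows(example)
print(result)
```

Each output column has one value per input row, which matches the shape of the DataFrame returned by the on demand feature view.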
