
Commit d618be5

Add more conceptual descriptions in module 0
Signed-off-by: Danny Chiao <danny@tecton.ai>
1 parent 479faa8 commit d618be5

File tree

1 file changed: +6 -6 lines changed


module_0/README.md

Lines changed: 6 additions & 6 deletions
@@ -67,7 +67,7 @@ Let's quickly review some Feast concepts needed to build this ML platform / use
| Feature view | We'll have various feature views corresponding to different logical groups of features and transformations from data sources keyed on entities. These can be shared / re-used by data scientists and engineers and are registered with `feast apply`. <br/><br/> Feast also supports reusable last mile transformations with `OnDemandFeatureView`s. We explore this in [Module 2](../module_2/README.md) |
| Feature service | We build different model versions with different sets of features using feature services (`model_v1`, `model_v2`). Feature services group features a given model version depends on. It allows retrieving all necessary model features by using a feature service name. |
| Registry | Where Feast stores registered features, data sources, entities, feature services and metadata. Users + model servers will pull from this to get the latest registered features + metadata |
-| Provider | We use the AWS provider here. A provider is a customizable interface that Feast uses to orchestrate feature generation / retrieval. <br/>In `feature_store.yaml`, the main way to configure a Feast project, specifying a built-in provider (e.g. `aws`) ensures your registry can be stored in S3 (and also specifies default offline / online stores) |
+| Provider | We use the AWS provider here. A provider is a customizable interface that Feast uses to orchestrate feature generation / retrieval. <br/>Specifying a built-in provider (e.g. `aws`) ensures your registry can be stored in S3 (and also specifies default offline / online stores) |
| Offline store | The compute that Feast will use to execute point in time joins. Here we use `file` |
| Online store | The low-latency storage Feast can materialize offline feature values to power online inference. In this module, we do not need one. |
## A quick primer on feature views
@@ -99,10 +99,10 @@ They represent a group of features that should be physically colocated (e.g. in
It's worth noting that there are multiple types of feature views. `OnDemandFeatureView`s, for example, enable row-level transformations on data sources and request data, with the output features described in the `schema` parameter.
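As a minimal sketch (column names here are illustrative assumptions, not taken from this repo), the row-level transformation an `OnDemandFeatureView` wraps is just a function over a DataFrame of input features:

```python
import pandas as pd

# A row-level "last mile" transformation of the kind an OnDemandFeatureView
# wraps. Input/output column names are illustrative assumptions.
def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame()
    # Derive a new feature from an existing feature plus request-time data
    out["conv_rate_adjusted"] = inputs["conv_rate"] * (1 + inputs["adjustment"])
    return out
```

In Feast, a function body like this would be registered via the `@on_demand_feature_view` decorator, with the `schema` parameter declaring the output features.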

# User groups
-There are three user groups here worth considering. The ML platform team, the ML engineers running batch inference on models, and the data scientists building the model.
+There are three user groups worth considering: the ML platform team, the ML engineers running batch inference on models, and the data scientists building models.

## User group 1: ML Platform Team
-The team here sets up the centralized Feast feature repository in GitHub. This is what's seen in `feature_repo_aws/`.
+The team here sets up the centralized Feast feature repository and CI/CD in GitHub. This is what's seen in `feature_repo_aws/`.

### Step 0: Setup S3 bucket for registry and file sources
This assumes you have an AWS account & Terraform setup. If you don't:
@@ -135,7 +135,7 @@ project_name = "danny"
```

### Step 1: Setup the feature repo
-The first thing a platform team needs to do is setup the `feature_store.yaml` within a version controlled repo like GitHub. We've setup a sample feature repository in `feature_repo_aws/`
+The first thing a platform team needs to do is set up a `feature_store.yaml` file within a version-controlled repo like GitHub. `feature_store.yaml` is the primary way to configure an overall Feast project. We've set up a sample feature repository in `feature_repo_aws/`.

#### Step 1a: Use your configured S3 bucket
There are two files in `feature_repo_aws` you need to change to point to your S3 bucket:
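For reference, a minimal `feature_store.yaml` for this module's setup might look like the following. This is a sketch: the project name and bucket path are placeholders, and any keys beyond these fall back to the provider's defaults.

```yaml
project: my_project                       # illustrative project name
provider: aws                             # built-in AWS provider
registry: s3://<your-bucket>/registry.pb  # registry stored in S3 (placeholder path)
offline_store:
  type: file                              # this module uses the file offline store
```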
@@ -425,9 +425,9 @@ Data scientists or ML engineers can use the defined `FeatureService` (correspond

### Step 0: Understanding `get_historical_features` and feature services

-`get_historical_features` is the API by which you can retrieve features (by referencing features directly or via feature services). It will under the hood manage point-in-time joins and avoid data leakage to generate training datasets or power batch scoring.
+`get_historical_features` is an API by which you can retrieve features (by referencing features directly or via feature services). Under the hood, it manages point-in-time joins and avoids data leakage when generating training datasets or powering batch scoring.
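As a hedged sketch (the repo path, entity key, and feature names below are assumptions, not taken from this repo), the two retrieval styles look like:

```python
import pandas as pd

# Entity rows with event timestamps; Feast joins each row against feature
# values as of that row's timestamp (point-in-time correct, no leakage).
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],  # illustrative entity join keys
        "event_timestamp": pd.to_datetime(["2022-05-01", "2022-05-02"], utc=True),
    }
)

# With a configured feature repo, you could reference features directly:
# from feast import FeatureStore
# store = FeatureStore(repo_path="feature_repo_aws")
# training_df = store.get_historical_features(
#     entity_df=entity_df,
#     features=["driver_stats:conv_rate"],  # hypothetical feature reference
# ).to_df()
#
# ...or go through a feature service, pinning a model version's feature set:
# training_df = store.get_historical_features(
#     entity_df=entity_df,
#     features=store.get_feature_service("model_v1"),
# ).to_df()
```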

-For batch scoring, you want to get the latest feature values for your entities. Feast right now requires timestamps in `get_historical_features`, so what you'll need to do is append an event timestamp of `now()`. e.g.
+For batch scoring, you want the latest feature values for your entities. Feast requires timestamps in `get_historical_features`, so you'll need to append an event timestamp of `now()`, e.g.

```python
# Get the latest feature values for unique entities
# ("driver_id" is an illustrative join key; use your entity's join key)
import pandas as pd

entity_df = pd.DataFrame({"driver_id": [1001, 1002, 1003]})
entity_df["event_timestamp"] = pd.Timestamp.now(tz="utc")
```
