Commit da3aacf

Merge branch 'aaronzuo/mcp_http_config' of https://github.com/Anarion-zuo/feast into aaronzuo/mcp_http_config
2 parents e804827 + 1999d34 commit da3aacf

136 files changed

+8366
-1416
lines changed

.secrets.baseline

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1156,7 +1156,7 @@
11561156
"filename": "infra/feast-operator/internal/controller/services/services.go",
11571157
"hashed_secret": "36dc326eb15c7bdd8d91a6b87905bcea20b637d1",
11581158
"is_verified": false,
1159-
"line_number": 176
1159+
"line_number": 179
11601160
}
11611161
],
11621162
"infra/feast-operator/internal/controller/services/tls_test.go": [
@@ -1539,5 +1539,5 @@
15391539
}
15401540
]
15411541
},
1542-
"generated_at": "2026-03-18T08:09:25Z"
1542+
"generated_at": "2026-03-18T13:51:43Z"
15431543
}

docs/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -166,6 +166,7 @@
166166
* [\[Alpha\] Vector Database](reference/alpha-vector-database.md)
167167
* [\[Alpha\] Data quality monitoring](reference/dqm.md)
168168
* [\[Alpha\] Streaming feature computation with Denormalized](reference/denormalized.md)
169+
* [\[Alpha\] Feature View Versioning](reference/alpha-feature-view-versioning.md)
169170
* [OpenLineage Integration](reference/openlineage.md)
170171
* [Feast CLI reference](reference/feast-cli-commands.md)
171172
* [Python API reference](http://rtd.feast.dev)

docs/getting-started/concepts/feature-retrieval.md

Lines changed: 51 additions & 3 deletions
@@ -78,15 +78,19 @@ feature_store.get_historical_features(features=feature_service, entity_df=entity
7878

7979
This mechanism of retrieving features is only recommended as you're experimenting. Once you want to launch experiments or serve models, feature services are recommended.
8080

81-
Feature references uniquely identify feature values in Feast. The structure of a feature reference in string form is as follows: `<feature_view>:<feature>`
81+
Feature references uniquely identify feature values in Feast. The structure of a feature reference in string form is as follows: `<feature_view>[@version]:<feature>`
82+
83+
The `@version` part is optional. When omitted, the latest (active) version is used. You can specify a version like `@v2` to read from a specific historical version snapshot.
8284

8385
Feature references are used for the retrieval of features from Feast:
8486

8587
```python
8688
online_features = fs.get_online_features(
8789
features=[
88-
'driver_locations:lon',
89-
'drivers_activity:trips_today'
90+
'driver_locations:lon', # latest version (default)
91+
'drivers_activity:trips_today', # latest version (default)
92+
'drivers_activity@v2:trips_today', # specific version
93+
'drivers_activity@latest:trips_today', # explicit latest
9094
],
9195
entity_rows=[
9296
# {join_key: entity_value}
@@ -95,6 +99,10 @@ online_features = fs.get_online_features(
9599
)
96100
```
97101

102+
{% hint style="info" %}
103+
Version-qualified reads (`@v<N>`) require `enable_online_feature_view_versioning: true` in your registry config and are currently supported only on the SQLite online store. See the [feature view versioning docs](feature-view.md#version-qualified-feature-references) for details.
104+
{% endhint %}
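
To make the `<feature_view>[@version]:<feature>` shape concrete, here is a minimal sketch of how such a reference string could be split into its parts. The regular expression, the `FeatureRef` type, and `parse_feature_ref` are hypothetical names used only for illustration; they are not Feast's own parser, and treating `@latest` the same as an omitted version is an assumption based on the "explicit latest" comment in the example above.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for illustration only; Feast's real parsing may differ.
_REF = re.compile(
    r"^(?P<view>[^@:]+)(?:@(?P<version>[^@:]+))?:(?P<feature>[^@:]+)$"
)


class FeatureRef(NamedTuple):
    feature_view: str
    version: Optional[str]  # None means "use the latest (active) version"
    feature: str


def parse_feature_ref(ref: str) -> FeatureRef:
    """Split '<feature_view>[@version]:<feature>' into its components."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"not a valid feature reference: {ref!r}")
    version = match.group("version")
    # Assumption: '@latest' is equivalent to leaving the version out.
    if version == "latest":
        version = None
    return FeatureRef(match.group("view"), version, match.group("feature"))
```

Under these assumptions, `parse_feature_ref("drivers_activity@v2:trips_today")` yields the view `drivers_activity`, the version `v2`, and the feature `trips_today`, while a string with two colons is rejected with a `ValueError`.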
105+
98106
It is possible to retrieve features from multiple feature views with a single request, and Feast is able to join features from multiple tables in order to build a training dataset. However, it is not possible to reference (or retrieve) features from multiple projects at the same time.
99107

100108
{% hint style="info" %}
@@ -107,6 +115,46 @@ The timestamp on which an event occurred, as found in a feature view's data sour
107115

108116
Event timestamps are used during point-in-time joins to ensure that the latest feature values are joined from feature views onto entity rows. Event timestamps are also used to ensure that old feature values aren't served to models during online serving.
109117

118+
#### Why `event_timestamp` is required in the entity dataframe
119+
120+
When calling `get_historical_features()`, the `entity_df` must include an `event_timestamp` column. This timestamp acts as the **upper bound (inclusive)** for which feature values are allowed to be retrieved for each entity row. Feast performs a point-in-time join (also called a "last known good value" temporal join): for each entity row, it retrieves the latest feature values with a timestamp **at or before** the entity row's `event_timestamp`.
121+
122+
This ensures **point-in-time correctness**, which is critical to prevent **data leakage** during model training. Without this constraint, features generated *after* the prediction time could leak into training data—effectively letting the model "see the future"—leading to inflated offline metrics that do not translate to real-world performance.
123+
124+
For example, if you want to predict whether a driver will be rated well on April 12 at 10:00 AM, the entity dataframe row should have `event_timestamp = datetime(2021, 4, 12, 10, 0, 0)`. Feast will then only join feature values observed on or before that time, excluding any data generated after 10:00 AM.
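
The "at or before" rule can be illustrated with a small standard-library sketch. It is not how Feast implements the join (offline stores do this in SQL or dataframe engines, and TTL handling is ignored here); `latest_value_at_or_before` and the sample `conv_rate` rows are hypothetical and exist only to show the inclusive upper bound.

```python
from bisect import bisect_right
from datetime import datetime


def latest_value_at_or_before(rows, as_of):
    """Return the newest value whose timestamp is <= as_of, else None.

    rows: list of (timestamp, value) tuples sorted by timestamp ascending.
    """
    timestamps = [ts for ts, _ in rows]
    # bisect_right counts the rows with timestamp <= as_of (inclusive bound).
    idx = bisect_right(timestamps, as_of)
    return rows[idx - 1][1] if idx else None


# Hypothetical feature rows for a single driver.
conv_rate = [
    (datetime(2021, 4, 12, 9, 0), 0.61),
    (datetime(2021, 4, 12, 10, 0), 0.64),
    (datetime(2021, 4, 12, 11, 0), 0.90),  # observed after prediction time
]

print(latest_value_at_or_before(conv_rate, datetime(2021, 4, 12, 10, 0)))  # 0.64
print(latest_value_at_or_before(conv_rate, datetime(2021, 4, 12, 8, 0)))   # None
```

The 11:00 AM row is never returned for a 10:00 AM entity timestamp, which is exactly the leakage the constraint is meant to prevent.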
125+
126+
#### Retrieving features without an entity dataframe
127+
128+
While the entity dataframe is the standard way to retrieve historical features, Feast also supports **entity-less historical feature retrieval** by datetime range. This is useful when:
129+
130+
- You are training **time-series or population-level models** and don't have a pre-defined list of entity IDs.
131+
- You want **all features in a time window** for exploratory analysis or batch training on full history.
132+
- Constructing an entity dataframe upfront is unnecessarily complex or expensive.
133+
134+
Instead of passing `entity_df`, you specify a time window with `start_date` and/or `end_date`:
135+
136+
```python
137+
from datetime import datetime
138+
139+
training_df = store.get_historical_features(
140+
features=[
141+
"driver_hourly_stats:conv_rate",
142+
"driver_hourly_stats:acc_rate",
143+
"driver_hourly_stats:avg_daily_trips",
144+
],
145+
start_date=datetime(2025, 7, 1),
146+
end_date=datetime(2025, 7, 2),
147+
).to_df()
148+
```
149+
150+
If `start_date` is omitted, it defaults to `end_date` minus the feature view TTL. If `end_date` is omitted, it defaults to the current time. Point-in-time correctness is still preserved.
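
The defaulting rule described above can be sketched as follows. `resolve_window` is a hypothetical helper, not a Feast API; the `ttl` and `now` parameters are assumptions added so the behavior can be shown in isolation.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def resolve_window(
    start_date: Optional[datetime],
    end_date: Optional[datetime],
    ttl: timedelta,
    now: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Fill in a missing start or end of the retrieval window."""
    if end_date is None:
        # A missing end_date falls back to the current time.
        end_date = now if now is not None else datetime.now(timezone.utc)
    if start_date is None:
        # A missing start_date falls back to end_date minus the TTL.
        start_date = end_date - ttl
    return start_date, end_date
```

With a one-day TTL and only `end_date=datetime(2025, 7, 2)` supplied, this sketch resolves the window to July 1 through July 2, matching the explicit dates in the example above.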
151+
152+
{% hint style="warning" %}
153+
Entity-less retrieval is currently supported for the **Postgres**, **Dask**, **Spark**, and **Ray** offline stores. You cannot mix `entity_df` with `start_date`/`end_date` in the same call.
154+
{% endhint %}
155+
156+
For more details, see the [FAQ](../faq.md#how-do-i-run-get_historical_features-without-providing-an-entity-dataframe) and [this blog post on entity-less historical feature retrieval](https://feast.dev/blog/entity-less-historical-features-retrieval/).
157+
110158
### Dataset
111159

112160
A dataset is a collection of rows that is produced by a historical retrieval from Feast in order to train a model. A dataset is produced by a join from one or more feature views onto an entity dataframe. Therefore, a dataset may consist of features from multiple feature views.

docs/getting-started/concepts/feature-view.md

Lines changed: 19 additions & 0 deletions
@@ -160,6 +160,25 @@ Feature names must be unique within a [feature view](feature-view.md#feature-vie
160160

161161
Each field can have additional metadata associated with it, specified as key-value [tags](https://rtd.feast.dev/en/master/feast.html#feast.field.Field).
162162

163+
## \[Alpha\] Versioning
164+
165+
Feature views support automatic version tracking. Every time `feast apply` detects a schema or UDF change, a versioned snapshot is saved to the registry. This enables auditing what changed, reverting to a prior version, querying specific versions via `@v<N>` syntax, and staging new versions without promoting them.
166+
167+
Version history tracking is **always active** with no configuration needed. The `version` parameter is fully optional — omitting it preserves existing behavior.
168+
169+
```python
170+
# Pin to a specific version (reverts the active definition to v2's snapshot)
171+
driver_stats = FeatureView(
172+
name="driver_stats",
173+
entities=[driver],
174+
schema=[...],
175+
source=my_source,
176+
version="v2",
177+
)
178+
```
179+
180+
For full details on version pinning, version-qualified reads, staged publishing (`--no-promote`), online store support, and known limitations, see the **[\[Alpha\] Feature View Versioning](../../reference/alpha-feature-view-versioning.md)** reference page.
181+
163182
## Schema Validation
164183

165184
Feature views support an optional `enable_validation` parameter that enables schema validation during materialization and historical feature retrieval. When enabled, Feast verifies that:

docs/getting-started/quickstart.md

Lines changed: 4 additions & 1 deletion
@@ -370,6 +370,9 @@ entity_df = pd.DataFrame.from_dict(
370370
# entity's join key -> entity values
371371
"driver_id": [1001, 1002, 1003],
372372
# "event_timestamp" (reserved key) -> timestamps
373+
# Each timestamp acts as the upper bound for the point-in-time join:
374+
# Feast retrieves the latest feature values at or before this time,
375+
# preventing data leakage from future events.
373376
"event_timestamp": [
374377
datetime(2021, 4, 12, 10, 59, 42),
375378
datetime(2021, 4, 12, 8, 12, 10),
@@ -498,7 +501,7 @@ print(training_df.head())
498501
{% endtabs %}
499502
### Step 6: Ingest batch features into your online store
500503

501-
We now serialize the latest values of features since the beginning of time to prepare for serving. Note, `materialize_incremental` serializes all new features since the last `materialize` call, or since the time provided minus the `ttl` timedelta. In this case, this will be `CURRENT_TIME - 1 day` (`ttl` was set on the `FeatureView` instances in [feature_repo/feature_repo/feature_definitions.py](feature_repo/feature_repo/feature_definitions.py)).
504+
We now serialize the latest values of features since the beginning of time to prepare for serving. Note, `materialize_incremental` serializes all new features since the last `materialize` call, or since the time provided minus the `ttl` timedelta. In this case, this will be `CURRENT_TIME - 1 day` (`ttl` was set on the `FeatureView` instances in `feature_definitions.py`).
502505

503506
{% tabs %}
504507
{% tab title="Bash (with timestamp)" %}

docs/how-to-guides/feast-snowflake-gcp-aws/build-a-training-dataset.md

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ feature_refs = [
2020
"driver_trips:maximum_daily_rides",
2121
"driver_trips:rating",
2222
"driver_trips:rating:trip_completed",
23+
# Optionally, reference a specific version: "driver_trips@v2:average_daily_rides"
2324
]
2425
```
2526

docs/how-to-guides/feast-snowflake-gcp-aws/read-features-from-the-online-store.md

Lines changed: 3 additions & 1 deletion
@@ -21,7 +21,9 @@ Create a list of features that you would like to retrieve. This list typically c
2121
```python
2222
features = [
2323
"driver_hourly_stats:conv_rate",
24-
"driver_hourly_stats:acc_rate"
24+
"driver_hourly_stats:acc_rate",
25+
# Optionally, reference a specific version (requires enable_online_feature_view_versioning):
26+
# "driver_hourly_stats@v2:conv_rate"
2527
]
2628
```
2729
