Summary
Feature view versioning (#6101) added @v<N> syntax for version-qualified online reads and feast materialize --version v<N> for writing to version-specific online tables. The offline retrieval path (get_historical_features) has no equivalent: version qualifiers in feature refs are silently stripped during ibis prefix matching but never used to route to a different source table.
This means a team that pins their online serving to driver_stats@v2 cannot generate training data from the same v2 schema — breaking training-serving consistency, which is one of the primary motivations for versioning.
The gap
Online store (works today):
store.get_online_features(features=["driver_stats@v2:trips_today"], ...)
# → routes to version-specific table: {project}_driver_stats_v2
Offline store (silently ignores version today):
store.get_historical_features(entity_df=..., features=["driver_stats@v2:trips_today"])
# → routes to the same source table regardless of @v2
# → version tag stripped in _parse_feature_ref / ibis prefix matching
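The effect of that stripping step can be illustrated with a small sketch. `strip_version` is a hypothetical stand-in, not Feast's actual `_parse_feature_ref`; it only assumes the `view@v<N>:feature` ref shape shown above.

```python
import re

# Hypothetical stand-in for the stripping behaviour described above;
# this is not Feast's parsing code.
_VERSION_TAG = re.compile(r"@v\d+(?=:)")


def strip_version(feature_ref: str) -> str:
    """Drop an '@v<N>' qualifier, keeping 'view:feature'."""
    return _VERSION_TAG.sub("", feature_ref)


refs = ["driver_stats@v1:trips_today", "driver_stats@v2:trips_today"]
# Both refs collapse to the same key, so the version can no longer
# influence which source table is read.
collapsed = {strip_version(ref) for ref in refs}
```

Once the qualifier is gone, nothing downstream can tell a v1 request from a v2 request, which is why the skew is silent.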
Design questions
I'd like to get your input on the right approach before writing any code, since there are a few viable directions with different tradeoffs:
Option A — Version-specific source tables (mirrors online store pattern)
When enable_online_feature_view_versioning is enabled and a versioned ref is used, route to a version-suffixed offline source table (e.g. {project}_driver_stats_v2). Requires that versioned offline tables exist — either pre-created by an upstream pipeline or written by a future feast materialize-offline --version v2.
Pros: mirrors online store semantics exactly, strong training-serving consistency guarantee.
Cons: offline source data is usually externally managed; teams would need to maintain versioned source tables alongside their online tables, which may be impractical.
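A minimal sketch of what Option A routing could look like, assuming the `{project}_{view}_v<N>` naming from the example above. `parse_ref`, `offline_table_name`, and the `default_table` parameter are hypothetical names for illustration, not existing Feast APIs.

```python
import re
from typing import NamedTuple, Optional

# Assumed ref shape: "<view>[@v<N>]:<feature>".
_REF = re.compile(r"^(?P<view>[^@:]+)(?:@v(?P<version>\d+))?:(?P<feature>.+)$")


class ParsedRef(NamedTuple):
    view: str
    version: Optional[int]
    feature: str


def parse_ref(ref: str) -> ParsedRef:
    """Split a feature ref into view, optional version, and feature name."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"malformed feature ref: {ref!r}")
    version = match.group("version")
    return ParsedRef(
        view=match.group("view"),
        version=int(version) if version is not None else None,
        feature=match.group("feature"),
    )


def offline_table_name(project: str, ref: str, default_table: str) -> str:
    """Pick the offline table for a ref (hypothetical routing rule)."""
    parsed = parse_ref(ref)
    if parsed.version is None:
        # Unversioned refs keep reading the configured source table.
        return default_table
    return f"{project}_{parsed.view}_v{parsed.version}"
```

For example, `offline_table_name("demo", "driver_stats@v2:trips_today", "driver_hourly")` would resolve to `demo_driver_stats_v2`, while the unversioned ref falls back to `driver_hourly`. Whether the versioned table actually exists is the open question raised in the cons above.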
Option B — Schema-level version resolution (read from current table, project v<N> schema)
Keep reading from the same source table, but apply the schema of the pinned version when selecting columns. Useful when the underlying data hasn't changed but the feature view's field definitions have evolved.
Pros: no need to maintain separate offline tables, works with externally managed sources.
Cons: weaker guarantee — if the source data itself changed between versions, this doesn't help.
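Option B could be sketched roughly as follows, using plain dictionaries in place of a real offline store. `project_to_version` and the per-version schema mapping are hypothetical; how Feast would look up the pinned version's field list is not settled here, and `avg_rating`, `conv_rate`, and `driver_id` are made-up column names for the example.

```python
from typing import Dict, List, Mapping, Sequence

Row = Mapping[str, object]


def project_to_version(
    rows: Sequence[Row],
    schemas_by_version: Mapping[int, Sequence[str]],
    version: int,
    key_columns: Sequence[str],
) -> List[Dict[str, object]]:
    """Keep only the key columns plus the fields of the pinned version."""
    if version not in schemas_by_version:
        raise KeyError(f"no schema recorded for version v{version}")
    wanted = list(key_columns) + list(schemas_by_version[version])
    projected = []
    for row in rows:
        missing = [col for col in wanted if col not in row]
        if missing:
            # The current table no longer carries a field the pinned
            # version expects; projection alone cannot fix that.
            raise KeyError(f"source is missing columns for v{version}: {missing}")
        projected.append({col: row[col] for col in wanted})
    return projected


# Illustrative data only.
schemas = {1: ["trips_today"], 2: ["trips_today", "avg_rating"]}
rows = [{"driver_id": 1001, "trips_today": 7, "avg_rating": 4.8, "conv_rate": 0.3}]
v1_rows = project_to_version(rows, schemas, 1, ["driver_id"])
v2_rows = project_to_version(rows, schemas, 2, ["driver_id"])
```

The projection only changes which columns are returned. The values still come from the current table, which is the weaker guarantee noted in the cons.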
Option C — Raise VersionedOfflineReadNotSupported (explicit error instead of silent ignore)
At minimum, detect @v<N> refs in get_historical_features and raise a clear error rather than silently stripping the version tag. This prevents silent training-serving skew today while the longer-term design is decided.
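A sketch of the Option C guard, assuming it would run at the top of `get_historical_features` before any ref parsing. The exception class, the regex, and the helper names are all hypothetical; none of them exist in Feast today.

```python
import re
from typing import Iterable, List

# Assumed shape of a version-qualified ref: "<view>@v<N>:<feature>".
_VERSIONED_REF = re.compile(r"^[^@:]+@v\d+:.+$")


class VersionedOfflineReadNotSupported(ValueError):
    """Raised when a version-qualified ref reaches the offline path."""


def find_versioned_refs(feature_refs: Iterable[str]) -> List[str]:
    """Return the refs that carry an '@v<N>' qualifier."""
    return [ref for ref in feature_refs if _VERSIONED_REF.match(ref)]


def reject_versioned_refs(feature_refs: Iterable[str]) -> None:
    """Fail loudly instead of silently dropping the version tag."""
    versioned = find_versioned_refs(feature_refs)
    if versioned:
        raise VersionedOfflineReadNotSupported(
            "get_historical_features does not support version-qualified "
            f"feature refs yet: {versioned}"
        )
```

Subclassing `ValueError` is one possible choice so that existing callers catching bad-input errors would also catch this one; the right base class depends on Feast's own error hierarchy.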
Questions for you
If versioned offline retrieval is in scope, does Option A or B better match the intended semantics?
Would Option C (explicit error) be a useful interim step regardless?
Happy to take this forward once there's a design direction. I've been working through the Snowflake online store versioning (#6380) and the type_map offline fixes (#6388) so I have reasonable familiarity with both sides of the retrieval path.