Commit d9dd0fe

soooojinlee and claude committed
refactor: Replace combinatorial nested collection enums with recursive VALUE_LIST/VALUE_SET
Replace 4 combinatorial enum values (LIST_LIST=36, LIST_SET=37, SET_LIST=38, SET_SET=39) with 2 recursive enum values (VALUE_LIST=40, VALUE_SET=41) that use RepeatedValue to enable unlimited nesting depth.

This is a breaking change for an unreleased feature, as suggested in PR feast-dev#6132 review.

Key changes:
- Proto: Remove 4 enum/oneof fields, add VALUE_LIST/VALUE_SET with reserved 36-39
- Python: Update ValueType enum, type system, serialization, field persistence
- JSON: Update proto_json encode/decode for new field names
- Tests: Rewrite all nested collection tests (204 tests passing)
- Docs: Update type-system.md for recursive design

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent c1ce5d8 commit d9dd0fe
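
The motivation in the commit message can be sketched with a toy type model. The `Primitive`, `Array`, and `Set` dataclasses and the helper functions below are hypothetical stand-ins, not Feast's actual classes. The sketch assumes that only the outermost container selects the enum value, while a one-enum-per-combination scheme would need 2^n values for exactly n levels of nesting.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Primitive:
    name: str  # e.g. "Float64"; hypothetical stand-in for a primitive type


@dataclass(frozen=True)
class Array:
    inner: "FeastType"


@dataclass(frozen=True)
class Set:
    inner: "FeastType"


FeastType = Union[Primitive, Array, Set]


def outer_enum(dtype: FeastType) -> str:
    """Pick the enum name from the outermost container only."""
    if not isinstance(dtype, (Array, Set)) or isinstance(dtype.inner, Primitive):
        raise ValueError("not a nested collection type")
    return "VALUE_LIST" if isinstance(dtype, Array) else "VALUE_SET"


def depth(dtype: FeastType) -> int:
    """Count how many containers wrap the primitive."""
    return 0 if isinstance(dtype, Primitive) else 1 + depth(dtype.inner)


def combinatorial_enum_count(levels: int) -> int:
    """Enum values needed if every Array/Set combination had its own value."""
    return 2 ** levels
```

With this model, `Array(Array(Array(Primitive("Float64"))))` and `Array(Set(Primitive("String")))` both map to `VALUE_LIST`; the difference between them has to be carried elsewhere, which is the role the commit assigns to the inner-type tag.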

17 files changed

Lines changed: 183 additions & 290 deletions

docs/reference/type-system.md

Lines changed: 14 additions & 17 deletions
@@ -82,21 +82,21 @@ All primitive types (except `Map` and `Json`) have corresponding set types for s

 ### Nested Collection Types

-Feast supports 2-level nested collections, combining Array and Set types:
+Feast supports arbitrarily nested collections using a recursive `VALUE_LIST` / `VALUE_SET` design. The outer container determines the proto enum (`VALUE_LIST` for `Array(…)`, `VALUE_SET` for `Set(…)`), while the full inner type structure is persisted via a mandatory `feast:nested_inner_type` Field tag.

 | Feast Type | Python Type | ValueType | Description |
 |------------|-------------|-----------|-------------|
-| `Array(Array(T))` | `List[List[T]]` | `LIST_LIST` | List of lists |
-| `Array(Set(T))` | `List[List[T]]` | `LIST_SET` | List of sets (inner elements deduplicated) |
-| `Set(Array(T))` | `List[List[T]]` | `SET_LIST` | Set of lists |
-| `Set(Set(T))` | `List[List[T]]` | `SET_SET` | Set of sets (inner elements deduplicated) |
+| `Array(Array(T))` | `List[List[T]]` | `VALUE_LIST` | List of lists |
+| `Array(Set(T))` | `List[List[T]]` | `VALUE_LIST` | List of sets |
+| `Set(Array(T))` | `List[List[T]]` | `VALUE_SET` | Set of lists |
+| `Set(Set(T))` | `List[List[T]]` | `VALUE_SET` | Set of sets |
+| `Array(Array(Array(T)))` | `List[List[List[T]]]` | `VALUE_LIST` | 3-level nesting |

-Where `T` is any supported primitive type (Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp).
+Where `T` is any supported primitive type (Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp) or another nested collection type.

 **Notes:**
-- Nesting is limited to 2 levels. `Array(Array(Array(T)))` will raise a `ValueError`.
-- Inner type information is preserved via Field tags (`feast:nested_inner_type`) and restored during deserialization.
-- For `Array(Set(T))` and `Set(Set(T))`, inner collection elements are automatically deduplicated.
+- Nesting depth is **unlimited**. `Array(Array(Array(T)))`, `Set(Array(Set(T)))`, etc. are all supported.
+- Inner type information is preserved via Field tags (`feast:nested_inner_type`) and restored during deserialization. This tag is mandatory for nested collection types.
 - Empty inner collections (`[]`) are stored as empty proto values and round-trip as `None`. For example, `[[1, 2], [], [3]]` becomes `[[1, 2], None, [3]]` after a write-read cycle.

 ### Map Types
@@ -275,23 +275,20 @@ tag_ids = set(tag_list) # {100, 200, 300}

 ### Nested Collection Type Usage Examples

-Nested collections allow storing multi-dimensional data:
+Nested collections allow storing multi-dimensional data with unlimited depth:

 ```python
 # List of lists — e.g., weekly score history per user
 weekly_scores = [[85.0, 90.5, 78.0], [92.0, 88.5], [95.0, 91.0, 87.5]]

 # List of sets — e.g., unique tags assigned per category
 unique_tags_per_category = [["python", "ml"], ["rust", "systems"], ["python", "web"]]
-# Inner sets are automatically deduplicated:
-# [["python", "ml"], ...] (duplicates within each inner set are removed)

-# Set of lists — e.g., distinct ordered sequences observed
-distinct_sequences = [[1, 2, 3], [4, 5], [1, 2, 3]]
+# 3-level nesting — e.g., multi-dimensional matrices
+Field(name="tensor", dtype=Array(Array(Array(Float64))))

-# Set of sets — e.g., distinct groups of unique items
-distinct_groups = [["a", "b"], ["c", "d"], ["a", "b"]]
-# Inner elements are deduplicated within each set
+# Mixed nesting
+Field(name="grouped_tags", dtype=Array(Set(Array(String))))
 ```

 **Limitation:** Empty inner collections round-trip as `None`:
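
The round-trip limitation noted at the end of this hunk can be illustrated with a small sketch. It assumes a plain dict standing in for a proto `Value` (at most one key set, mirroring a oneof) and uses hypothetical `encode` / `decode` helpers; it is not Feast's serialization code, and the leaf field is fixed to `int64_list_val` for brevity.

```python
def encode(value):
    """Encode nested Python lists into a dict standing in for a proto Value."""
    if value is None or value == []:
        # Assumption: an empty collection is stored as an empty Value (no field set).
        return {}
    if any(isinstance(item, list) for item in value):
        # Recursive case: a list whose elements are themselves collections.
        return {"list_val": [encode(item) for item in value]}
    return {"int64_list_val": list(value)}


def decode(message):
    """Decode the dict form back into Python; an empty Value becomes None."""
    if not message:
        return None
    if "list_val" in message:
        return [decode(item) for item in message["list_val"]]
    return list(message["int64_list_val"])


def restore_empties(value):
    """Optionally normalise None back to [] after reading."""
    if value is None:
        return []
    if isinstance(value, list):
        return [restore_empties(v) if (v is None or isinstance(v, list)) else v for v in value]
    return value
```

Under these assumptions `decode(encode([[1, 2], [], [3]]))` yields `[[1, 2], None, [3]]`, matching the behaviour described in the docs. A reader that needs empty lists back could apply something like `restore_empties`, at the cost of no longer distinguishing a genuinely missing inner value from an empty one.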

protos/feast/types/Value.proto

Lines changed: 6 additions & 8 deletions
@@ -57,10 +57,10 @@ message ValueType {
         JSON_LIST = 33;
         STRUCT = 34;
         STRUCT_LIST = 35;
-        LIST_LIST = 36;
-        LIST_SET = 37;
-        SET_LIST = 38;
-        SET_SET = 39;
+        // 36-39 were LIST_LIST, LIST_SET, SET_LIST, SET_SET (removed, replaced by VALUE_LIST/VALUE_SET)
+        reserved 36, 37, 38, 39;
+        VALUE_LIST = 40;
+        VALUE_SET = 41;
     }
 }

@@ -100,10 +100,8 @@ message Value {
         StringList json_list_val = 33;
         Map struct_val = 34;
         MapList struct_list_val = 35;
-        RepeatedValue list_list_val = 36;
-        RepeatedValue list_set_val = 37;
-        RepeatedValue set_list_val = 38;
-        RepeatedValue set_set_val = 39;
+        RepeatedValue list_val = 40;
+        RepeatedValue set_val = 41;
     }
 }

sdk/python/feast/field.py

Lines changed: 1 addition & 7 deletions
@@ -171,13 +171,7 @@ def from_proto(cls, field_proto: FieldProto):
     dtype = Array(inner_struct)
     user_tags = {k: v for k, v in tags.items() if k not in internal_tags}
 elif (
-    value_type
-    in (
-        ValueType.LIST_LIST,
-        ValueType.LIST_SET,
-        ValueType.SET_LIST,
-        ValueType.SET_SET,
-    )
+    value_type in (ValueType.VALUE_LIST, ValueType.VALUE_SET)
     and NESTED_COLLECTION_INNER_TYPE_TAG in tags
 ):
     dtype = _str_to_feast_type(tags[NESTED_COLLECTION_INNER_TYPE_TAG])
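
Since the enum now records only the outer container, the branch above depends on the tag to rebuild the full type. A minimal sketch of that idea follows, assuming a tag string of the form `Array(Set(String))`; the actual format consumed by `_str_to_feast_type` is not shown in this diff, and `parse_type`, `format_type`, and `restore_dtype` are hypothetical helpers that represent types as nested tuples.

```python
NESTED_TAG = "feast:nested_inner_type"  # tag key named in the docs hunk
CONTAINERS = ("Array", "Set")


def parse_type(text):
    """Parse 'Array(Set(String))' into ('Array', ('Set', 'String'))."""
    text = text.strip()
    for container in CONTAINERS:
        prefix = container + "("
        if text.startswith(prefix) and text.endswith(")"):
            return (container, parse_type(text[len(prefix):-1]))
    if not text.isidentifier():
        raise ValueError(f"cannot parse type string: {text!r}")
    return text  # a primitive type name such as 'String'


def format_type(dtype):
    """Inverse of parse_type."""
    if isinstance(dtype, tuple):
        container, inner = dtype
        return f"{container}({format_type(inner)})"
    return dtype


def restore_dtype(value_type, tags):
    """Rebuild the dtype from the tag and strip the internal tag from user tags."""
    if value_type in ("VALUE_LIST", "VALUE_SET") and NESTED_TAG in tags:
        dtype = parse_type(tags[NESTED_TAG])
        user_tags = {k: v for k, v in tags.items() if k != NESTED_TAG}
        return dtype, user_tags
    # Assumption: the sketch treats a missing tag as an error because the
    # docs call the tag mandatory; the real fallback is not visible here.
    raise ValueError("nested collection field is missing its inner-type tag")
```

For example, `restore_dtype("VALUE_LIST", {NESTED_TAG: "Array(Array(Float64))", "owner": "ml-team"})` would return the nested tuple for the type together with `{"owner": "ml-team"}`.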

sdk/python/feast/infra/online_stores/remote.py

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ def _proto_value_to_transport_value(proto_value: ValueProto) -> Any:

 # Nested collection types use feast_value_type_to_python_type
 # which handles recursive conversion of RepeatedValue protos.
-if val_attr in ("list_list_val", "list_set_val", "set_list_val", "set_set_val"):
+if val_attr in ("list_val", "set_val"):
     return feast_value_type_to_python_type(proto_value)

 # Map/Struct types are converted to Python dicts by

sdk/python/feast/proto_json.py

Lines changed: 3 additions & 3 deletions
@@ -63,7 +63,7 @@ def to_json_object(printer: _Printer, message: ProtoMessage) -> JsonObject:
 # to JSON. The parse back result will be different from original message.
 if which is None or which == "null_val":
     return None
-elif which in ("list_list_val", "list_set_val", "set_list_val", "set_set_val"):
+elif which in ("list_val", "set_val"):
     # Nested collection: RepeatedValue containing Values
     repeated = getattr(message, which)
     value = [
@@ -96,13 +96,13 @@ def from_json_object(
     # Nested collection (list of lists).
     # Check any() to handle cases where the first element is None
     # (empty inner collections round-trip through proto as None).
-    # Default to list_list_val since JSON transport loses the
+    # Default to list_val since JSON transport loses the
     # outer/inner set distinction.
     rv = RepeatedValue()
     for inner in value:
         inner_val = rv.val.add()
         from_json_object(parser, inner, inner_val)
-    message.list_list_val.CopyFrom(rv)
+    message.list_val.CopyFrom(rv)
 elif isinstance(value[0], bool):
     message.bool_list_val.val.extend(value)
 elif isinstance(value[0], str):
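
The comment in the hunk above says JSON transport loses the set/list distinction. A small sketch of that loss follows, again using a dict as a stand-in for a proto `Value`; `to_json` and `from_json` are hypothetical simplifications of the patched encoder and decoder, and the leaf type is fixed to `string_list_val` for brevity.

```python
NESTED_FIELDS = ("list_val", "set_val")


def to_json(message):
    """Encode the dict stand-in as plain JSON-compatible Python values."""
    which = next(iter(message), None)
    if which is None:
        return None  # empty Value
    if which in NESTED_FIELDS:
        # Both containers become a JSON array, so the field name is dropped.
        return [to_json(inner) for inner in message[which]]
    return list(message[which])


def from_json(value):
    """Decode JSON values; nested arrays always come back as list_val."""
    if value is None:
        return {}
    # any() rather than value[0]: the first element may be None.
    if any(isinstance(item, list) for item in value):
        return {"list_val": [from_json(item) for item in value]}
    return {"string_list_val": list(value)}
```

In this sketch a value written as `set_val` comes back as `list_val` after a JSON round trip, so a consumer that needs set semantics would have to recover them from the field's declared type rather than from the payload.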

sdk/python/feast/protos/feast/core/Aggregation_pb2.py

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

sdk/python/feast/protos/feast/core/Aggregation_pb2.pyi

Lines changed: 1 addition & 1 deletion
@@ -25,11 +25,11 @@ class Aggregation(google.protobuf.message.Message):
 NAME_FIELD_NUMBER: builtins.int
 column: builtins.str
 function: builtins.str
-name: builtins.str
 @property
 def time_window(self) -> google.protobuf.duration_pb2.Duration: ...
 @property
 def slide_interval(self) -> google.protobuf.duration_pb2.Duration: ...
+name: builtins.str
 def __init__(
     self,
     *,

sdk/python/feast/protos/feast/core/FeatureView_pb2.pyi

Lines changed: 6 additions & 2 deletions
@@ -118,14 +118,18 @@ class FeatureViewSpec(google.protobuf.message.Message):
     """
 @property
 def batch_source(self) -> feast.core.DataSource_pb2.DataSource:
-    """Batch/Offline DataSource where this view can retrieve offline feature data."""
+    """Batch/Offline DataSource where this view can retrieve offline feature data.
+    Optional: if not set, the feature view has no associated batch data source (e.g. purely derived views).
+    """
 online: builtins.bool
 """Whether these features should be served online or not
 This is also used to determine whether the features should be written to the online store
 """
 @property
 def stream_source(self) -> feast.core.DataSource_pb2.DataSource:
-    """Streaming DataSource from where this view can consume "online" feature data."""
+    """Streaming DataSource from where this view can consume "online" feature data.
+    Optional: only required for streaming feature views.
+    """
 description: builtins.str
 """Description of the feature view."""
 owner: builtins.str

sdk/python/feast/protos/feast/serving/GrpcServer_pb2.pyi

Lines changed: 9 additions & 9 deletions
@@ -4,7 +4,7 @@ isort:skip_file
 """
 import builtins
 import collections.abc
-import feast.protos.feast.types.Value_pb2
+import feast.types.Value_pb2
 import google.protobuf.descriptor
 import google.protobuf.internal.containers
 import google.protobuf.message
@@ -42,12 +42,12 @@ class PushRequest(google.protobuf.message.Message):
 VALUE_FIELD_NUMBER: builtins.int
 key: builtins.str
 @property
-def value(self) -> feast.protos.feast.types.Value_pb2.Value: ...
+def value(self) -> feast.types.Value_pb2.Value: ...
 def __init__(
     self,
     *,
     key: builtins.str = ...,
-    value: feast.protos.feast.types.Value_pb2.Value | None = ...,
+    value: feast.types.Value_pb2.Value | None = ...,
 ) -> None: ...
 def HasField(self, field_name: typing_extensions.Literal["value", b"value"]) -> builtins.bool: ...
 def ClearField(self, field_name: typing_extensions.Literal["key", b"key", "value", b"value"]) -> None: ...
@@ -63,15 +63,15 @@ class PushRequest(google.protobuf.message.Message):
 allow_registry_cache: builtins.bool
 to: builtins.str
 @property
-def typed_features(self) -> google.protobuf.internal.containers.MessageMap[builtins.str, feast.protos.feast.types.Value_pb2.Value]: ...
+def typed_features(self) -> google.protobuf.internal.containers.MessageMap[builtins.str, feast.types.Value_pb2.Value]: ...
 def __init__(
     self,
     *,
     features: collections.abc.Mapping[builtins.str, builtins.str] | None = ...,
     stream_feature_view: builtins.str = ...,
     allow_registry_cache: builtins.bool = ...,
     to: builtins.str = ...,
-    typed_features: collections.abc.Mapping[builtins.str, feast.protos.feast.types.Value_pb2.Value] | None = ...,
+    typed_features: collections.abc.Mapping[builtins.str, feast.types.Value_pb2.Value] | None = ...,
 ) -> None: ...
 def ClearField(self, field_name: typing_extensions.Literal["allow_registry_cache", b"allow_registry_cache", "features", b"features", "stream_feature_view", b"stream_feature_view", "to", b"to", "typed_features", b"typed_features"]) -> None: ...

@@ -116,12 +116,12 @@ class WriteToOnlineStoreRequest(google.protobuf.message.Message):
 VALUE_FIELD_NUMBER: builtins.int
 key: builtins.str
 @property
-def value(self) -> feast.protos.feast.types.Value_pb2.Value: ...
+def value(self) -> feast.types.Value_pb2.Value: ...
 def __init__(
     self,
     *,
     key: builtins.str = ...,
-    value: feast.protos.feast.types.Value_pb2.Value | None = ...,
+    value: feast.types.Value_pb2.Value | None = ...,
 ) -> None: ...
 def HasField(self, field_name: typing_extensions.Literal["value", b"value"]) -> builtins.bool: ...
 def ClearField(self, field_name: typing_extensions.Literal["key", b"key", "value", b"value"]) -> None: ...
@@ -135,14 +135,14 @@ class WriteToOnlineStoreRequest(google.protobuf.message.Message):
 feature_view_name: builtins.str
 allow_registry_cache: builtins.bool
 @property
-def typed_features(self) -> google.protobuf.internal.containers.MessageMap[builtins.str, feast.protos.feast.types.Value_pb2.Value]: ...
+def typed_features(self) -> google.protobuf.internal.containers.MessageMap[builtins.str, feast.types.Value_pb2.Value]: ...
 def __init__(
     self,
     *,
     features: collections.abc.Mapping[builtins.str, builtins.str] | None = ...,
     feature_view_name: builtins.str = ...,
     allow_registry_cache: builtins.bool = ...,
-    typed_features: collections.abc.Mapping[builtins.str, feast.protos.feast.types.Value_pb2.Value] | None = ...,
+    typed_features: collections.abc.Mapping[builtins.str, feast.types.Value_pb2.Value] | None = ...,
 ) -> None: ...
 def ClearField(self, field_name: typing_extensions.Literal["allow_registry_cache", b"allow_registry_cache", "features", b"features", "feature_view_name", b"feature_view_name", "typed_features", b"typed_features"]) -> None: ...
