Commit 4c8502e

soooojinlee and claude committed
refactor: Replace combinatorial nested collection enums with recursive VALUE_LIST/VALUE_SET
Replace 4 combinatorial enum values (LIST_LIST=42, LIST_SET=43, SET_LIST=44, SET_SET=45) with 2 recursive enum values (VALUE_LIST=42, VALUE_SET=43) that use RepeatedValue to enable unlimited nesting depth. This is a breaking change for an unreleased feature, as suggested in PR feast-dev#6132 review.

Key changes:
- Proto: Remove 4 enum/oneof fields, add VALUE_LIST/VALUE_SET
- Python: Update ValueType enum, type system, serialization, field persistence
- JSON: Update proto_json encode/decode for new field names
- Tests: Rewrite all nested collection tests (204 tests passing)
- Docs: Update type-system.md for recursive design

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
1 parent 4097ca5 commit 4c8502e

13 files changed

Lines changed: 163 additions & 276 deletions

docs/reference/type-system.md

Lines changed: 14 additions & 17 deletions
@@ -88,21 +88,21 @@ All primitive types (except `Map` and `Json`) have corresponding set types for s
 
 ### Nested Collection Types
 
-Feast supports 2-level nested collections, combining Array and Set types:
+Feast supports arbitrarily nested collections using a recursive `VALUE_LIST` / `VALUE_SET` design. The outer container determines the proto enum (`VALUE_LIST` for `Array(…)`, `VALUE_SET` for `Set(…)`), while the full inner type structure is persisted via a mandatory `feast:nested_inner_type` Field tag.
 
 | Feast Type | Python Type | ValueType | Description |
 |------------|-------------|-----------|-------------|
-| `Array(Array(T))` | `List[List[T]]` | `LIST_LIST` | List of lists |
-| `Array(Set(T))` | `List[List[T]]` | `LIST_SET` | List of sets (inner elements deduplicated) |
-| `Set(Array(T))` | `List[List[T]]` | `SET_LIST` | Set of lists |
-| `Set(Set(T))` | `List[List[T]]` | `SET_SET` | Set of sets (inner elements deduplicated) |
+| `Array(Array(T))` | `List[List[T]]` | `VALUE_LIST` | List of lists |
+| `Array(Set(T))` | `List[List[T]]` | `VALUE_LIST` | List of sets |
+| `Set(Array(T))` | `List[List[T]]` | `VALUE_SET` | Set of lists |
+| `Set(Set(T))` | `List[List[T]]` | `VALUE_SET` | Set of sets |
+| `Array(Array(Array(T)))` | `List[List[List[T]]]` | `VALUE_LIST` | 3-level nesting |
 
-Where `T` is any supported primitive type (Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp).
+Where `T` is any supported primitive type (Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp) or another nested collection type.
 
 **Notes:**
-- Nesting is limited to 2 levels. `Array(Array(Array(T)))` will raise a `ValueError`.
-- Inner type information is preserved via Field tags (`feast:nested_inner_type`) and restored during deserialization.
-- For `Array(Set(T))` and `Set(Set(T))`, inner collection elements are automatically deduplicated.
+- Nesting depth is **unlimited**. `Array(Array(Array(T)))`, `Set(Array(Set(T)))`, etc. are all supported.
+- Inner type information is preserved via Field tags (`feast:nested_inner_type`) and restored during deserialization. This tag is mandatory for nested collection types.
 - Empty inner collections (`[]`) are stored as empty proto values and round-trip as `None`. For example, `[[1, 2], [], [3]]` becomes `[[1, 2], None, [3]]` after a write-read cycle.
 
 ### Map Types
@@ -315,23 +315,20 @@ unique_devices = {uuid.uuid4(), uuid.uuid4()}
 
 ### Nested Collection Type Usage Examples
 
-Nested collections allow storing multi-dimensional data:
+Nested collections allow storing multi-dimensional data with unlimited depth:
 
 ```python
 # List of lists — e.g., weekly score history per user
 weekly_scores = [[85.0, 90.5, 78.0], [92.0, 88.5], [95.0, 91.0, 87.5]]
 
 # List of sets — e.g., unique tags assigned per category
 unique_tags_per_category = [["python", "ml"], ["rust", "systems"], ["python", "web"]]
-# Inner sets are automatically deduplicated:
-# [["python", "ml"], ...] (duplicates within each inner set are removed)
 
-# Set of lists — e.g., distinct ordered sequences observed
-distinct_sequences = [[1, 2, 3], [4, 5], [1, 2, 3]]
+# 3-level nesting — e.g., multi-dimensional matrices
+Field(name="tensor", dtype=Array(Array(Array(Float64))))
 
-# Set of sets — e.g., distinct groups of unique items
-distinct_groups = [["a", "b"], ["c", "d"], ["a", "b"]]
-# Inner elements are deduplicated within each set
+# Mixed nesting
+Field(name="grouped_tags", dtype=Array(Set(Array(String))))
 ```
 
 **Limitation:** Empty inner collections round-trip as `None`:
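
The limitation stated at the end of this hunk can be illustrated with a small stand-alone sketch. It does not call Feast: `simulate_round_trip` is a hypothetical helper that only mimics the documented write-read behavior, in which an empty inner collection comes back as `None`. Applying the same rule at every nesting level is an assumption made here for illustration.

```python
def simulate_round_trip(value):
    """Mimic the documented behavior: empty inner collections become None.

    Hypothetical helper for illustration only; it is not part of Feast.
    """
    if not isinstance(value, list):
        return value
    return [
        # An empty inner list is replaced by None; anything else recurses.
        None if isinstance(item, list) and not item else simulate_round_trip(item)
        for item in value
    ]


print(simulate_round_trip([[1, 2], [], [3]]))  # [[1, 2], None, [3]]
```

Readers of such a feature may therefore want to treat `None` and `[]` as equivalent for inner collections.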

protos/feast/types/Value.proto

Lines changed: 4 additions & 8 deletions
@@ -63,10 +63,8 @@ message ValueType {
         TIME_UUID_LIST = 39;
         UUID_SET = 40;
         TIME_UUID_SET = 41;
-        LIST_LIST = 42;
-        LIST_SET = 43;
-        SET_LIST = 44;
-        SET_SET = 45;
+        VALUE_LIST = 42;
+        VALUE_SET = 43;
     }
 }
 
@@ -112,10 +110,8 @@ message Value {
         StringList time_uuid_list_val = 39;
         StringSet uuid_set_val = 40;
         StringSet time_uuid_set_val = 41;
-        RepeatedValue list_list_val = 42;
-        RepeatedValue list_set_val = 43;
-        RepeatedValue set_list_val = 44;
-        RepeatedValue set_set_val = 45;
+        RepeatedValue list_val = 42;
+        RepeatedValue set_val = 43;
     }
 }
 
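
The recursion that `list_val` and `set_val` enable can be sketched without the generated protobuf classes. The dataclasses below are simplified stand-ins that borrow the names `Value`, `RepeatedValue`, `val`, `int64_val`, and `list_val` from the diff; `encode` and `decode` are hypothetical helpers, and only integers and lists are modeled. The sketch does not reproduce the empty-collection behavior described in the docs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RepeatedValue:
    # Stand-in for the proto message: a repeated field of Value.
    val: List["Value"] = field(default_factory=list)


@dataclass
class Value:
    # Stand-in for the proto oneof: at most one of these is set.
    int64_val: Optional[int] = None
    list_val: Optional[RepeatedValue] = None


def encode(py):
    """Wrap a Python int or arbitrarily nested list in stand-in Values."""
    if isinstance(py, list):
        return Value(list_val=RepeatedValue([encode(item) for item in py]))
    return Value(int64_val=py)


def decode(value):
    """Invert encode by walking the nested RepeatedValue structure."""
    if value.list_val is not None:
        return [decode(item) for item in value.list_val.val]
    return value.int64_val


nested = [[[1, 2], [3]], [[4]]]
assert decode(encode(nested)) == nested
```

Because a `RepeatedValue` holds `Value`s and a `Value` may hold a `RepeatedValue`, two fields are enough for any depth, which is why the four combinatorial fields could be dropped.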

sdk/python/feast/field.py

Lines changed: 1 addition & 7 deletions
@@ -171,13 +171,7 @@ def from_proto(cls, field_proto: FieldProto):
             dtype = Array(inner_struct)
             user_tags = {k: v for k, v in tags.items() if k not in internal_tags}
         elif (
-            value_type
-            in (
-                ValueType.LIST_LIST,
-                ValueType.LIST_SET,
-                ValueType.SET_LIST,
-                ValueType.SET_SET,
-            )
+            value_type in (ValueType.VALUE_LIST, ValueType.VALUE_SET)
             and NESTED_COLLECTION_INNER_TYPE_TAG in tags
         ):
             dtype = _str_to_feast_type(tags[NESTED_COLLECTION_INNER_TYPE_TAG])
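
Since the enum now records only the outer container, the tag string has to carry the rest of the type. The sketch below shows one way such a string could be parsed and mapped back to an outer enum name. It is not Feast's `_str_to_feast_type`: the tag format `Array(Set(String))` is assumed from the type names in the docs, and `parse_nested_type` and `outer_value_type` are hypothetical helpers.

```python
from typing import Tuple, Union

# A parsed type is either a primitive name or (container, inner type).
ParsedType = Union[str, Tuple[str, "ParsedType"]]


def parse_nested_type(text: str) -> ParsedType:
    """Parse e.g. 'Array(Set(String))' into ('Array', ('Set', 'String'))."""
    text = text.strip()
    for container in ("Array", "Set"):
        prefix = container + "("
        if text.startswith(prefix) and text.endswith(")"):
            return (container, parse_nested_type(text[len(prefix):-1]))
    return text  # primitive type name such as 'String'


def outer_value_type(parsed: ParsedType) -> str:
    """Pick the enum name from the outer container of a nested type."""
    if isinstance(parsed, tuple) and isinstance(parsed[1], tuple):
        return "VALUE_LIST" if parsed[0] == "Array" else "VALUE_SET"
    raise ValueError(f"not a nested collection type: {parsed!r}")


parsed = parse_nested_type("Array(Set(Array(String)))")
print(parsed)                    # ('Array', ('Set', ('Array', 'String')))
print(outer_value_type(parsed))  # VALUE_LIST
```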

sdk/python/feast/infra/online_stores/remote.py

Lines changed: 1 addition & 1 deletion
@@ -108,7 +108,7 @@ def _proto_value_to_transport_value(proto_value: ValueProto) -> Any:
 
     # Nested collection types use feast_value_type_to_python_type
     # which handles recursive conversion of RepeatedValue protos.
-    if val_attr in ("list_list_val", "list_set_val", "set_list_val", "set_set_val"):
+    if val_attr in ("list_val", "set_val"):
         return feast_value_type_to_python_type(proto_value)
 
     # Map/Struct types are converted to Python dicts by

sdk/python/feast/proto_json.py

Lines changed: 3 additions & 3 deletions
@@ -63,7 +63,7 @@ def to_json_object(printer: _Printer, message: ProtoMessage) -> JsonObject:
         # to JSON. The parse back result will be different from original message.
         if which is None or which == "null_val":
             return None
-        elif which in ("list_list_val", "list_set_val", "set_list_val", "set_set_val"):
+        elif which in ("list_val", "set_val"):
             # Nested collection: RepeatedValue containing Values
             repeated = getattr(message, which)
             value = [
@@ -96,13 +96,13 @@ def from_json_object(
                 # Nested collection (list of lists).
                 # Check any() to handle cases where the first element is None
                 # (empty inner collections round-trip through proto as None).
-                # Default to list_list_val since JSON transport loses the
+                # Default to list_val since JSON transport loses the
                 # outer/inner set distinction.
                 rv = RepeatedValue()
                 for inner in value:
                     inner_val = rv.val.add()
                     from_json_object(parser, inner, inner_val)
-                message.list_list_val.CopyFrom(rv)
+                message.list_val.CopyFrom(rv)
             elif isinstance(value[0], bool):
                 message.bool_list_val.val.extend(value)
             elif isinstance(value[0], str):
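
The comment in this hunk says JSON transport loses the set distinction, which is why decoding defaults to `list_val`. The standard-library sketch below shows the underlying reason: JSON has arrays but no set type, so a set must be sent as an array and comes back as a list. `to_jsonable` is a hypothetical helper, not part of `proto_json.py`, and sorting the elements is only a choice made here for stable output.

```python
import json


def to_jsonable(value):
    """Convert nested sets/lists to plain lists so json.dumps accepts them."""
    if isinstance(value, (set, frozenset)):
        # Sets are unordered; sort for a deterministic JSON array.
        return sorted(to_jsonable(v) for v in value)
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value


tags = [{"python", "ml"}, {"rust"}]
restored = json.loads(json.dumps(to_jsonable(tags)))
print(restored)  # [['ml', 'python'], ['rust']]
```

After the round trip every container is a `list`, so a decoder working from the JSON alone cannot tell whether the sender meant `Array` or `Set`.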
