Commit 180434b

soooojinlee and claude committed
feat: Add UUID_SET/TIME_UUID_SET support and update type system docs
Add Set(Uuid) and Set(TimeUuid) as feature types with full roundtrip support, backward compatibility, and documentation for all UUID types.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ada47fa commit 180434b

13 files changed, +231 -67 lines changed


docs/getting-started/concepts/feast-types.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ To make this possible, Feast itself has a type system for all the types it is ab

 Feast's type system is built on top of [protobuf](https://github.com/protocolbuffers/protobuf). The messages that make up the type system can be found [here](https://github.com/feast-dev/feast/blob/master/protos/feast/types/Value.proto), and the corresponding python classes that wrap them can be found [here](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/types.py).

-Feast supports primitive data types (numerical values, strings, bytes, booleans and timestamps). The only complex data type Feast supports is Arrays, and arrays cannot contain other arrays.
+Feast supports primitive data types (numerical values, strings, bytes, booleans, timestamps, and UUIDs). Feast also supports complex data types: Arrays, Sets, and Maps. Arrays and Sets cannot contain other Arrays or Sets. For a complete reference of all supported types, see the [Type System reference](../../reference/type-system.md).

 Each feature or schema field in Feast is associated with a data type, which is stored in Feast's [registry](registry.md). These types are also used to ensure that Feast operates on values correctly (e.g. making sure that timestamp columns used for [point-in-time correct joins](point-in-time-joins.md) actually have the timestamp type).

docs/reference/type-system.md

Lines changed: 44 additions & 4 deletions
@@ -24,6 +24,8 @@ Feast supports the following data types:
 | `Bytes` | `bytes` | Binary data |
 | `Bool` | `bool` | Boolean value |
 | `UnixTimestamp` | `datetime` | Unix timestamp (nullable) |
+| `Uuid` | `uuid.UUID` | UUID (any version) |
+| `TimeUuid` | `uuid.UUID` | Time-based UUID (version 1) |

 ### Array Types

@@ -39,6 +41,8 @@ All primitive types have corresponding array (list) types:
 | `Array(Bytes)` | `List[bytes]` | List of binary data |
 | `Array(Bool)` | `List[bool]` | List of booleans |
 | `Array(UnixTimestamp)` | `List[datetime]` | List of timestamps |
+| `Array(Uuid)` | `List[uuid.UUID]` | List of UUIDs |
+| `Array(TimeUuid)` | `List[uuid.UUID]` | List of time-based UUIDs |

 ### Set Types

@@ -54,6 +58,8 @@ All primitive types (except Map) have corresponding set types for storing unique
 | `Set(Bytes)` | `Set[bytes]` | Set of unique binary data |
 | `Set(Bool)` | `Set[bool]` | Set of unique booleans |
 | `Set(UnixTimestamp)` | `Set[datetime]` | Set of unique timestamps |
+| `Set(Uuid)` | `Set[uuid.UUID]` | Set of unique UUIDs |
+| `Set(TimeUuid)` | `Set[uuid.UUID]` | Set of unique time-based UUIDs |

 **Note:** Set types automatically remove duplicate values. When converting from lists or other iterables to sets, duplicates are eliminated.
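
For the new UUID set types, the deduplication described in the note can be illustrated with plain Python, since `uuid.UUID` objects are hashable and compare by value. This is a minimal sketch using only the standard library; the variable names are illustrative and are not part of Feast.

```python
import uuid

# The same device reported twice, plus one other device.
device_a = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")
device_b = uuid.UUID("6ba7b811-9dad-11d1-80b4-00c04fd430c8")
reported = [device_a, device_b, uuid.UUID(str(device_a))]

# Converting to a set drops the repeated value. Equality is by UUID value,
# not object identity, so the re-parsed copy of device_a counts as a duplicate.
unique_device_ids = set(reported)

print(len(reported), len(unique_device_ids))
```

A value built this way is the kind of Python object a `Set(Uuid)` field is documented to map to (`Set[uuid.UUID]`).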

@@ -77,7 +83,7 @@ from datetime import timedelta
 from feast import Entity, FeatureView, Field, FileSource
 from feast.types import (
     Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp,
-    Array, Set, Map
+    Uuid, TimeUuid, Array, Set, Map
 )

 # Define a data source
@@ -107,7 +113,9 @@ user_features = FeatureView(
         Field(name="profile_picture", dtype=Bytes),
         Field(name="is_active", dtype=Bool),
         Field(name="last_login", dtype=UnixTimestamp),
-
+        Field(name="session_id", dtype=Uuid),
+        Field(name="event_id", dtype=TimeUuid),
+
         # Array types
         Field(name="daily_steps", dtype=Array(Int32)),
         Field(name="transaction_history", dtype=Array(Int64)),
@@ -117,13 +125,17 @@ user_features = FeatureView(
         Field(name="document_hashes", dtype=Array(Bytes)),
         Field(name="notification_settings", dtype=Array(Bool)),
         Field(name="login_timestamps", dtype=Array(UnixTimestamp)),
-
+        Field(name="related_session_ids", dtype=Array(Uuid)),
+        Field(name="event_chain", dtype=Array(TimeUuid)),
+
         # Set types (unique values only)
         Field(name="visited_pages", dtype=Set(String)),
         Field(name="unique_categories", dtype=Set(Int32)),
         Field(name="tag_ids", dtype=Set(Int64)),
         Field(name="preferred_languages", dtype=Set(String)),
-
+        Field(name="unique_device_ids", dtype=Set(Uuid)),
+        Field(name="unique_event_ids", dtype=Set(TimeUuid)),
+
         # Map types
         Field(name="user_preferences", dtype=Map),
         Field(name="metadata", dtype=Map),
@@ -151,6 +163,34 @@ tag_list = [100, 200, 300, 100, 200]
 tag_ids = set(tag_list) # {100, 200, 300}
 ```

+### UUID Type Usage Examples
+
+UUID types store universally unique identifiers natively, with support for both random UUIDs and time-based UUIDs:
+
+```python
+import uuid
+
+# Random UUID (version 4) — use Uuid type
+session_id = uuid.uuid4() # e.g., UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')
+
+# Time-based UUID (version 1) — use TimeUuid type
+event_id = uuid.uuid1() # e.g., UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
+
+# UUID values are returned as uuid.UUID objects from get_online_features()
+response = store.get_online_features(
+    features=["user_features:session_id"],
+    entity_rows=[{"user_id": 1}],
+)
+result = response.to_dict()
+# result["session_id"][0] is a uuid.UUID object
+
+# UUID lists
+related_sessions = [uuid.uuid4(), uuid.uuid4(), uuid.uuid4()]
+
+# UUID sets (unique values)
+unique_devices = {uuid.uuid4(), uuid.uuid4()}
+```
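
The distinction between random and time-based UUIDs can be checked with the standard library's `UUID.version` attribute. The sketch below is illustrative only: `is_time_based` is a hypothetical helper, and this diff does not show whether Feast itself enforces a version for `TimeUuid`. Note that both sample literals quoted in the comments of the snippet above parse as version 1 values, including the one shown next to `uuid.uuid4()`.

```python
import uuid


def is_time_based(value: uuid.UUID) -> bool:
    """Hypothetical helper: True for version 1 (time-based) UUIDs."""
    return value.version == 1


random_id = uuid.uuid4()
timed_id = uuid.uuid1()

print(is_time_based(random_id))  # False
print(is_time_based(timed_id))   # True

# The two literals from the documentation comments above.
doc_uuid4_sample = uuid.UUID("a8098c1a-f86e-11da-bd1a-00112444be1e")
doc_uuid1_sample = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")
print(doc_uuid4_sample.version, doc_uuid1_sample.version)  # 1 1
```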
+
 ### Map Type Usage Examples

 Maps can store complex nested data structures:

protos/feast/types/Value.proto

Lines changed: 4 additions & 0 deletions
@@ -57,6 +57,8 @@ message ValueType {
         TIME_UUID = 31;
         UUID_LIST = 32;
         TIME_UUID_LIST = 33;
+        UUID_SET = 34;
+        TIME_UUID_SET = 35;
     }
 }

@@ -96,6 +98,8 @@ message Value {
         string time_uuid_val = 31;
         StringList uuid_list_val = 32;
         StringList time_uuid_list_val = 33;
+        StringSet uuid_set_val = 34;
+        StringSet time_uuid_set_val = 35;
     }
 }
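
Because the new fields are declared as `StringSet`, UUID sets appear to travel as strings at the protobuf level. A rough sketch of that round trip in plain Python follows, assuming the canonical hyphenated form produced by `str()`. The helper names are hypothetical, and the sketch deliberately avoids the generated protobuf classes, whose field layout is not shown in this diff.

```python
import uuid
from typing import List, Set


def uuid_set_to_strings(values: Set[uuid.UUID]) -> List[str]:
    """Hypothetical helper: canonical string form, sorted for stable output."""
    return sorted(str(v) for v in values)


def strings_to_uuid_set(values: List[str]) -> Set[uuid.UUID]:
    """Hypothetical helper: parse strings back into uuid.UUID objects."""
    return {uuid.UUID(s) for s in values}


original = {uuid.uuid4(), uuid.uuid1()}
encoded = uuid_set_to_strings(original)
decoded = strings_to_uuid_set(encoded)

print(decoded == original)
```

Sorting is only there to make the encoded form deterministic; a set carries no ordering of its own.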

sdk/python/feast/infra/online_stores/remote.py

Lines changed: 3 additions & 1 deletion
@@ -40,9 +40,11 @@


 def _json_safe(val: Any) -> Any:
-    """Convert uuid.UUID objects to strings for JSON serialization."""
+    """Convert uuid.UUID objects and sets to JSON-serializable form."""
     if isinstance(val, uuid_module.UUID):
         return str(val)
+    if isinstance(val, set):
+        return [str(v) if isinstance(v, uuid_module.UUID) else v for v in val]
     if isinstance(val, list):
         return [str(v) if isinstance(v, uuid_module.UUID) else v for v in val]
     return val
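
The reason for the new `set` branch is that `json.dumps` rejects Python sets outright. The following is a standalone sketch of the same logic for illustration, not the module's code: `json_safe` is a local copy with the `set` and `list` branches merged, which should behave the same for the inputs shown.

```python
import json
import uuid
from typing import Any


def json_safe(val: Any) -> Any:
    """Local illustration of the helper: UUIDs become strings, sets become lists."""
    if isinstance(val, uuid.UUID):
        return str(val)
    if isinstance(val, (set, list)):
        return [str(v) if isinstance(v, uuid.UUID) else v for v in val]
    return val


device_ids = {uuid.uuid4(), uuid.uuid4()}

# A raw set is not JSON serializable, even before UUIDs are considered.
try:
    json.dumps({"device_ids": device_ids})
    raw_set_ok = True
except TypeError:
    raw_set_ok = False

payload = json.dumps({"device_ids": json_safe(device_ids)})
print(raw_set_ok)
print(payload)
```

The element order in the resulting JSON array is arbitrary, so a consumer that needs set semantics has to rebuild the set on its side.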

sdk/python/feast/on_demand_feature_view.py

Lines changed: 3 additions & 0 deletions
@@ -1108,6 +1108,9 @@ def _get_sample_values_by_type(self) -> dict[ValueType, list[Any]]:
             ValueType.UNIX_TIMESTAMP_LIST: [[_utc_now()]],
             ValueType.UUID_LIST: [[uuid.uuid4(), uuid.uuid4()]],
             ValueType.TIME_UUID_LIST: [[uuid.uuid1(), uuid.uuid1()]],
+            # Set types
+            ValueType.UUID_SET: [{uuid.uuid4(), uuid.uuid4()}],
+            ValueType.TIME_UUID_SET: [{uuid.uuid1(), uuid.uuid1()}],
         }

     @staticmethod

sdk/python/feast/online_response.py

Lines changed: 8 additions & 0 deletions
@@ -122,6 +122,14 @@ def to_arrow(self, include_event_timestamps: bool = False) -> pa.Table:
                         else v
                         for v in values
                     ]
+            elif isinstance(first_valid, set):
+                inner = next((e for e in first_valid if e is not None), None)
+                if isinstance(inner, uuid_module.UUID):
+                    result[key] = [
+                        [str(e) for e in v] if isinstance(v, set) else v for v in values
+                    ]
+                else:
+                    result[key] = [list(v) if isinstance(v, set) else v for v in values]
         return pa.Table.from_pydict(result)

     def to_tensor(
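
The set handling added to `to_arrow` in the hunk above can be exercised without `pyarrow` by isolating the column normalization step. The sketch below assumes the same shape of input (a dict of column name to list of row values); `normalize_set_columns` is a hypothetical name, and the real method goes on to pass the result to `pa.Table.from_pydict`.

```python
import uuid
from typing import Any, Dict, List


def normalize_set_columns(columns: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    """Hypothetical helper mirroring the new branch: sets become lists, UUIDs become strings."""
    result = dict(columns)
    for key, values in columns.items():
        # Inspect the first non-null row to decide how to treat the column.
        first_valid = next((v for v in values if v is not None), None)
        if not isinstance(first_valid, set):
            continue
        inner = next((e for e in first_valid if e is not None), None)
        if isinstance(inner, uuid.UUID):
            result[key] = [
                [str(e) for e in v] if isinstance(v, set) else v for v in values
            ]
        else:
            result[key] = [list(v) if isinstance(v, set) else v for v in values]
    return result


columns = {
    "unique_device_ids": [{uuid.uuid4(), uuid.uuid4()}, None],
    "tag_ids": [{100, 200}, {300}],
}
normalized = normalize_set_columns(columns)
print(normalized["tag_ids"])
```

Null rows pass through unchanged, and the type of a column is inferred from its first non-null row, so a column that mixes sets with other container types would only be partly converted.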
