
Commit df16cbd

soooojinlee and claude authored and committed
feat: Add UUID and TIME_UUID as feature types (feast-dev#5885) (feast-dev#5951)
* feat: Add UUID and TIME_UUID as feature types (feast-dev#5885)

  Signed-off-by: soojin <soojin@dable.io>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Signed-off-by: soojin <soojin@dable.io>

* test: Add unit tests for UUID type support

  Signed-off-by: soojin <soojin@dable.io>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Signed-off-by: soojin <soojin@dable.io>

* style: Fix ruff lint and formatting issues

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Signed-off-by: soojin <soojin@dable.io>

* feat: Add dedicated UUID/TIME_UUID proto fields to Value.proto

  Add uuid_val, time_uuid_val, uuid_list_val, time_uuid_list_val as dedicated oneof fields in the Value proto message, replacing the previous reuse of string_val/string_list_val. This allows UUID types to be identified from the proto field alone without requiring a feature_types side-channel. Backward compatibility is maintained for data previously stored as string_val.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Signed-off-by: soojin <soojin@dable.io>

* fix: Address review feedback for UUID type support

  Signed-off-by: soojin <soojin@dable.io>

* fix: Address review feedback for UUID type support

  Signed-off-by: soojin <soojin@dable.io>

* fix: Address review feedback

  Signed-off-by: soojin <soojin@dable.io>

* fix: Convert uuid.UUID to string for Arrow and JSON serialization

  Signed-off-by: soojin <soojin@dable.io>

* feat: Add UUID_SET/TIME_UUID_SET support and update type system docs

  Add Set(Uuid) and Set(TimeUuid) as feature types with full roundtrip support, backward compatibility, and documentation for all UUID types.

  Signed-off-by: soojin <soojin@dable.io>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Preserve PDF_BYTES/IMAGE_BYTES enum values and add missing SET type mappings

  Keep PDF_BYTES=30 and IMAGE_BYTES=31 at their upstream values instead of renumbering them. Shift UUID types to 32-37 in both proto and Python enum. Also add missing SET type entries in _convert_value_type_str_to_value_type(), convert_array_column(), and _get_sample_values_by_type() for completeness.

  Signed-off-by: soojin <soojin@dable.io>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Correct misleading comment in Set.__init__

  The comment claimed Sets do not support UUID/TimeUuid but the code intentionally allows them. Updated to reflect actual behavior.

  Signed-off-by: soojin <soojin@dable.io>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: Extract UUID Arrow conversion into helper and move import to top

  Signed-off-by: soojin <soojin@dable.io>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Handle UUID types in _proto_value_to_transport_value for JSON serialization

  Return UUID proto fields as plain strings instead of falling through to feast_value_type_to_python_type which converts them to uuid.UUID objects that are not JSON-serializable, causing TypeError during HTTP transport.

  Signed-off-by: soojin <soojin@dable.io>

* chore: Regenerate protobuf files with UUID type support

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Signed-off-by: soojin <soojin@dable.io>

* fix: Fix mypy type ignore comments for UUID collection conversions

  Add [misc] error code to type: ignore comments in UUID list/set proto conversion to satisfy mypy's stricter checking.

  Signed-off-by: Soojin Lee <soooojin.lee@gmail.com>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  Signed-off-by: soojin <soojin@dable.io>

---------

Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: Soojin Lee <soooojin.lee@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: yuanjun220 <1069645408@qq.com>
1 parent 97e3296 commit df16cbd

20 files changed

Lines changed: 620 additions & 116 deletions


docs/getting-started/concepts/feast-types.md

Lines changed: 2 additions & 1 deletion
@@ -9,7 +9,8 @@ Feast supports the following categories of data types:
 
 - **Primitive types**: numerical values (`Int32`, `Int64`, `Float32`, `Float64`), `String`, `Bytes`, `Bool`, and `UnixTimestamp`.
 - **Domain-specific primitives**: `PdfBytes` (PDF binary data for RAG/document pipelines) and `ImageBytes` (image binary data for multimodal pipelines). These are semantic aliases over `Bytes` and must be explicitly declared in schema — no backend infers them.
-- **Array types**: ordered lists of any primitive type, e.g. `Array(Int64)`, `Array(String)`.
+- **UUID types**: `Uuid` and `TimeUuid` for universally unique identifiers. Stored as strings at the proto level but deserialized to `uuid.UUID` objects in Python.
+- **Array types**: ordered lists of any primitive type, e.g. `Array(Int64)`, `Array(String)`, `Array(Uuid)`.
 - **Set types**: unordered collections of unique values for any primitive type, e.g. `Set(String)`, `Set(Int64)`. Set types are not inferred by any backend and must be explicitly declared. They are best suited for online serving use cases.
 - **Map types**: dictionary-like structures with string keys and values that can be any supported Feast type (including nested maps), e.g. `Map`, `Array(Map)`.
 - **JSON type**: opaque JSON data stored as a string at the proto level but semantically distinct from `String` — backends use native JSON types (`jsonb`, `VARIANT`, etc.), e.g. `Json`, `Array(Json)`.
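
The `UUID types` entry above says values are stored as strings at the proto level and come back as `uuid.UUID` objects in Python. A minimal sketch of that round trip, using only the standard library, might look like the following. The helper names `to_wire` and `from_wire` are hypothetical and are not part of Feast.

```python
import uuid


def to_wire(value: uuid.UUID) -> str:
    """Serialize a UUID to its canonical 36-character string form."""
    return str(value)


def from_wire(raw: str) -> uuid.UUID:
    """Parse a stored string back into a uuid.UUID object."""
    return uuid.UUID(raw)


original = uuid.uuid4()
wire = to_wire(original)      # plain string, safe to store in a string field
restored = from_wire(wire)    # uuid.UUID again on the Python side

print(type(wire).__name__, type(restored).__name__)
print(restored == original)
```

Because the string form is canonical, the round trip is lossless, which is what lets a string-typed field carry UUID values.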

docs/reference/type-system.md

Lines changed: 41 additions & 1 deletion
@@ -24,6 +24,8 @@ Feast supports the following data types:
 | `Bytes` | `bytes` | Binary data |
 | `Bool` | `bool` | Boolean value |
 | `UnixTimestamp` | `datetime` | Unix timestamp (nullable) |
+| `Uuid` | `uuid.UUID` | UUID (any version) |
+| `TimeUuid` | `uuid.UUID` | Time-based UUID (version 1) |
 
 ### Domain-Specific Primitive Types
 
@@ -52,6 +54,8 @@ All primitive types have corresponding array (list) types:
 | `Array(Bytes)` | `List[bytes]` | List of binary data |
 | `Array(Bool)` | `List[bool]` | List of booleans |
 | `Array(UnixTimestamp)` | `List[datetime]` | List of timestamps |
+| `Array(Uuid)` | `List[uuid.UUID]` | List of UUIDs |
+| `Array(TimeUuid)` | `List[uuid.UUID]` | List of time-based UUIDs |
 
 ### Set Types
 
@@ -67,6 +71,8 @@ All primitive types (except `Map` and `Json`) have corresponding set types for s
 | `Set(Bytes)` | `Set[bytes]` | Set of unique binary data |
 | `Set(Bool)` | `Set[bool]` | Set of unique booleans |
 | `Set(UnixTimestamp)` | `Set[datetime]` | Set of unique timestamps |
+| `Set(Uuid)` | `Set[uuid.UUID]` | Set of unique UUIDs |
+| `Set(TimeUuid)` | `Set[uuid.UUID]` | Set of unique time-based UUIDs |
 
 **Note:** Set types automatically remove duplicate values. When converting from lists or other iterables to sets, duplicates are eliminated.
 
@@ -169,7 +175,7 @@ from datetime import timedelta
 from feast import Entity, FeatureView, Field, FileSource
 from feast.types import (
     Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp,
-    Array, Set, Map, Json, Struct
+    Uuid, TimeUuid, Array, Set, Map, Json, Struct
 )
 
 # Define a data source
@@ -199,6 +205,8 @@ user_features = FeatureView(
         Field(name="profile_picture", dtype=Bytes),
         Field(name="is_active", dtype=Bool),
         Field(name="last_login", dtype=UnixTimestamp),
+        Field(name="session_id", dtype=Uuid),
+        Field(name="event_id", dtype=TimeUuid),
 
         # Array types
         Field(name="daily_steps", dtype=Array(Int32)),
@@ -209,12 +217,16 @@ user_features = FeatureView(
         Field(name="document_hashes", dtype=Array(Bytes)),
         Field(name="notification_settings", dtype=Array(Bool)),
         Field(name="login_timestamps", dtype=Array(UnixTimestamp)),
+        Field(name="related_session_ids", dtype=Array(Uuid)),
+        Field(name="event_chain", dtype=Array(TimeUuid)),
 
         # Set types (unique values only — see backend caveats above)
         Field(name="visited_pages", dtype=Set(String)),
         Field(name="unique_categories", dtype=Set(Int32)),
         Field(name="tag_ids", dtype=Set(Int64)),
         Field(name="preferred_languages", dtype=Set(String)),
+        Field(name="unique_device_ids", dtype=Set(Uuid)),
+        Field(name="unique_event_ids", dtype=Set(TimeUuid)),
 
         # Map types
         Field(name="user_preferences", dtype=Map),
@@ -250,6 +262,34 @@ tag_list = [100, 200, 300, 100, 200]
 tag_ids = set(tag_list) # {100, 200, 300}
 ```
 
+### UUID Type Usage Examples
+
+UUID types store universally unique identifiers natively, with support for both random UUIDs and time-based UUIDs:
+
+```python
+import uuid
+
+# Random UUID (version 4) — use Uuid type
+session_id = uuid.uuid4() # e.g., UUID('16fd2706-8baf-433b-82eb-8c7fada847da')
+
+# Time-based UUID (version 1) — use TimeUuid type
+event_id = uuid.uuid1() # e.g., UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
+
+# UUID values are returned as uuid.UUID objects from get_online_features()
+response = store.get_online_features(
+    features=["user_features:session_id"],
+    entity_rows=[{"user_id": 1}],
+)
+result = response.to_dict()
+# result["session_id"][0] is a uuid.UUID object
+
+# UUID lists
+related_sessions = [uuid.uuid4(), uuid.uuid4(), uuid.uuid4()]
+
+# UUID sets (unique values)
+unique_devices = {uuid.uuid4(), uuid.uuid4()}
+```
+
 ### Map Type Usage Examples
 
 Maps can store complex nested data structures:
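
The type tables in this file distinguish `Uuid` (any version) from `TimeUuid` (version 1). The sketch below uses only the standard library to show how the version can be read from a `uuid.UUID` and how the timestamp embedded in a version 1 UUID can be recovered. The function name `uuid1_timestamp` is an illustrative assumption, and nothing here implies that Feast performs this check itself.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUIDs count 100-nanosecond intervals since the Gregorian epoch.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_timestamp(value: uuid.UUID) -> datetime:
    """Return the UTC timestamp embedded in a version 1 UUID."""
    if value.version != 1:
        raise ValueError(f"expected a version 1 UUID, got version {value.version}")
    # value.time is in 100 ns units; integer-divide by 10 to get microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=value.time // 10)


random_id = uuid.uuid4()
time_id = uuid.uuid1()

print(random_id.version)  # 4
print(time_id.version)    # 1
print(uuid1_timestamp(time_id).isoformat())
```

A check like this could be useful when deciding whether a column of identifiers fits `TimeUuid` or should be declared as plain `Uuid`.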

protos/feast/types/Value.proto

Lines changed: 12 additions & 0 deletions
@@ -57,6 +57,12 @@ message ValueType {
         JSON_LIST = 33;
         STRUCT = 34;
         STRUCT_LIST = 35;
+        UUID = 36;
+        TIME_UUID = 37;
+        UUID_LIST = 38;
+        TIME_UUID_LIST = 39;
+        UUID_SET = 40;
+        TIME_UUID_SET = 41;
     }
 }
 
@@ -96,6 +102,12 @@ message Value {
         StringList json_list_val = 33;
         Map struct_val = 34;
         MapList struct_list_val = 35;
+        string uuid_val = 36;
+        string time_uuid_val = 37;
+        StringList uuid_list_val = 38;
+        StringList time_uuid_list_val = 39;
+        StringSet uuid_set_val = 40;
+        StringSet time_uuid_set_val = 41;
     }
 }

sdk/python/feast/feature_store.py

Lines changed: 8 additions & 3 deletions
@@ -2768,7 +2768,10 @@ def _doc_feature(x):
             online_features_response=online_features_response,
             data=requested_features_data,
         )
-        return OnlineResponse(online_features_response)
+        feature_types = {
+            f.name: f.dtype.to_value_type() for f in requested_feature_view.features
+        }
+        return OnlineResponse(online_features_response, feature_types=feature_types)
 
     def retrieve_online_documents_v2(
         self,
@@ -3058,7 +3061,8 @@ def _retrieve_from_online_store_v2(
             online_features_response.metadata.feature_names.val.extend(
                 features_to_request
             )
-            return OnlineResponse(online_features_response)
+            feature_types = {f.name: f.dtype.to_value_type() for f in table.features}
+            return OnlineResponse(online_features_response, feature_types=feature_types)
 
         table_entity_values, idxs, output_len = utils._get_unique_entities_from_values(
             entity_key_dict,
@@ -3081,7 +3085,8 @@ def _retrieve_from_online_store_v2(
             data=entity_key_dict,
         )
 
-        return OnlineResponse(online_features_response)
+        feature_types = {f.name: f.dtype.to_value_type() for f in table.features}
+        return OnlineResponse(online_features_response, feature_types=feature_types)
 
     def serve(
         self,

sdk/python/feast/infra/online_stores/online_store.py

Lines changed: 28 additions & 2 deletions
@@ -31,6 +31,7 @@
 from feast.protos.feast.types.Value_pb2 import Value as ValueProto
 from feast.repo_config import RepoConfig
 from feast.stream_feature_view import StreamFeatureView
+from feast.value_type import ValueType
 
 
 class OnlineStore(ABC):
@@ -236,18 +237,21 @@ def get_online_features(
 
         track_online_store_read(_time.monotonic() - _read_start)
 
+        feature_types = self._build_feature_types(grouped_refs)
+
         if requested_on_demand_feature_views:
             utils._augment_response_with_on_demand_transforms(
                 online_features_response,
                 feature_refs,
                 requested_on_demand_feature_views,
                 full_feature_names,
+                feature_types=feature_types,
             )
 
         utils._drop_unneeded_columns(
             online_features_response, requested_result_row_names
         )
-        return OnlineResponse(online_features_response)
+        return OnlineResponse(online_features_response, feature_types=feature_types)
 
     def _check_versioned_read_support(self, grouped_refs):
         """Raise an error if versioned reads are attempted on unsupported stores."""
@@ -367,18 +371,40 @@ async def query_table(table, requested_features):
 
         track_online_store_read(_time.monotonic() - _read_start)
 
+        feature_types = self._build_feature_types(grouped_refs)
+
         if requested_on_demand_feature_views:
             utils._augment_response_with_on_demand_transforms(
                 online_features_response,
                 feature_refs,
                 requested_on_demand_feature_views,
                 full_feature_names,
+                feature_types=feature_types,
             )
 
         utils._drop_unneeded_columns(
             online_features_response, requested_result_row_names
         )
-        return OnlineResponse(online_features_response)
+        return OnlineResponse(online_features_response, feature_types=feature_types)
+
+    @staticmethod
+    def _build_feature_types(
+        grouped_refs: List,
+    ) -> Dict[str, ValueType]:
+        """Build a mapping of feature names to ValueType from grouped feature view refs.
+
+        Includes both bare names and prefixed names (feature_view__feature) so that
+        lookups succeed regardless of the full_feature_names setting.
+        """
+        feature_types: Dict[str, ValueType] = {}
+        for table, requested_features in grouped_refs:
+            table_name = table.projection.name_to_use()
+            for field in table.features:
+                if field.name in requested_features:
+                    vtype = field.dtype.to_value_type()
+                    feature_types[field.name] = vtype
+                    feature_types[f"{table_name}__{field.name}"] = vtype
+        return feature_types
 
     @abstractmethod
     def update(
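
The `_build_feature_types` helper added in this file registers each requested feature under both its bare name and a `view__feature` prefixed name. A standalone sketch of that lookup behaviour follows. It is not the Feast implementation: `SimpleNamespace` objects stand in for feature views and fields, plain strings stand in for `ValueType` members, and `table.name` replaces `table.projection.name_to_use()`, all of which are assumptions made for illustration.

```python
from types import SimpleNamespace
from typing import Dict, Iterable, List, Tuple


def build_feature_types(
    grouped_refs: Iterable[Tuple[SimpleNamespace, List[str]]],
) -> Dict[str, str]:
    """Map bare and view-prefixed feature names to a type label."""
    feature_types: Dict[str, str] = {}
    for table, requested_features in grouped_refs:
        for field in table.features:
            if field.name in requested_features:
                feature_types[field.name] = field.dtype
                feature_types[f"{table.name}__{field.name}"] = field.dtype
    return feature_types


# Hypothetical stand-in for a feature view with two fields.
user_features = SimpleNamespace(
    name="user_features",
    features=[
        SimpleNamespace(name="session_id", dtype="UUID"),
        SimpleNamespace(name="age", dtype="INT32"),
    ],
)

# Only session_id is requested, so age is left out of the mapping.
types = build_feature_types([(user_features, ["session_id"])])
print(types)
# {'session_id': 'UUID', 'user_features__session_id': 'UUID'}
```

Storing both keys means a response column can be resolved whether or not full feature names were requested, at the cost of bare names colliding if two views share a feature name.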

sdk/python/feast/infra/online_stores/remote.py

Lines changed: 24 additions & 3 deletions
@@ -13,6 +13,7 @@
 # limitations under the License.
 import json
 import logging
+import uuid as uuid_module
 from collections import defaultdict
 from datetime import datetime
 from typing import Any, Callable, Dict, List, Literal, Optional, Sequence, Tuple
@@ -38,6 +39,17 @@
 logger = logging.getLogger(__name__)
 
 
+def _json_safe(val: Any) -> Any:
+    """Convert uuid.UUID objects and sets to JSON-serializable form."""
+    if isinstance(val, uuid_module.UUID):
+        return str(val)
+    if isinstance(val, set):
+        return [str(v) if isinstance(v, uuid_module.UUID) else v for v in val]
+    if isinstance(val, list):
+        return [str(v) if isinstance(v, uuid_module.UUID) else v for v in val]
+    return val
+
+
 class RemoteOnlineStoreConfig(FeastConfigBaseModel):
     """Remote Online store config for remote online store"""
 
@@ -103,6 +115,16 @@ def _proto_value_to_transport_value(proto_value: ValueProto) -> Any:
         if val_attr in ("map_list_val", "struct_list_val"):
             return [json.dumps(v) for v in feast_value_type_to_python_type(proto_value)]
 
+        # UUID types are stored as strings in proto — return them directly
+        # to avoid feast_value_type_to_python_type converting to uuid.UUID
+        # objects which are not JSON-serializable.
+        if val_attr in ("uuid_val", "time_uuid_val"):
+            return getattr(proto_value, val_attr)
+        if val_attr in ("uuid_list_val", "time_uuid_list_val"):
+            return list(getattr(proto_value, val_attr).val)
+        if val_attr in ("uuid_set_val", "time_uuid_set_val"):
+            return list(getattr(proto_value, val_attr).val)
+
         return feast_value_type_to_python_type(proto_value)
 
     def online_write_batch(
@@ -128,9 +150,8 @@ def online_write_batch(
             for join_key, entity_value_proto in zip(
                 entity_key_proto.join_keys, entity_key_proto.entity_values
             ):
-                columnar_data[join_key].append(
-                    feast_value_type_to_python_type(entity_value_proto)
-                )
+                val = feast_value_type_to_python_type(entity_value_proto)
+                columnar_data[join_key].append(_json_safe(val))
 
             # Populate feature values – use transport-safe conversion that
             # preserves JSON strings instead of parsing them into dicts.
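
The changes in this file exist because `uuid.UUID` objects cannot be passed to `json.dumps` directly. The sketch below reproduces the idea of the `_json_safe` helper from the diff as a standalone function (here named `json_safe`, with the set and list branches folded together) and shows the failure it avoids. It is an illustration only and does not import anything from Feast.

```python
import json
import uuid
from typing import Any


def json_safe(val: Any) -> Any:
    """Convert uuid.UUID objects, and lists or sets containing them, to JSON-safe values."""
    if isinstance(val, uuid.UUID):
        return str(val)
    if isinstance(val, (set, list)):
        return [str(v) if isinstance(v, uuid.UUID) else v for v in val]
    return val


entity_id = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")

# Passing the raw object raises TypeError because json has no encoder for UUID.
try:
    json.dumps({"user_id": entity_id})
except TypeError as exc:
    print(f"raw UUID fails: {exc}")

payload = json.dumps({"user_id": json_safe(entity_id)})
print(payload)  # {"user_id": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"}
```

Sets are converted to lists because JSON has no set type, so element order in the serialized output is not guaranteed.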

sdk/python/feast/on_demand_feature_view.py

Lines changed: 17 additions & 0 deletions
@@ -1,5 +1,6 @@
 import copy
 import functools
+import uuid
 import warnings
 from types import FunctionType
 from typing import Any, List, Optional, Union, cast
@@ -1162,6 +1163,9 @@ def _get_sample_values_by_type(self) -> dict[ValueType, list[Any]]:
             # Special binary types
             ValueType.PDF_BYTES: [pdf_sample],
             ValueType.IMAGE_BYTES: [image_sample],
+            # UUID types
+            ValueType.UUID: [uuid.uuid4()],
+            ValueType.TIME_UUID: [uuid.uuid1()],
             # List types
             ValueType.BYTES_LIST: [[b"hello world"]],
             ValueType.STRING_LIST: [["hello world"]],
@@ -1171,6 +1175,19 @@ def _get_sample_values_by_type(self) -> dict[ValueType, list[Any]]:
             ValueType.FLOAT_LIST: [[1.0]],
             ValueType.BOOL_LIST: [[True]],
             ValueType.UNIX_TIMESTAMP_LIST: [[_utc_now()]],
+            ValueType.UUID_LIST: [[uuid.uuid4(), uuid.uuid4()]],
+            ValueType.TIME_UUID_LIST: [[uuid.uuid1(), uuid.uuid1()]],
+            # Set types
+            ValueType.BYTES_SET: [{b"hello world", b"foo bar"}],
+            ValueType.STRING_SET: [{"hello world", "foo bar"}],
+            ValueType.INT32_SET: [{1, 2}],
+            ValueType.INT64_SET: [{1, 2}],
+            ValueType.DOUBLE_SET: [{1.0, 2.0}],
+            ValueType.FLOAT_SET: [{1.0, 2.0}],
+            ValueType.BOOL_SET: [{True, False}],
+            ValueType.UNIX_TIMESTAMP_SET: [{_utc_now()}],
+            ValueType.UUID_SET: [{uuid.uuid4(), uuid.uuid4()}],
+            ValueType.TIME_UUID_SET: [{uuid.uuid1(), uuid.uuid1()}],
         }
 
     @staticmethod
