Merged
feat: Add UUID_SET/TIME_UUID_SET support and update type system docs
Add Set(Uuid) and Set(TimeUuid) as feature types with full roundtrip
support, backward compatibility, and documentation for all UUID types.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2 people authored and ntkathole committed Apr 1, 2026
commit fe0bfb209de46b7d6d924b4a29fcf2fdae839257
3 changes: 2 additions & 1 deletion docs/getting-started/concepts/feast-types.md
@@ -9,7 +9,8 @@ Feast supports the following categories of data types:

- **Primitive types**: numerical values (`Int32`, `Int64`, `Float32`, `Float64`), `String`, `Bytes`, `Bool`, and `UnixTimestamp`.
- **Domain-specific primitives**: `PdfBytes` (PDF binary data for RAG/document pipelines) and `ImageBytes` (image binary data for multimodal pipelines). These are semantic aliases over `Bytes` and must be explicitly declared in schema — no backend infers them.
- **Array types**: ordered lists of any primitive type, e.g. `Array(Int64)`, `Array(String)`.
- **UUID types**: `Uuid` and `TimeUuid` for universally unique identifiers. Values are stored as strings at the proto level and deserialized to `uuid.UUID` objects in Python.
- **Array types**: ordered lists of any primitive type, e.g. `Array(Int64)`, `Array(String)`, `Array(Uuid)`.
- **Set types**: unordered collections of unique values for any primitive type, e.g. `Set(String)`, `Set(Int64)`. Set types are not inferred by any backend and must be explicitly declared. They are best suited for online serving use cases.
- **Map types**: dictionary-like structures with string keys and values that can be any supported Feast type (including nested maps), e.g. `Map`, `Array(Map)`.
- **JSON type**: opaque JSON data stored as a string at the proto level but semantically distinct from `String` — backends use native JSON types (`jsonb`, `VARIANT`, etc.), e.g. `Json`, `Array(Json)`.
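
The string-at-rest, `uuid.UUID`-in-Python behavior described for the UUID types can be illustrated with the standard library alone. This is a minimal sketch: `to_wire` and `from_wire` are hypothetical helper names, not Feast functions, and Feast's actual conversion code may differ.

```python
import uuid


def to_wire(value: uuid.UUID) -> str:
    """Serialize a UUID to its canonical 36-character string form."""
    return str(value)


def from_wire(raw: str) -> uuid.UUID:
    """Parse the stored string back into a uuid.UUID object."""
    return uuid.UUID(raw)


# Hypothetical helpers above; only the uuid module calls are real APIs.
session_id = uuid.uuid4()
wire_value = to_wire(session_id)
restored = from_wire(wire_value)
```

Because `uuid.UUID` compares by value, `restored` equals `session_id` after the round trip.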
42 changes: 41 additions & 1 deletion docs/reference/type-system.md
@@ -24,6 +24,8 @@ Feast supports the following data types:
| `Bytes` | `bytes` | Binary data |
| `Bool` | `bool` | Boolean value |
| `UnixTimestamp` | `datetime` | Unix timestamp (nullable) |
| `Uuid` | `uuid.UUID` | UUID (any version) |
| `TimeUuid` | `uuid.UUID` | Time-based UUID (version 1) |

### Domain-Specific Primitive Types

@@ -52,6 +54,8 @@ All primitive types have corresponding array (list) types:
| `Array(Bytes)` | `List[bytes]` | List of binary data |
| `Array(Bool)` | `List[bool]` | List of booleans |
| `Array(UnixTimestamp)` | `List[datetime]` | List of timestamps |
| `Array(Uuid)` | `List[uuid.UUID]` | List of UUIDs |
| `Array(TimeUuid)` | `List[uuid.UUID]` | List of time-based UUIDs |

### Set Types

@@ -67,6 +71,8 @@ All primitive types (except `Map` and `Json`) have corresponding set types for s
| `Set(Bytes)` | `Set[bytes]` | Set of unique binary data |
| `Set(Bool)` | `Set[bool]` | Set of unique booleans |
| `Set(UnixTimestamp)` | `Set[datetime]` | Set of unique timestamps |
| `Set(Uuid)` | `Set[uuid.UUID]` | Set of unique UUIDs |
| `Set(TimeUuid)` | `Set[uuid.UUID]` | Set of unique time-based UUIDs |

**Note:** Set types automatically remove duplicate values. When converting from lists or other iterables to sets, duplicates are eliminated.
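
For the new `Set(Uuid)` rows, the deduplication described in the note works because `uuid.UUID` objects are hashable and compare by value. A small sketch using only the standard library (the variable names are illustrative, not part of Feast):

```python
import uuid

device_a = uuid.UUID("16fd2706-8baf-433b-82eb-8c7fada847da")
device_b = uuid.UUID("886313e1-3b8a-5372-9b90-0c9aee199e5d")

# The same device appears twice in the raw log.
device_log = [device_a, device_b, device_a]

# Converting to a set keeps one entry per distinct UUID.
unique_device_ids = set(device_log)
```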

@@ -169,7 +175,7 @@ from datetime import timedelta
 from feast import Entity, FeatureView, Field, FileSource
 from feast.types import (
     Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp,
-    Array, Set, Map, Json, Struct
+    Uuid, TimeUuid, Array, Set, Map, Json, Struct
 )

# Define a data source
@@ -199,6 +205,8 @@ user_features = FeatureView(
Field(name="profile_picture", dtype=Bytes),
Field(name="is_active", dtype=Bool),
Field(name="last_login", dtype=UnixTimestamp),
Field(name="session_id", dtype=Uuid),
Field(name="event_id", dtype=TimeUuid),

# Array types
Field(name="daily_steps", dtype=Array(Int32)),
@@ -209,12 +217,16 @@ user_features = FeatureView(
Field(name="document_hashes", dtype=Array(Bytes)),
Field(name="notification_settings", dtype=Array(Bool)),
Field(name="login_timestamps", dtype=Array(UnixTimestamp)),
Field(name="related_session_ids", dtype=Array(Uuid)),
Field(name="event_chain", dtype=Array(TimeUuid)),

# Set types (unique values only — see backend caveats above)
Field(name="visited_pages", dtype=Set(String)),
Field(name="unique_categories", dtype=Set(Int32)),
Field(name="tag_ids", dtype=Set(Int64)),
Field(name="preferred_languages", dtype=Set(String)),
Field(name="unique_device_ids", dtype=Set(Uuid)),
Field(name="unique_event_ids", dtype=Set(TimeUuid)),

# Map types
Field(name="user_preferences", dtype=Map),
@@ -250,6 +262,34 @@ tag_list = [100, 200, 300, 100, 200]
tag_ids = set(tag_list) # {100, 200, 300}
```

### UUID Type Usage Examples

UUID types represent universally unique identifiers as `uuid.UUID` objects in Python and cover both random (version 4) and time-based (version 1) UUIDs:

```python
import uuid

# Random UUID (version 4) — use Uuid type
session_id = uuid.uuid4()  # e.g., UUID('16fd2706-8baf-433b-82eb-8c7fada847da')

# Time-based UUID (version 1) — use TimeUuid type
event_id = uuid.uuid1() # e.g., UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')

# UUID values are returned as uuid.UUID objects from get_online_features()
response = store.get_online_features(
    features=["user_features:session_id"],
    entity_rows=[{"user_id": 1}],
)
result = response.to_dict()
# result["session_id"][0] is a uuid.UUID object

# UUID lists
related_sessions = [uuid.uuid4(), uuid.uuid4(), uuid.uuid4()]

# UUID sets (unique values)
unique_devices = {uuid.uuid4(), uuid.uuid4()}
```
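
The difference between a value suited to `Uuid` and one suited to `TimeUuid` can be checked through the `version` attribute of `uuid.UUID`. The sketch below assumes nothing about whether Feast itself validates the version; `is_time_based` is a hypothetical helper.

```python
import uuid


def is_time_based(value: uuid.UUID) -> bool:
    """Return True for version 1 (time-based) UUIDs."""
    return value.version == 1


# Hypothetical check, not a Feast API.
random_id = uuid.uuid4()  # version 4
time_id = uuid.uuid1()    # version 1

checks = (is_time_based(random_id), is_time_based(time_id))
```

A check like this could run before writing to a `TimeUuid` field if the pipeline needs to guarantee time-based identifiers.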

### Map Type Usage Examples

Maps can store complex nested data structures:
4 changes: 4 additions & 0 deletions protos/feast/types/Value.proto
@@ -61,6 +61,8 @@ message ValueType {
TIME_UUID = 37;
UUID_LIST = 38;
TIME_UUID_LIST = 39;
UUID_SET = 40;
TIME_UUID_SET = 41;
}
}

@@ -104,6 +106,8 @@ message Value {
string time_uuid_val = 37;
StringList uuid_list_val = 38;
StringList time_uuid_list_val = 39;
StringSet uuid_set_val = 40;
StringSet time_uuid_set_val = 41;
}
}
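
Since the new `uuid_set_val` and `time_uuid_set_val` fields reuse `StringSet`, a set of UUIDs has to travel as strings. The following sketch shows one way such a conversion could look in plain Python. It does not use the generated proto classes; `encode_uuid_set` and `decode_uuid_set` are hypothetical names, and sorting the output is a choice made here for determinism, not something the diff specifies.

```python
import uuid
from typing import List, Set


def encode_uuid_set(values: Set[uuid.UUID]) -> List[str]:
    """Turn a set of UUIDs into a deterministic list of strings."""
    return sorted(str(v) for v in values)


def decode_uuid_set(raw: List[str]) -> Set[uuid.UUID]:
    """Rebuild the set of uuid.UUID objects from stored strings."""
    return {uuid.UUID(s) for s in raw}


# Illustrative values only.
original = {
    uuid.UUID("a8098c1a-f86e-11da-bd1a-00112444be1e"),
    uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8"),
}
stored = encode_uuid_set(original)
roundtripped = decode_uuid_set(stored)
```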

4 changes: 3 additions & 1 deletion sdk/python/feast/infra/online_stores/remote.py
@@ -40,9 +40,11 @@


 def _json_safe(val: Any) -> Any:
-    """Convert uuid.UUID objects to strings for JSON serialization."""
+    """Convert uuid.UUID objects and sets to JSON-serializable form."""
     if isinstance(val, uuid_module.UUID):
         return str(val)
+    if isinstance(val, set):
+        return [str(v) if isinstance(v, uuid_module.UUID) else v for v in val]
     if isinstance(val, list):
         return [str(v) if isinstance(v, uuid_module.UUID) else v for v in val]
     return val
3 changes: 3 additions & 0 deletions sdk/python/feast/on_demand_feature_view.py
@@ -1177,6 +1177,9 @@ def _get_sample_values_by_type(self) -> dict[ValueType, list[Any]]:
ValueType.UNIX_TIMESTAMP_LIST: [[_utc_now()]],
ValueType.UUID_LIST: [[uuid.uuid4(), uuid.uuid4()]],
ValueType.TIME_UUID_LIST: [[uuid.uuid1(), uuid.uuid1()]],
# Set types
ValueType.UUID_SET: [{uuid.uuid4(), uuid.uuid4()}],
ValueType.TIME_UUID_SET: [{uuid.uuid1(), uuid.uuid1()}],
}

@staticmethod
8 changes: 8 additions & 0 deletions sdk/python/feast/online_response.py
@@ -122,6 +122,14 @@ def to_arrow(self, include_event_timestamps: bool = False) -> pa.Table:
else v
for v in values
]
elif isinstance(first_valid, set):
inner = next((e for e in first_valid if e is not None), None)
if isinstance(inner, uuid_module.UUID):
result[key] = [
[str(e) for e in v] if isinstance(v, set) else v for v in values
]
else:
result[key] = [list(v) if isinstance(v, set) else v for v in values]
return pa.Table.from_pydict(result)

def to_tensor(
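
The set branch added to `to_arrow` in the hunk above can be sketched as a standalone function without `pyarrow`. This is an approximation: `sets_to_arrow_friendly` is a hypothetical name, and the way `first_valid` is derived is an assumption, since that part of `to_arrow` is outside the hunk.

```python
import uuid as uuid_module
from typing import Any, List


def sets_to_arrow_friendly(values: List[Any]) -> List[Any]:
    """Convert a column of sets into lists, stringifying UUID elements."""
    # Assumption: first_valid is the first non-None value in the column.
    first_valid = next((v for v in values if v is not None), None)
    if not isinstance(first_valid, set):
        return values
    inner = next((e for e in first_valid if e is not None), None)
    if isinstance(inner, uuid_module.UUID):
        return [[str(e) for e in v] if isinstance(v, set) else v for v in values]
    return [list(v) if isinstance(v, set) else v for v in values]


device = uuid_module.UUID("16fd2706-8baf-433b-82eb-8c7fada847da")
uuid_column = sets_to_arrow_friendly([{device}, None])
int_column = sets_to_arrow_friendly([{3, 1, 2}])
```

In this sketch, an empty set in the first non-null row leaves `inner` as `None`, so the column falls through to the plain `list(v)` branch and UUID elements in later rows are not stringified.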