Commit 90cd101

feat: Add decimal to supported feature types #6029
Signed-off-by: Nick Quinn <nicholas_quinn@apple.com>
1 parent 4dac5b2 commit 90cd101

40 files changed, +5626 -4510 lines changed

docs/reference/type-system.md

Lines changed: 45 additions & 4 deletions
@@ -26,6 +26,7 @@ Feast supports the following data types:
 | `UnixTimestamp` | `datetime` | Unix timestamp (nullable) |
 | `Uuid` | `uuid.UUID` | UUID (any version) |
 | `TimeUuid` | `uuid.UUID` | Time-based UUID (version 1) |
+| `Decimal` | `decimal.Decimal` | Arbitrary-precision decimal number |

 ### Domain-Specific Primitive Types
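As a rough illustration of why an arbitrary-precision type matters for values such as prices, the sketch below compares binary floats with Python's standard `decimal.Decimal`. It uses only the standard library and does not call Feast; all variable names are illustrative.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so this sum is not equal to 0.3.
float_total = 0.1 + 0.2
float_is_exact = float_total == 0.3

# Decimals built from string literals add exactly for these operands.
decimal_total = Decimal("0.1") + Decimal("0.2")
decimal_is_exact = decimal_total == Decimal("0.3")

# Building a Decimal from a float captures the float's binary error,
# so string literals are the safer way to construct values.
from_float = Decimal(0.1)
from_string = Decimal("0.1")
same_value = from_float == from_string

print(float_is_exact, decimal_is_exact, same_value)
```

Constructing values from strings, as in `Decimal("19.99")`, avoids carrying a float's rounding error into the stored feature.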

@@ -56,6 +57,7 @@ All primitive types have corresponding array (list) types:
 | `Array(UnixTimestamp)` | `List[datetime]` | List of timestamps |
 | `Array(Uuid)` | `List[uuid.UUID]` | List of UUIDs |
 | `Array(TimeUuid)` | `List[uuid.UUID]` | List of time-based UUIDs |
+| `Array(Decimal)` | `List[decimal.Decimal]` | List of arbitrary-precision decimals |

 ### Set Types

@@ -73,6 +75,7 @@ All primitive types (except `Map` and `Json`) have corresponding set types for s
 | `Set(UnixTimestamp)` | `Set[datetime]` | Set of unique timestamps |
 | `Set(Uuid)` | `Set[uuid.UUID]` | Set of unique UUIDs |
 | `Set(TimeUuid)` | `Set[uuid.UUID]` | Set of unique time-based UUIDs |
+| `Set(Decimal)` | `Set[decimal.Decimal]` | Set of unique arbitrary-precision decimals |

 **Note:** Set types automatically remove duplicate values. When converting from lists or other iterables to sets, duplicates are eliminated.
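One subtlety when reasoning about sets of decimals: in plain Python, `decimal.Decimal` values that are numerically equal also hash equal, so a built-in `set` treats `Decimal("9.99")` and `Decimal("9.990")` as the same element. The sketch below shows this with the standard library only. Whether Feast's `Set(Decimal)` deduplicates by numeric value or by string representation is not stated in this commit, so treat this as an illustration of Python semantics rather than of Feast behavior.

```python
from decimal import Decimal

prices = [
    Decimal("9.99"),
    Decimal("19.99"),
    Decimal("9.99"),   # exact duplicate
    Decimal("9.990"),  # same number, different scale
]

# A built-in set compares Decimals numerically, so 9.99 and 9.990 collapse.
unique_by_value = set(prices)

# Deduplicating on the text form keeps differently scaled values apart.
unique_by_text = {str(p) for p in prices}

print(len(unique_by_value), len(unique_by_text))
```

If the scale of a value is meaningful in your data, it may be worth normalizing it before building a set so that both views agree.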

@@ -194,7 +197,7 @@ from datetime import timedelta
 from feast import Entity, FeatureView, Field, FileSource
 from feast.types import (
     Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp,
-    Uuid, TimeUuid, Array, Set, Map, Json, Struct
+    Uuid, TimeUuid, Decimal, Array, Set, Map, Json, Struct
 )

 # Define a data source
@@ -226,6 +229,7 @@ user_features = FeatureView(
         Field(name="last_login", dtype=UnixTimestamp),
         Field(name="session_id", dtype=Uuid),
         Field(name="event_id", dtype=TimeUuid),
+        Field(name="price", dtype=Decimal),

         # Array types
         Field(name="daily_steps", dtype=Array(Int32)),
@@ -238,6 +242,7 @@ user_features = FeatureView(
         Field(name="login_timestamps", dtype=Array(UnixTimestamp)),
         Field(name="related_session_ids", dtype=Array(Uuid)),
         Field(name="event_chain", dtype=Array(TimeUuid)),
+        Field(name="historical_prices", dtype=Array(Decimal)),

         # Set types (unique values only — see backend caveats above)
         Field(name="visited_pages", dtype=Set(String)),
@@ -246,6 +251,7 @@ user_features = FeatureView(
         Field(name="preferred_languages", dtype=Set(String)),
         Field(name="unique_device_ids", dtype=Set(Uuid)),
         Field(name="unique_event_ids", dtype=Set(TimeUuid)),
+        Field(name="unique_prices", dtype=Set(Decimal)),

         # Map types
         Field(name="user_preferences", dtype=Map),
@@ -313,9 +319,44 @@ related_sessions = [uuid.uuid4(), uuid.uuid4(), uuid.uuid4()]
 unique_devices = {uuid.uuid4(), uuid.uuid4()}
 ```

-### Nested Collection Type Usage Examples
+### Decimal Type Usage Examples
+
+The `Decimal` type stores arbitrary-precision decimal numbers using Python's `decimal.Decimal`.
+Values are stored as strings in the proto to preserve full precision — no floating-point rounding occurs.
+
+```python
+import decimal
+
+# Scalar decimal — e.g., a financial price
+price = decimal.Decimal("19.99")
+
+# High-precision value — all digits preserved
+tax_rate = decimal.Decimal("0.08750000000000000000")
+
+# Decimal values are returned as decimal.Decimal objects from get_online_features()
+response = store.get_online_features(
+    features=["product_features:price"],
+    entity_rows=[{"product_id": 42}],
+)
+result = response.to_dict()
+# result["price"][0] is a decimal.Decimal object
+
+# Decimal lists — e.g., a history of prices
+historical_prices = [
+    decimal.Decimal("18.50"),
+    decimal.Decimal("19.00"),
+    decimal.Decimal("19.99"),
+]
+
+# Decimal sets — unique price points seen
+unique_prices = {decimal.Decimal("9.99"), decimal.Decimal("19.99"), decimal.Decimal("29.99")}
+```

-Nested collections allow storing multi-dimensional data with unlimited depth:
+{% hint style="warning" %}
+`Decimal` is **not** inferred from any backend schema. You must declare it explicitly in your feature view schema. The pandas dtype for `Decimal` columns is `object` (holding `decimal.Decimal` instances), not a numeric dtype.
+{% endhint %}
+
+### Nested Collection Type Usage Examples

 ```python
 # List of lists — e.g., weekly score history per user
@@ -420,7 +461,7 @@ Each of these columns must be associated with a Feast type, which requires conve
 * `source_datatype_to_feast_value_type` calls the appropriate method in `type_map.py`. For example, if a `SnowflakeSource` is being examined, `snowflake_python_type_to_feast_value_type` from `type_map.py` will be called.

 {% hint style="info" %}
-**Types that cannot be inferred:** `Set`, `Json`, `Struct`, `PdfBytes`, and `ImageBytes` types are never inferred from backend schemas. If you use these types, you must declare them explicitly in your feature view schema.
+**Types that cannot be inferred:** `Set`, `Json`, `Struct`, `Decimal`, `PdfBytes`, and `ImageBytes` types are never inferred from backend schemas. If you use these types, you must declare them explicitly in your feature view schema.
 {% endhint %}

 ### Materialization

protos/feast/types/Value.proto

Lines changed: 6 additions & 0 deletions
@@ -65,6 +65,9 @@ message ValueType {
         TIME_UUID_SET = 41;
         VALUE_LIST = 42;
         VALUE_SET = 43;
+        DECIMAL = 44;
+        DECIMAL_LIST = 45;
+        DECIMAL_SET = 46;
     }
 }

@@ -112,6 +115,9 @@ message Value {
         StringSet time_uuid_set_val = 41;
         RepeatedValue list_val = 42;
         RepeatedValue set_val = 43;
+        string decimal_val = 44;
+        StringList decimal_list_val = 45;
+        StringSet decimal_set_val = 46;
     }
 }
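The new fields carry decimals as text: a `string` for `decimal_val` and string collections for `decimal_list_val` and `decimal_set_val`. A minimal sketch of that round trip follows, using plain Python strings in place of the generated proto classes. The helper names (`encode_decimal`, `decode_decimal`, and the list variants) are hypothetical and not part of Feast, and the use of `str()` as the text encoding is an assumption.

```python
from decimal import Decimal
from typing import List


def encode_decimal(value: Decimal) -> str:
    """Serialize a Decimal to text, as a string-typed field would hold it."""
    return str(value)


def decode_decimal(text: str) -> Decimal:
    """Parse stored text back into a Decimal."""
    return Decimal(text)


def encode_decimal_list(values: List[Decimal]) -> List[str]:
    """Serialize each element, keeping order."""
    return [encode_decimal(v) for v in values]


def decode_decimal_list(texts: List[str]) -> List[Decimal]:
    """Parse each stored element, keeping order."""
    return [decode_decimal(t) for t in texts]


# Scalar round trip, including trailing zeros.
tax_rate = Decimal("0.08750000000000000000")
stored_rate = encode_decimal(tax_rate)
restored_rate = decode_decimal(stored_rate)

# List round trip.
history = [Decimal("18.50"), Decimal("19.00"), Decimal("19.99")]
stored_history = encode_decimal_list(history)
restored_history = decode_decimal_list(stored_history)

print(stored_rate, stored_history)
```

Because the text form is never routed through a float, the digits and scale written by the producer are the ones the consumer reads back.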

sdk/python/feast/protos/feast/core/Aggregation_pb2.pyi

Lines changed: 35 additions & 30 deletions
@@ -2,44 +2,49 @@
 @generated by mypy-protobuf. Do not edit manually!
 isort:skip_file
 """
-import builtins
-import google.protobuf.descriptor
-import google.protobuf.duration_pb2
-import google.protobuf.message
+
+from google.protobuf import descriptor as _descriptor
+from google.protobuf import duration_pb2 as _duration_pb2
+from google.protobuf import message as _message
+import builtins as _builtins
 import sys
+import typing as _typing

-if sys.version_info >= (3, 8):
-    import typing as typing_extensions
+if sys.version_info >= (3, 10):
+    from typing import TypeAlias as _TypeAlias
 else:
-    import typing_extensions
+    from typing_extensions import TypeAlias as _TypeAlias

-DESCRIPTOR: google.protobuf.descriptor.FileDescriptor
+DESCRIPTOR: _descriptor.FileDescriptor

-class Aggregation(google.protobuf.message.Message):
-    DESCRIPTOR: google.protobuf.descriptor.Descriptor
+@_typing.final
+class Aggregation(_message.Message):
+    DESCRIPTOR: _descriptor.Descriptor

-    COLUMN_FIELD_NUMBER: builtins.int
-    FUNCTION_FIELD_NUMBER: builtins.int
-    TIME_WINDOW_FIELD_NUMBER: builtins.int
-    SLIDE_INTERVAL_FIELD_NUMBER: builtins.int
-    NAME_FIELD_NUMBER: builtins.int
-    column: builtins.str
-    function: builtins.str
-    @property
-    def time_window(self) -> google.protobuf.duration_pb2.Duration: ...
-    @property
-    def slide_interval(self) -> google.protobuf.duration_pb2.Duration: ...
-    name: builtins.str
+    COLUMN_FIELD_NUMBER: _builtins.int
+    FUNCTION_FIELD_NUMBER: _builtins.int
+    TIME_WINDOW_FIELD_NUMBER: _builtins.int
+    SLIDE_INTERVAL_FIELD_NUMBER: _builtins.int
+    NAME_FIELD_NUMBER: _builtins.int
+    column: _builtins.str
+    function: _builtins.str
+    name: _builtins.str
+    @_builtins.property
+    def time_window(self) -> _duration_pb2.Duration: ...
+    @_builtins.property
+    def slide_interval(self) -> _duration_pb2.Duration: ...
     def __init__(
         self,
         *,
-        column: builtins.str = ...,
-        function: builtins.str = ...,
-        time_window: google.protobuf.duration_pb2.Duration | None = ...,
-        slide_interval: google.protobuf.duration_pb2.Duration | None = ...,
-        name: builtins.str = ...,
+        column: _builtins.str = ...,
+        function: _builtins.str = ...,
+        time_window: _duration_pb2.Duration | None = ...,
+        slide_interval: _duration_pb2.Duration | None = ...,
+        name: _builtins.str = ...,
     ) -> None: ...
-    def HasField(self, field_name: typing_extensions.Literal["slide_interval", b"slide_interval", "time_window", b"time_window"]) -> builtins.bool: ...
-    def ClearField(self, field_name: typing_extensions.Literal["column", b"column", "function", b"function", "name", b"name", "slide_interval", b"slide_interval", "time_window", b"time_window"]) -> None: ...
+    _HasFieldArgType: _TypeAlias = _typing.Literal["slide_interval", b"slide_interval", "time_window", b"time_window"] # noqa: Y015
+    def HasField(self, field_name: _HasFieldArgType) -> _builtins.bool: ...
+    _ClearFieldArgType: _TypeAlias = _typing.Literal["column", b"column", "function", b"function", "name", b"name", "slide_interval", b"slide_interval", "time_window", b"time_window"] # noqa: Y015
+    def ClearField(self, field_name: _ClearFieldArgType) -> None: ...

-global___Aggregation = Aggregation
+Global___Aggregation: _TypeAlias = Aggregation # noqa: Y015
