docs/getting-started/concepts/feast-types.md
Lines changed: 36 additions & 2 deletions
@@ -5,10 +5,44 @@ To make this possible, Feast itself has a type system for all the types it is ab
Feast's type system is built on top of [protobuf](https://github.com/protocolbuffers/protobuf). The messages that make up the type system can be found [here](https://github.com/feast-dev/feast/blob/master/protos/feast/types/Value.proto), and the corresponding python classes that wrap them can be found [here](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/types.py).

-Feast supports primitive data types (numerical values, strings, bytes, booleans and timestamps). The only complex data type Feast supports is Arrays, and arrays cannot contain other arrays.
+Feast supports the following categories of data types:
+- **Array types**: ordered lists of any primitive type, e.g. `Array(Int64)`, `Array(String)`.
+- **Set types**: unordered collections of unique values for any primitive type, e.g. `Set(String)`, `Set(Int64)`.
+- **Map types**: dictionary-like structures with string keys and values that can be any supported Feast type (including nested maps), e.g. `Map`, `Array(Map)`.
+- **JSON type**: opaque JSON data stored as a string at the proto level but semantically distinct from `String`; backends use native JSON types (`jsonb`, `VARIANT`, etc.), e.g. `Json`, `Array(Json)`.
+- **Struct type**: schema-aware structured type with named, typed fields. Unlike `Map` (which is schema-free), a `Struct` declares its field names and their types, enabling schema validation, e.g. `Struct({"name": String, "age": Int32})`.
+
+For a complete reference with examples, see [Type System](../../reference/type-system.md).
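To make the `Map` versus `Struct` distinction concrete, here is a minimal pure-Python sketch. It is not Feast's implementation: `validate_struct`, `PERSON_SCHEMA`, and the use of built-in Python types as stand-ins for Feast types are all assumptions made so the example runs without Feast installed.

```python
from typing import Any, Dict

# Hypothetical stand-in for a Struct schema: field name -> expected Python type.
# A real Feast Struct is declared with Feast types (e.g. String, Int32).
PERSON_SCHEMA: Dict[str, type] = {"name": str, "age": int}


def validate_struct(value: Dict[str, Any], schema: Dict[str, type]) -> None:
    """Raise if `value` lacks a declared field or a field has the wrong type."""
    missing = sorted(set(schema) - set(value))
    if missing:
        raise ValueError(f"missing struct fields: {missing}")
    for field, expected in schema.items():
        if not isinstance(value[field], expected):
            raise TypeError(
                f"field {field!r}: expected {expected.__name__}, "
                f"got {type(value[field]).__name__}"
            )


# A Map-like value is schema-free: any string keys are acceptable.
map_value = {"theme": "dark", "retries": 3}

# A Struct-like value is checked against its declared fields.
validate_struct({"name": "Ada", "age": 36}, PERSON_SCHEMA)  # passes silently
```

The point of the sketch is only that a declared schema gives the system something to check against, which a schema-free map cannot offer.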
Each feature or schema field in Feast is associated with a data type, which is stored in Feast's [registry](registry.md). These types are also used to ensure that Feast operates on values correctly (e.g. making sure that timestamp columns used for [point-in-time correct joins](point-in-time-joins.md) actually have the timestamp type).

-As a result, each system that feast interacts with needs a way to translate data types from the native platform, into a feast type. E.g., Snowflake SQL types are converted to Feast types [here](https://rtd.feast.dev/en/master/feast.html#feast.type_map.snowflake_python_type_to_feast_value_type). The onus is therefore on authors of offline or online store connectors to make sure that this type mapping happens correctly.
+As a result, each system that Feast interacts with needs a way to translate data types from the native platform into a Feast type. E.g., Snowflake SQL types are converted to Feast types [here](https://rtd.feast.dev/en/master/feast.html#feast.type_map.snowflake_python_type_to_feast_value_type). The onus is therefore on authors of offline or online store connectors to make sure that this type mapping happens correctly.
+
+### Backend Type Mapping for Complex Types
+
+Map, JSON, and Struct types are supported across all major Feast backends:
+**Note**: When the backend native type is ambiguous (e.g., `jsonb` could be `Map`, `Json`, or `Struct`), the **schema-declared Feast type takes precedence**. The backend-to-Feast type mappings above are only used for schema inference when no explicit type is provided.
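The precedence rule in the note can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Feast code: `resolve_feast_type` is a hypothetical helper, and the entries in `INFERRED_FROM_NATIVE` are made up for the example rather than copied from any real backend mapping.

```python
from typing import Optional

# Illustrative inference table only; real backend-to-Feast mappings may differ.
INFERRED_FROM_NATIVE = {"jsonb": "Map", "text": "String"}


def resolve_feast_type(native_type: str, declared_type: Optional[str] = None) -> str:
    """Return the schema-declared type if present, else infer from the native type."""
    if declared_type is not None:
        # An explicit declaration always wins over inference.
        return declared_type
    try:
        return INFERRED_FROM_NATIVE[native_type]
    except KeyError:
        raise ValueError(
            f"cannot infer a Feast type for native type {native_type!r}"
        ) from None


print(resolve_feast_type("jsonb", declared_type="Struct"))  # declared type is used
print(resolve_feast_type("jsonb"))  # falls back to the inference table
```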
**Note**: Feast currently does *not* support a null type in its type system.
docs/getting-started/concepts/feature-view.md
Lines changed: 38 additions & 0 deletions
@@ -24,6 +24,7 @@ Feature views consist of:
* (optional, but recommended) a schema specifying one or more [features](feature-view.md#field) (without this, Feast will infer the schema by reading from the data source)
* (optional, but recommended) metadata (for example, description, or other free-form metadata via `tags`)
* (optional) a TTL, which limits how far back Feast will look when generating historical datasets
+* (optional) `enable_validation=True`, which enables schema validation during materialization (see [Schema Validation](#schema-validation) below)

Feature views allow Feast to model your existing feature data in a consistent way in both an offline (training) and online (serving) environment. Feature views generally contain features that are properties of a specific object, in which case that object is defined as an entity and included in the feature view.

@@ -159,6 +160,43 @@ Feature names must be unique within a [feature view](feature-view.md#feature-vie

Each field can have additional metadata associated with it, specified as key-value [tags](https://rtd.feast.dev/en/master/feast.html#feast.field.Field).

+## Schema Validation
+
+Feature views support an optional `enable_validation` parameter that enables schema validation during materialization and historical feature retrieval. When enabled, Feast verifies that:
+
+- All declared feature columns are present in the input data.
+- Column data types match the expected Feast types (mismatches are logged as warnings).
+
+This is useful for catching data quality issues early in the pipeline. To enable it:
+
+```python
+from feast import FeatureView, Field
+from feast.types import Int32, Int64, Float32, Json, Map, String, Struct
+
+validated_fv = FeatureView(
+    name="validated_features",
+    entities=[driver],
+    schema=[
+        Field(name="trips_today", dtype=Int64),
+        Field(name="rating", dtype=Float32),
+        Field(name="preferences", dtype=Map),
+        Field(name="config", dtype=Json),  # opaque JSON data
+**JSON vs Map vs Struct**: These three complex types serve different purposes:
+- **`Map`**: Schema-free dictionary with string keys. Use when the keys and values are dynamic.
+- **`Json`**: Opaque JSON data stored as a string. Backends use native JSON types (`jsonb`, `VARIANT`). Use for configuration blobs or API responses where you don't need field-level typing.
+- **`Struct`**: Schema-aware structured type with named, typed fields. Persisted through the registry via Field tags. Use when you know the exact structure and want type safety.
+
+Validation is supported in all compute engines (Local, Spark, and Ray). When a required column is missing, a `ValueError` is raised. Type mismatches are logged as warnings but do not block execution, allowing for safe gradual adoption.
+
+The `enable_validation` parameter is also available on `BatchFeatureView` and `StreamFeatureView`, as well as their respective decorators (`@batch_feature_view` and `@stream_feature_view`).
+

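The validation behaviour described above (a missing column raises `ValueError`, a type mismatch is only logged) can be sketched in plain Python. This is a rough illustration, not Feast's implementation: `validate_columns`, the logger name, and the representation of types as strings are assumptions made for the example.

```python
import logging
from typing import Dict, List

logger = logging.getLogger("validation_sketch")


def validate_columns(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Compare actual column types with the declared schema.

    Raises ValueError if any declared column is missing. Type mismatches are
    logged as warnings and returned, but do not stop execution.
    """
    missing = [name for name in expected if name not in actual]
    if missing:
        raise ValueError(f"missing feature columns: {missing}")

    mismatched = [name for name, dtype in expected.items() if actual[name] != dtype]
    for name in mismatched:
        logger.warning(
            "column %r: expected %s, got %s", name, expected[name], actual[name]
        )
    return mismatched


declared = {"trips_today": "Int64", "rating": "Float32"}
observed = {"trips_today": "Int64", "rating": "Float64"}
mismatches = validate_columns(declared, observed)  # logs one warning for "rating"
```

Treating mismatches as warnings rather than errors is what lets a team switch validation on for an existing pipeline without immediately breaking it.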
## \[Alpha] On demand feature views

On demand feature views allow data scientists to use existing features and request time data (features only available at request time) to transform and create new features. Users define python transformation logic which is executed in both the historical retrieval and online retrieval paths.
Note that this mapping is non-injective; that is, more than one Pandas type may correspond to one Feast type (but not vice versa). In these cases, when converting Feast values to Pandas, the **first** Pandas type in the table above is used.
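A small sketch of what a non-injective mapping implies for the round trip. The rows in `FEAST_TO_PANDAS` are illustrative assumptions, not a copy of the actual table, and `feast_to_pandas` is a hypothetical helper.

```python
# Illustrative rows only: (Feast type, Pandas types in table order).
FEAST_TO_PANDAS = [
    ("STRING", ["str", "category"]),
    ("INT64", ["int64", "uint64"]),
]

# Pandas -> Feast: several Pandas types may share one Feast type.
PANDAS_TO_FEAST = {
    pandas_type: feast_type
    for feast_type, pandas_types in FEAST_TO_PANDAS
    for pandas_type in pandas_types
}


def feast_to_pandas(feast_type: str) -> str:
    """Feast -> Pandas: pick the first Pandas type listed for the Feast type."""
    return dict(FEAST_TO_PANDAS)[feast_type][0]


# "category" maps to STRING, but STRING converts back to "str", so the
# round trip does not necessarily return the original Pandas type.
round_trip = feast_to_pandas(PANDAS_TO_FEAST["category"])
```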
@@ -78,6 +84,12 @@ Here's how Feast types map to BigQuery types when using BigQuery for offline sto
| DOUBLE\_LIST | `ARRAY<FLOAT64>` |
| FLOAT\_LIST | `ARRAY<FLOAT64>` |
| BOOL\_LIST | `ARRAY<BOOL>` |
+| MAP | `JSON` / `STRUCT` |
+| MAP\_LIST | `ARRAY<JSON>` / `ARRAY<STRUCT>` |
+| JSON | `JSON` |
+| JSON\_LIST | `ARRAY<JSON>` |
+| STRUCT | `STRUCT` / `RECORD` |
+| STRUCT\_LIST | `ARRAY<STRUCT>` |

Values that are not specified by the table above will cause an error on conversion.
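As a rough sketch of a table-driven conversion that fails on unlisted values: the dictionary below copies only the single-valued rows visible in the excerpt above, and both `to_bigquery_type` and the choice of `ValueError` are assumptions for illustration, not Feast's actual API or error type.

```python
# Rows taken from the visible part of the table; the full table has more entries.
FEAST_TO_BIGQUERY = {
    "DOUBLE_LIST": "ARRAY<FLOAT64>",
    "FLOAT_LIST": "ARRAY<FLOAT64>",
    "BOOL_LIST": "ARRAY<BOOL>",
    "JSON": "JSON",
    "JSON_LIST": "ARRAY<JSON>",
    "STRUCT_LIST": "ARRAY<STRUCT>",
}


def to_bigquery_type(feast_type: str) -> str:
    """Look up the BigQuery type; anything not in the table is an error."""
    try:
        return FEAST_TO_BIGQUERY[feast_type]
    except KeyError:
        raise ValueError(f"no BigQuery mapping for Feast type {feast_type!r}") from None


bool_list_type = to_bigquery_type("BOOL_LIST")
```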
+Here's how Feast types map to Redshift types when using Redshift for offline storage:
+
+| Feast Type | Redshift Type |
+|-------------|--|
+| Event Timestamp | `TIMESTAMP` / `TIMESTAMPTZ` |
+| BYTES | `VARBYTE` |
+| STRING | `VARCHAR` |
+| INT32 | `INT4` / `SMALLINT` |
+| INT64 | `INT8` / `BIGINT` |
+| DOUBLE | `FLOAT8` / `DOUBLE PRECISION` |
+| FLOAT | `FLOAT4` / `REAL` |
+| BOOL | `BOOL` |
+| MAP | `SUPER` |
+| JSON | `json` / `SUPER` |
+
+Note: Redshift's `SUPER` type stores semi-structured JSON data. During materialization, Feast automatically handles `SUPER` columns that are exported as JSON strings by parsing them back into Python dictionaries before converting to `MAP` proto values.
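The parsing step in the note can be illustrated with the standard `json` module. This is a minimal sketch rather than Feast's code: `normalize_super_value` is a hypothetical helper, and rejecting non-object JSON is an assumption made for the example.

```python
import json
from typing import Any, Dict, Union


def normalize_super_value(value: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Turn a SUPER value exported as a JSON string back into a Python dict."""
    parsed = json.loads(value) if isinstance(value, str) else value
    if not isinstance(parsed, dict):
        # Assumption: only JSON objects are meaningful as map values here.
        raise ValueError(f"expected a JSON object, got {type(parsed).__name__}")
    return parsed


exported = '{"theme": "dark", "retries": 3}'
as_dict = normalize_super_value(exported)
```

Values that already arrive as dictionaries pass through unchanged, so the same helper can be applied to a column regardless of how it was exported.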