On demand feature views (ODFVs) should support Python dicts #2261

@adchia

In some test benchmarks, using regular Python dicts as inputs when executing the transformations is much faster (up to ~10x) than pandas for the online flow. This tends to be the more latency-sensitive flow (offline flows seem to be ~40% slower if using vectorized operations).

Something that looks like:

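# Imports assumed for this sketch; driver_hourly_stats_view and
# val_to_add_request are expected to be defined elsewhere in the feature repo.
from typing import Any, Dict

from feast import Field
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float64

@on_demand_feature_view(
    sources=[driver_hourly_stats_view, val_to_add_request],
    schema=[
        Field(name="conv_rate_plus_val1", dtype=Float64),
        Field(name="conv_rate_plus_val2", dtype=Float64),
    ],
    mode="python",  # proposed: pass plain dicts instead of pandas DataFrames
)
def transformed_conv_rate(driver_hourly_stats: Dict[str, Any], vals_to_add: Dict[str, Any]) -> Dict[str, Any]:
    # Build the output features directly from the input dicts.
    features = {}
    features['conv_rate_plus_val1'] = driver_hourly_stats['conv_rate'] + vals_to_add['val_to_add']
    features['conv_rate_plus_val2'] = driver_hourly_stats['conv_rate'] + vals_to_add['val_to_add_2']
    return features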

might be similar to what we want
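
To make the proposed shape concrete, here is a minimal sketch of the same transformation as a plain Python function, called on a single online row. It leaves out the decorator and does not use Feast at all, so it only illustrates the dict-in, dict-out contract; the names `stored_features` and `request_values` and the sample numbers are made up for illustration.

```python
from typing import Any, Dict


def transformed_conv_rate(
    driver_hourly_stats: Dict[str, Any], vals_to_add: Dict[str, Any]
) -> Dict[str, Any]:
    """Same logic as the proposal above, minus the Feast decorator."""
    conv_rate = driver_hourly_stats["conv_rate"]
    return {
        "conv_rate_plus_val1": conv_rate + vals_to_add["val_to_add"],
        "conv_rate_plus_val2": conv_rate + vals_to_add["val_to_add_2"],
    }


# Hypothetical single-row online request: one entity's stored features plus
# the request-time values. No DataFrame is built for the row.
stored_features = {"conv_rate": 0.5}
request_values = {"val_to_add": 1, "val_to_add_2": 2}

result = transformed_conv_rate(stored_features, request_values)
print(result)
```

For a single row the function does two dict lookups and two additions per output, which is presumably where the online-flow speedup over building a one-row DataFrame would come from.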
