perf: Replace MessageToDict with optimized custom dict builder#6015
Conversation
6a38f3c to
2b81774
Compare
2b81774 to
8fbda9f
Compare
8fbda9f to
f3ea953
Compare
06877fd to
063eeb2
Compare
d3c512d to
c06a75d
Compare
c06a75d to
53a5264
Compare
2616fb8 to
300be07
Compare
300be07 to
a8c7653
Compare
| } | ||
|
|
||
|
|
||
| def response_to_dict_fast(response: GetOnlineFeaturesResponse) -> Dict[str, Any]: |
There was a problem hiding this comment.
why not response_to_dict? better yet, convert_response_to_dict is a better description
There was a problem hiding this comment.
Good call - renamed to convert_response_to_dict. The _fast suffix was leaking implementation details ✅
a8c7653 to
9859bad
Compare
bd8f618 to
eb7547c
Compare
|
@abhijeet-dhumal please resolve conflicts and update PR |
Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
…nt FieldValues Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
… and precision, add base64 encoding for bytes Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
eb7547c to
0c0637b
Compare
| which = v.WhichOneof("val") | ||
| if which is None or which == "null_val": | ||
| return None | ||
| elif "_list_" in which: | ||
| return list(getattr(v, which).val) | ||
| else: | ||
| return getattr(v, which) |
There was a problem hiding this comment.
🔴 bytes_val values cause ORJSONResponse to raise TypeError
_value_to_native at feature_server_utils.py:68 returns raw bytes for bytes_val fields. Even if orjson were installed, orjson.dumps() cannot serialize Python bytes objects and raises TypeError: Type is not JSON serializable: bytes. The old code path returned the dict to FastAPI which processed it through jsonable_encoder (capable of handling bytes). The switch to ORJSONResponse bypasses that encoder, causing a regression for any response containing bytes-typed feature values.
Prompt for agents
In _value_to_native (feature_server_utils.py), when the value type is bytes_val, the raw bytes object is returned. orjson cannot serialize bytes. You need to either: (1) base64-encode bytes values before returning them (e.g. using base64.b64encode(val).decode('ascii')), or (2) not use ORJSONResponse and instead use a serializer that handles bytes. Note that bytes_list_val (handled by the _list_ branch) would also produce a list of bytes objects with the same issue.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
@abhijeet-dhumal bytes_val is still problem since you are using JSONResponse
There was a problem hiding this comment.
Thanks for catching this, Nikhil! Now bytes_val and bytes_list_val now base64-encode the output before returning, so JSONResponse / json.dumps can serialize them without TypeError.
There was a problem hiding this comment.
Please review!
|
@ntkathole Apologies for the late reply here .. I have fixed and addressed the open reviews! |
| from fastapi.concurrency import run_in_threadpool | ||
| from fastapi.logger import logger | ||
| from fastapi.responses import JSONResponse | ||
| from fastapi.responses import JSONResponse, ORJSONResponse |
There was a problem hiding this comment.
ORJSONResponse has been deprecated in FastAPI.
https://fastapi.tiangolo.com/reference/responses/?h=ORJSON#fastapi.responses.ORJSONResponse
Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
Previous implementation used "_list_" substring check which silently returned protobuf objects (not JSON-serializable) for: - *_set_val types (bytes_set_val, string_set_val, int64_set_val, etc.) - list_val / set_val (RepeatedValue with nested Values) - map_val / struct_val (Map<string, Value>) - map_list_val / struct_list_val (MapList) Add tests for all previously uncovered types and a float32 precision consistency test against MessageToDict. Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
…nt set-type behavior Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
8d76583 to
b2f2909
Compare
…p test Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
bytes_val returned raw bytes which cause json.dumps TypeError when JSONResponse serializes the response. Base64-encode these values to match standard protobuf JSON encoding for bytes fields. Fixes Nikhil's review comment: bytes_val is still problem since you are using JSONResponse. Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
…t-dev#6435) PR feast-dev#6015 replaced MessageToDict with a custom dict builder in feature_server.py, which incidentally fixed the crash caused by the removed float_precision kwarg in protobuf >= 7.34.0. Add explicit regression tests to prevent reintroduction: - Verify convert_response_to_dict never calls MessageToDict - Verify double precision round-trips without float_precision kwarg Fixes feast-dev#6435 Assisted-by: Claude Opus 4.6 Signed-off-by: Jonathan Wrede <wrede.jonathan00@gmail.com>
PR #6015 replaced MessageToDict with a custom dict builder in feature_server.py, which incidentally fixed the crash caused by the removed float_precision kwarg in protobuf >= 7.34.0. Add explicit regression tests to prevent reintroduction: - Verify convert_response_to_dict never calls MessageToDict - Verify double precision round-trips without float_precision kwarg Fixes #6435 Assisted-by: Claude Opus 4.6 Signed-off-by: Jonathan Wrede <wrede.jonathan00@gmail.com>
# [0.64.0](v0.63.0...v0.64.0) (2026-06-13) ### Bug Fixes * Add async_supported property to RedisOnlineStore ([9b088fe](9b088fe)) * Add missing feast init templates to operator CRD and enhance persistence documentation ([1941d4d](1941d4d)) * Allow to publish from reference branch ([5458ec8](5458ec8)) * API calls list ([4203eb7](4203eb7)) * **bigquery:** Enable list inference for parquet loads in offline_write_batch ([9243497](9243497)), closes [#5845](#5845) * Bump grpcio dependencies ([07b4782](07b4782)) * **compute-engine/local:** Honor field_mapping on join keys in dedup + join nodes ([#6395](#6395)) ([bd01824](bd01824)) * **dynamodb:** Avoid tag race condition by using diff-based tag updates ([#6479](#6479)) ([bad2b7d](bad2b7d)), closes [#6418](#6418) * **dynamodb:** Fix mypy type for _build_projection_expression return ([217b4da](217b4da)) * Fix intermittent async test failures for DynamoDB and Redis ([63c5eb1](63c5eb1)) * Fix mongodb blog title ([57d28d4](57d28d4)) * Fix shared SQL registry crash - avoid unnecessary UDF deserialization in proto cache building ([ac588d7](ac588d7)) * Fix SparkRetrievalJob.persist() failing for SparkSource ([209d7cd](209d7cd)) * Fixed formatting and image for mongo blog ([#6377](#6377)) ([f8389fb](f8389fb)) * Fixes for ray source ([7f592a4](7f592a4)) * **go:** skip registry refresh when cache_ttl_seconds <= 0 ([97ed40c](97ed40c)) * Handle array of strings columns in Athena materialization ([#6324](#6324)) ([4ed0278](4ed0278)) * make milvus VARCHAR max_length configurable, remove hardcoded 512 limit ([3b98c22](3b98c22)) * **operator:** Set appProtocol: grpc on registry gRPC Service ([#6367](#6367)) ([c9ae2b4](c9ae2b4)) * PyJWT 2.10+ added validation that rejects empty HMAC keys ([e756ffe](e756ffe)) * RemoteOnlineStore sends all features in a single HTTP request ([8f187dd](8f187dd)) * Remove registry proto dump to enforce RBAC and add permission checks to Commit/Refresh RPCs ([328431f](328431f)) * Remove selector migration job - no longer needed ([51c325e](51c325e)) * replace broken .claude skill symlink with correct relative path ([4541690](4541690)) * Replace selector label strip patch with migration Job for upgrade-safe selector uniqueness ([00dea50](00dea50)) * Scope feature view name conflict check to current project in file-based registry ([#6369](#6369)) ([a4fde83](a4fde83)), closes [#6209](#6209) * **snowflake:** Stop double-quoting connection identifiers ([#6462](#6462)) ([e914d59](e914d59)) * **spark:** S3/GCS PyArrow filesystem resolution for staging paths ([#6442](#6442)) ([ae50414](ae50414)) * **trino:** Clean up temporary entity tables after retrieval ([#6381](#6381)) ([d86b13d](d86b13d)), closes [#6306](#6306) * Update go-feature-server base image to Go 1.25 and fix operator Dockerfile COPY permissions ([86ef0bc](86ef0bc)) ### Features * [Backend] Data Quality Monitoring with native compute, multi-backend support, REST API, CLI ([#6202](#6202)) ([5458c37](5458c37)) * Add apache flink compute engine ([#6476](#6476)) ([9636d6a](9636d6a)) * Add demo noteboooks for users ([e362173](e362173)) * Add enabled/disabled toggle for feature views ([#6401](#6401)) ([5f1fa0d](5f1fa0d)), closes [#6395](#6395) * Add Label View to init template ([ec272d5](ec272d5)) * Add mTLS support to remote registry gRPC client ([#6474](#6474)) ([c9602d8](c9602d8)) * Add Prometheus gauges for FeatureStore installation telemetry ([#6354](#6354)) ([1b681b7](1b681b7)) * Adds registry REST API endpoints for managing entities, data sources, and feature views ([#6413](#6413)) ([f77bd1d](f77bd1d)) * Allow CRUD on entities, data sources, and feature views from UI ([#6412](#6412)) ([2321c07](2321c07)) * Allow default openlineage configuration ([#6467](#6467)) ([276b6df](276b6df)) * **bigquery:** Support DATE-type event timestamp columns ([#6362](#6362)) ([753dee5](753dee5)), closes [#2530](#2530) * **cli:** Add `feast projects delete` command (closes [#5095](#5095)) ([#6318](#6318)) ([1a4b96c](1a4b96c)) * Data Quality Monitoring added in feast UI ([#6422](#6422)) ([fa271be](fa271be)) * **dynamodb:** Use ProjectionExpression when requested_features is set ([0adc906](0adc906)), closes [#6058](#6058) * Enhance DataSource and FeatureView modals with error handling and submission states ([96d7169](96d7169)) * Expose registry endpoints on feature server for MCP access ([f77981c](f77981c)) * Feast First-Class LabelView Implementation ([#6292](#6292)) ([c0e7e5d](c0e7e5d)) * Feast-MLflow Integration ([#6235](#6235)) ([7279c75](7279c75)) * Operational metrics for offline store and SOX metrics for both ([#6340](#6340)) ([65b1b80](65b1b80)) * Pre-compute feature service ([8011550](8011550)) * REST API-backed UI for RBAC compatibility and per-page lazy loading ([#6414](#6414)) ([6ae80af](6ae80af)) * Support non-string map key types ([#6382](#6382)) ([#6383](#6383)) ([728aa2e](728aa2e)) * Update FeatureStore CRD with DRA Fields ([01241e4](01241e4)) ### Performance Improvements * Cache feature view resolution in get_online_features to reduce per-request overhead ([55c2f18](55c2f18)) * Optimize feature serving latency with batched async Redis, cached checks fix ([103809a](103809a)) * Replace MessageToDict with optimized custom dict builder ([#6015](#6015)) ([9902064](9902064))
Summary
Fixes #6013
The feature server uses
google.protobuf.json_format.MessageToDictfor convertingGetOnlineFeaturesResponseto JSON, but this generic converter is slow due toreflection-based field introspection.
Changes
feature_server_utils.pywithconvert_response_to_dict()functionGetOnlineFeaturesResponsestructurestatusbool field (omitted when false, matching proto3 default behavior)Performance Impact
~4x faster serialization, saving 5-15ms per request for typical feature responses
with 50-200 features.
Test Plan
test_feature_server_utils.pyMessageToDict+proto_json.patch()formatstatusfieldmake lint-python