[kafka_actions] Fix Avro deserialization for schemas with logical types#23224

Merged
piochelepiotr merged 4 commits into master from piotr.wolski/fix-avro-logical-type-serialization
Apr 9, 2026

@piochelepiotr
Contributor

Summary

  • fastavro.schemaless_reader returns Python-native types for Avro logical types (decimal.Decimal, datetime.datetime, datetime.date, uuid.UUID, bytes), but json.dumps() in _deserialize_avro cannot serialize any of them
  • This caused "Failed to deserialize Avro message: Object of type Decimal is not JSON serializable" errors for schemas using bytes/decimal, timestamp-millis, date, or uuid logical types — both with inline schemas and Schema Registry
  • Adds a custom JSON encoder, _AvroJSONEncoder, that handles all fastavro logical type return values
  • Adds test covering all affected logical types (decimal, timestamp-millis, date, uuid)
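
As a rough illustration of the failure and the fix described in the summary above, here is a minimal sketch of such an encoder. The class name `AvroJSONEncoderSketch`, the sample record, and the base64 handling of `bytes` are assumptions made for this example and are not taken from the merged `_AvroJSONEncoder`. Decimals are emitted as strings, in line with the follow-up commit in this PR.

```python
import base64
import datetime
import decimal
import json
import uuid


class AvroJSONEncoderSketch(json.JSONEncoder):
    """Hypothetical stand-in for _AvroJSONEncoder (not the merged code)."""

    def default(self, obj):
        # Decimal as a string keeps the exact value (see the review discussion).
        if isinstance(obj, decimal.Decimal):
            return str(obj)
        # datetime.datetime is a subclass of datetime.date; both have isoformat().
        if isinstance(obj, (datetime.datetime, datetime.date)):
            return obj.isoformat()
        if isinstance(obj, uuid.UUID):
            return str(obj)
        # Assumption: raw bytes are base64-encoded so the output is valid JSON text.
        if isinstance(obj, bytes):
            return base64.b64encode(obj).decode("ascii")
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


# A record shaped like what a reader could return for logical-type fields.
record = {
    "amount": decimal.Decimal("12.34"),
    "ts": datetime.datetime(2026, 4, 8, 15, 31, tzinfo=datetime.timezone.utc),
    "day": datetime.date(2026, 4, 8),
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "raw": b"\x00\x01",
}

try:
    json.dumps(record)
    plain_dumps_failed = False
except TypeError:
    # The stock encoder rejects Decimal, datetime, date, UUID and bytes.
    plain_dumps_failed = True

encoded = json.dumps(record, cls=AvroJSONEncoderSketch)
decoded = json.loads(encoded)
```

Passing the encoder through `cls=` keeps the call site in the deserializer to a one-argument change, which is presumably why a `json.JSONEncoder` subclass was chosen over pre-walking the record.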

Test plan

  • New test test_avro_logical_types_decimal_timestamp_uuid passes
  • All 26 existing deserializer tests pass
  • Manually verify with a real Kafka topic that has bytes/decimal fields (e.g. enriched_spend_events)

🤖 Generated with Claude Code

fastavro.schemaless_reader returns Python-native types for Avro logical
types (decimal.Decimal, datetime, date, uuid.UUID, bytes), but
json.dumps() cannot serialize these. This caused "Failed to deserialize
Avro message" errors for any schema using bytes/decimal, timestamp-millis,
date, or uuid logical types — including when using Schema Registry.

Add a custom JSON encoder (_AvroJSONEncoder) that handles all fastavro
logical type return values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
piochelepiotr and others added 2 commits April 8, 2026 15:31
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f6b532c959


    def default(self, obj):
        if isinstance(obj, decimal.Decimal):
            return float(obj)

P1: Avoid lossy float conversion for Avro decimals

_AvroJSONEncoder.default now converts decimal.Decimal to float, which silently loses precision for many valid Avro decimal values (especially high-precision/large-scale amounts) and can change numeric values in emitted JSON. This commit fixes deserialization failures, but in financial or ID-like decimal fields it can now produce incorrect data rather than an explicit error; serializing decimals as strings would preserve exact values.
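
The precision concern raised in this comment can be reproduced with the standard library alone. The snippet below is a small sketch, not code from the PR; the sample value is an arbitrary high-precision decimal chosen for illustration.

```python
import decimal
import json

# Arbitrary high-precision value, similar in shape to a financial amount.
value = decimal.Decimal("12345678901234567890.123456789")

# Route 1: convert to float before serializing (what the reviewed code did).
as_float_json = json.dumps(float(value))
round_trip_float = decimal.Decimal(json.loads(as_float_json))

# Route 2: serialize the decimal as a string (what the review suggests).
as_str_json = json.dumps(str(value))
round_trip_str = decimal.Decimal(json.loads(as_str_json))

# A 64-bit float cannot hold 29 significant digits, so only the string
# route reproduces the original value.
float_is_exact = round_trip_float == value
str_is_exact = round_trip_str == value
```

The float route raises no error at any point, which is what makes the loss silent.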

@codecov

codecov Bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 88.57143% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.79%. Comparing base (32a22f7) to head (d462c51).
⚠️ Report is 17 commits behind head on master.

float() silently loses precision for high-precision Avro decimal values.
str() preserves exact values, and the filter's _coerce_types already
handles str-to-number conversion for > / < comparisons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
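
The commit message above relies on the filter coercing strings to numbers before `>` / `<` comparisons. The actual `_coerce_types` implementation is not shown in this conversation, so the following is only a sketch of the general idea; the function name `coerce_for_comparison` and its behavior are hypothetical.

```python
from decimal import Decimal, InvalidOperation


def _to_decimal(value):
    """Return value as a finite Decimal, or None if it is not numeric."""
    if isinstance(value, bool):
        # bool is an int subclass; treat it as non-numeric here.
        return None
    if isinstance(value, (int, Decimal)):
        result = Decimal(value)
    elif isinstance(value, float):
        # Go through str() to avoid exposing binary float noise.
        result = Decimal(str(value))
    elif isinstance(value, str):
        try:
            result = Decimal(value)
        except InvalidOperation:
            return None
    else:
        return None
    return result if result.is_finite() else None


def coerce_for_comparison(left, right):
    """Hypothetical helper: make both sides Decimal when both look numeric.

    If either side is not numeric, the inputs are returned unchanged.
    """
    left_num, right_num = _to_decimal(left), _to_decimal(right)
    if left_num is None or right_num is None:
        return left, right
    return left_num, right_num


# A decimal that was serialized as the string "100.50" still compares
# correctly against a numeric filter threshold.
field_value, threshold = coerce_for_comparison("100.50", 100)
is_greater = field_value > threshold
```

Under this assumption, emitting decimals as strings costs nothing for range filters, because the comparison happens on exact `Decimal` values rather than on the string form.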
@piochelepiotr piochelepiotr added this pull request to the merge queue Apr 9, 2026
Merged via the queue into master with commit 613bbb8 Apr 9, 2026
31 checks passed
@piochelepiotr piochelepiotr deleted the piotr.wolski/fix-avro-logical-type-serialization branch April 9, 2026 14:21
@dd-octo-sts dd-octo-sts Bot added this to the 7.79.0 milestone Apr 9, 2026