Skip to content

fix: preserve write_default when applying a name mapping#3601

Open
anxkhn wants to merge 1 commit into
apache:mainfrom
anxkhn:patch-17
Open

fix: preserve write_default when applying a name mapping#3601
anxkhn wants to merge 1 commit into
apache:mainfrom
anxkhn:patch-17

Conversation

@anxkhn

@anxkhn anxkhn commented Jul 5, 2026

Copy link
Copy Markdown

Rationale for this change

NameMappingProjectionVisitor.field (the visitor behind apply_name_mapping)
rebuilds each NestedField, but it passed the field's write default under the
keyword initial_write=:

return NestedField(
    field_id=field_partner.field_id,
    name=field.name,
    field_type=field_result,
    required=field.required,
    doc=field.doc,
    initial_default=field.initial_default,
    initial_write=field.write_default,   # <- no such parameter
)

NestedField.__init__ has no initial_write parameter; its write-default
keyword is write_default. Because NestedField is a Pydantic model, the
unknown initial_write key is absorbed by the **data catch-all and silently
ignored, so apply_name_mapping dropped every field's write_default to
None (while initial_default was preserved). Every other place that rebuilds
a NestedField (for example in pyiceberg/table/update/schema.py) already uses
write_default=; this was the one call site out of step with the constructor.

The fix passes write_default=field.write_default to match the constructor
contract.

Are these changes tested?

Yes. Added test_mapping_preserves_field_defaults in
tests/table/test_name_mapping.py, which builds a schema whose field carries
both initial_default and write_default, runs it through
apply_name_mapping, and asserts both values survive on the mapped field. It
fails before the change (write_default comes back None) and passes after.

  • tests/table/test_name_mapping.py: 13 passed
  • tests/io/test_pyarrow_visitor.py and tests/test_schema.py: 388 passed
    (the PyArrow name-mapping visitor and schema, the current caller surface)

Integration tests (Docker + Spark) were not run in this environment; the change
is pure Python type-plumbing fully exercised by the unit tests above.

Are there any user-facing changes?

No observable change in the current code paths. The only caller today is
pyarrow_to_schema -> apply_name_mapping (pyiceberg/io/pyarrow.py), which
feeds a PyArrow-derived schema whose fields never carry an Iceberg
write_default, so nothing is dropped in practice right now. This is a latent
correctness fix: it prevents write_default from being silently discarded the
moment a name mapping is applied to any schema whose fields set a write default.

NameMappingProjectionVisitor.field rebuilt each NestedField with an
initial_write= keyword, but NestedField.__init__ has no such parameter
(the write-default keyword is write_default). Being a Pydantic model,
the unknown key was swallowed by **data and silently ignored, so
apply_name_mapping dropped every field's write_default to None.

Pass write_default= to match the NestedField constructor and add a
regression test asserting both initial_default and write_default
survive a name mapping.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants