fix: preserve write_default when applying a name mapping#3601
Open
anxkhn wants to merge 1 commit into
Open
Conversation
NameMappingProjectionVisitor.field rebuilt each NestedField with an initial_write= keyword, but NestedField.__init__ has no such parameter (the write-default keyword is write_default). Being a Pydantic model, the unknown key was swallowed by **data and silently ignored, so apply_name_mapping dropped every field's write_default to None. Pass write_default= to match the NestedField constructor and add a regression test asserting both initial_default and write_default survive a name mapping.
ebyhr
approved these changes
Jul 5, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Rationale for this change
NameMappingProjectionVisitor.field(the visitor behindapply_name_mapping)rebuilds each
NestedField, but it passed the field's write default under thekeyword
initial_write=:NestedField.__init__has noinitial_writeparameter; its write-defaultkeyword is
write_default. BecauseNestedFieldis a Pydantic model, theunknown
initial_writekey is absorbed by the**datacatch-all and silentlyignored, so
apply_name_mappingdropped every field'swrite_defaulttoNone(whileinitial_defaultwas preserved). Every other place that rebuildsa
NestedField(for example inpyiceberg/table/update/schema.py) already useswrite_default=; this was the one call site out of step with the constructor.The fix passes
write_default=field.write_defaultto match the constructorcontract.
Are these changes tested?
Yes. Added
test_mapping_preserves_field_defaultsintests/table/test_name_mapping.py, which builds a schema whose field carriesboth
initial_defaultandwrite_default, runs it throughapply_name_mapping, and asserts both values survive on the mapped field. Itfails before the change (
write_defaultcomes backNone) and passes after.tests/table/test_name_mapping.py: 13 passedtests/io/test_pyarrow_visitor.pyandtests/test_schema.py: 388 passed(the PyArrow name-mapping visitor and schema, the current caller surface)
Integration tests (Docker + Spark) were not run in this environment; the change
is pure Python type-plumbing fully exercised by the unit tests above.
Are there any user-facing changes?
No observable change in the current code paths. The only caller today is
pyarrow_to_schema->apply_name_mapping(pyiceberg/io/pyarrow.py), whichfeeds a PyArrow-derived schema whose fields never carry an Iceberg
write_default, so nothing is dropped in practice right now. This is a latentcorrectness fix: it prevents
write_defaultfrom being silently discarded themoment a name mapping is applied to any schema whose fields set a write default.