[SPARK-52601][SQL] Support primitive types in TransformingEncoder #51313
Closed
eejbyfeldt wants to merge 1 commit into apache:master from eejbyfeldt:SPARK-52601
Conversation
Contributor
Author
@hvanhovell You reviewed #50023 where
Contributor
Merged to master!
Contributor
@eejbyfeldt thanks for doing this! Can you create a backport for Spark 4.0?
eejbyfeldt
added a commit
to eejbyfeldt/spark
that referenced
this pull request
Sep 16, 2025
Support defining TransformingEncoder that has a primitive type as the input type. This came up for me when using a Scala 3 opaque type around a Long as a timestamp while wanting the encoder to encode it as a timestamp. Ideally Spark would have some way of encoding a microsecond timestamp without going through a java.sql.Timestamp or java.time.Instant, but this at least makes it possible to achieve something similar (though less efficiently) by defining a TransformingEncoder that takes a Long and returns a java.sql.Timestamp. Yes, it allows TransformingEncoder to be used in more cases. New and existing unit tests. No. Closes apache#51313 from eejbyfeldt/SPARK-52601. Authored-by: Emil Ejbyfeldt <emil.ejbyfeldt@choreograph.com> Signed-off-by: Herman van Hovell <herman@databricks.com>
Contributor
Author
Created the backport here: #52354
dongjoon-hyun
pushed a commit
that referenced
this pull request
Sep 16, 2025
Backport of #51313 to 4.0 branch. ### What changes were proposed in this pull request? Support defining TransformingEncoder that has a primitive type as the input type. ### Why are the changes needed? This came up for me when using a Scala 3 opaque type around a Long as a timestamp while wanting the encoder to encode it as a timestamp. Ideally Spark would have some way of encoding a microsecond timestamp without going through a java.sql.Timestamp or java.time.Instant, but this at least makes it possible to achieve something similar (though less efficiently) by defining a TransformingEncoder that takes a Long and returns a java.sql.Timestamp. ### Does this PR introduce _any_ user-facing change? Yes, it allows TransformingEncoder to be used in more cases. ### How was this patch tested? New and existing unit tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #52354 from eejbyfeldt/SPARK-52601-4.0. Authored-by: Emil Ejbyfeldt <emil.ejbyfeldt@choreograph.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
zifeif2
pushed a commit
to zifeif2/spark
that referenced
this pull request
Nov 14, 2025
huangxiaopingRD
pushed a commit
to huangxiaopingRD/spark
that referenced
this pull request
Nov 25, 2025
What changes were proposed in this pull request?
Support defining TransformingEncoder that has a primitive type as the input type.
Why are the changes needed?
To support defining a TransformingEncoder that has a primitive type as the input type. This came up for me when using a Scala 3 opaque type around a Long as a timestamp while wanting the encoder to encode it as a timestamp. Ideally Spark would have some way of encoding a microsecond timestamp without going through a java.sql.Timestamp or java.time.Instant, but this at least makes it possible to achieve something similar (though less efficiently) by defining a TransformingEncoder that takes a Long and returns a java.sql.Timestamp.
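To illustrate the use case described above, here is a rough sketch of a Long-to-Timestamp codec against Spark's internal `org.apache.spark.sql.catalyst.encoders` API. The class name `LongToTimestampCodec` and the exact `TransformingEncoder`/`TimestampEncoder` constructor shapes are assumptions for illustration and may differ between Spark versions; this is not the PR's own test code.

```scala
import java.sql.Timestamp

import scala.reflect.classTag

import org.apache.spark.sql.catalyst.encoders.Codec
import org.apache.spark.sql.catalyst.encoders.AgnosticEncoders.{TimestampEncoder, TransformingEncoder}

// Hypothetical codec bridging a primitive Long (microseconds since the epoch)
// to java.sql.Timestamp, a type Spark already knows how to encode.
// (Assumes non-negative timestamps; negative values would need floor division.)
class LongToTimestampCodec extends Codec[Long, Timestamp] {
  override def encode(us: Long): Timestamp = {
    val ts = new Timestamp(us / 1000000 * 1000)      // whole seconds, in millis
    ts.setNanos(((us % 1000000) * 1000).toInt)       // sub-second part, in nanos
    ts
  }
  override def decode(ts: Timestamp): Long =
    ts.getTime / 1000 * 1000000 + ts.getNanos / 1000 // back to microseconds
}

// Before this change, a primitive (Long) on the input side of a
// TransformingEncoder was not supported; with it, an encoder like this works:
val microsEncoder: TransformingEncoder[Long, Timestamp] =
  TransformingEncoder(
    classTag[Long],
    TimestampEncoder(lenientSerialization = false),
    () => new LongToTimestampCodec)
```

The same codec shape would sit behind a Scala 3 opaque type wrapping the Long, which is the scenario that motivated the change.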
Does this PR introduce any user-facing change?
Yes, it allows TransformingEncoder to be used in more cases.
How was this patch tested?
New and existing unit tests.
Was this patch authored or co-authored using generative AI tooling?
No