Expected Behavior
When the Spark engine is enabled, it should materialize features to the online store.
Current Behavior
Materialization fails with the following error:
pyspark.errors.exceptions.base.PySparkTypeError: [CANNOT_INFER_SCHEMA_FOR_TYPE] Can not infer schema for type: `ChunkedArray`.
Steps to reproduce
See this repo, which should reproduce the error locally. Ensure that JAVA_HOME points to a Java 17 installation, then execute make run to start the workflow. Optionally, inspect the Makefile and run the commands yourself.
The repo contains the default project created by feast init. The workflow runs feast plan, then feast apply, and finally a materialize.py script, which should trigger a Spark materialization job.
Specifications
- Version: 0.53.0
- Platform: Python 3.12, PySpark 3.5.5, Java 17
- Subsystem: macOS 15.6.1
Possible Solution
I believe this happens when an Arrow table is converted to a Spark DataFrame. I am unsure why it happens: it may be my configuration, or it may be a bug. If someone could advise on a solution, that would be great.
Thanks!