Add missing SessionContext utility methods #1475
Expose upstream DataFusion v53 utility methods: `session_start_time`, `enable_ident_normalization`, `parse_sql_expr`, `execute_logical_plan`, `refresh_catalogs`, `remove_optimizer_rule`, and `table_provider`.

The `add_optimizer_rule` and `add_analyzer_rule` methods are omitted because the `OptimizerRule` and `AnalyzerRule` traits are not yet exposed to Python.

Closes apache#1459.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
This PR exposes additional SessionContext utility/introspection methods in the datafusion-python API to match capabilities available in upstream DataFusion v53 (Issue #1459), and adds unit tests to cover the new Python surface area.
Changes:
- Added Python `SessionContext` wrappers for: `session_start_time`, `enable_ident_normalization`, `parse_sql_expr`, `execute_logical_plan`, `refresh_catalogs`, `remove_optimizer_rule`, and `table_provider`.
- Added corresponding Rust binding methods on `PySessionContext` to call into DataFusion v53 APIs.
- Added unit tests validating the new Python methods.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/datafusion/context.py | Adds new SessionContext methods to the public Python API and wraps internal bindings (Expr, DataFrame, Table). |
| crates/core/src/context.rs | Exposes the underlying DataFusion SessionContext methods via the PyO3 PySessionContext bindings. |
| python/tests/test_context.py | Adds tests for the newly exposed SessionContext methods. |
    from datafusion.expr import Expr  # noqa: PLC0415

    return Expr(self.ctx.parse_sql_expr(sql, schema))
I think we could remove the import and the `Expr` wrapper here.
If we do that, we return the unwrapped inner `PyExpr`, which wouldn't be usable later on.
    from datafusion.catalog import Table  # noqa: PLC0415

    return Table(self.ctx.table_provider(name))
Here too, I think we can remove the `Table` wrapper and the import.
Same, I think we need the wrapper.
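The wrapper argument in the two threads above can be illustrated without DataFusion at all. The sketch below uses hypothetical stand-ins (`RawExpr`, `Expr`, `describe`, `parse_wrapped`, `parse_unwrapped` are invented for illustration and are not the library's classes); it assumes only that public APIs reach the binding-level object through an attribute on the wrapper, which is why returning the raw object would break later calls.

```python
class RawExpr:
    """Hypothetical stand-in for the binding-level object (PyExpr in the PR)."""

    def __init__(self, text: str) -> None:
        self.text = text


class Expr:
    """Hypothetical stand-in for the public Python wrapper class."""

    def __init__(self, raw: RawExpr) -> None:
        # Downstream code reaches the raw object through this attribute.
        self.expr = raw

    def alias(self, name: str) -> "Expr":
        # Convenience method that exists only on the wrapper.
        return Expr(RawExpr(f"{self.expr.text} AS {name}"))


def describe(e: Expr) -> str:
    """Stand-in for any public API that unwraps its argument via `.expr`."""
    return e.expr.text


def parse_wrapped(sql: str) -> Expr:
    # Mirrors the PR's approach: wrap before returning.
    return Expr(RawExpr(sql))


def parse_unwrapped(sql: str) -> RawExpr:
    # Mirrors the reviewer's suggestion: return the raw object directly.
    return RawExpr(sql)


print(describe(parse_wrapped("a + 1").alias("b")))  # a + 1 AS b
```

Passing the result of `parse_unwrapped` to `describe` raises `AttributeError` in this sketch, because the raw object has no `.expr` attribute.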
- Improve docstring examples to show actual output instead of asserts
- Use doctest +SKIP for non-deterministic session_start_time output
- Fix table_provider error mapping: outer async error is now RuntimeError
- Strengthen tests: validate RFC 3339 with fromisoformat, test both optimizer rule removal paths, exact string match for parse_sql_expr, verify enable_ident_normalization with dynamic state change

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
datetime.fromisoformat() only supports up to 6 fractional-second digits (microseconds) on Python 3.10. Truncate nanosecond precision before parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
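The truncation described in this commit might look like the following minimal sketch. The helper name `parse_rfc3339` is hypothetical and the PR's actual test code is not shown here; the sketch assumes the timestamp is an RFC 3339 string with an optional fractional part and either a numeric offset or a `Z` suffix.

```python
import re
from datetime import datetime, timezone

# Matches the fractional-seconds part, e.g. ".123456789".
_FRACTION = re.compile(r"\.(\d+)")


def parse_rfc3339(ts: str) -> datetime:
    """Parse an RFC 3339 timestamp on interpreters limited to microseconds."""

    def to_micros(match: re.Match) -> str:
        # Truncate to 6 digits, or right-pad shorter fractions with zeros,
        # since older fromisoformat() versions accept only 3 or 6 digits.
        return "." + match.group(1)[:6].ljust(6, "0")

    ts = _FRACTION.sub(to_micros, ts, count=1)
    # Older fromisoformat() versions also reject the "Z" suffix.
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    return datetime.fromisoformat(ts)


stamp = parse_rfc3339("2024-01-02T03:04:05.123456789+00:00")
print(stamp == datetime(2024, 1, 2, 3, 4, 5, 123456, tzinfo=timezone.utc))  # True
```

Truncating rather than rounding keeps the parsed value at or before the original instant, which is usually sufficient for a test that only checks that the string is well formed.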
Which issue does this PR close?
Closes #1459
Rationale for this change
These methods exist in the upstream repository but have not been exposed to Python.
What changes are included in this PR?
- Add methods to the Python API
- Add unit tests
Are there any user-facing changes?
New addition only.