feat: pass calling SessionContext to Python UDTF callbacks#1555
Open
timsaucer wants to merge 4 commits into
DataFusion 53 added `TableFunctionImpl::call_with_args(TableFunctionArgs)`, where `TableFunctionArgs` carries both the positional expression arguments and the calling `&dyn Session`. The pure-Python UDTF path previously discarded everything but the exprs.

Thread the session through when the user callback's signature opts in by declaring a `session` keyword parameter (or `**kwargs`). At call time we downcast the `&dyn Session` to its canonical `SessionState` impl and build a fresh `SessionContext` over the same Arc-shared state, exposed to Python as a `datafusion.SessionContext` wrapper.

Existing callbacks whose signatures do not declare `session` continue to be called with the positional expression arguments only; no behavior change for current users.

Note: a UDTF body cannot drive a fresh `ctx.sql(...).collect()` on the passed-in session because the outer SQL execution already holds the tokio runtime. Use the session for metadata access (catalogs, UDF lookups, config) rather than nested DataFrame collection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
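The signature-based opt-in described in this commit message can be sketched in plain Python. This is only an illustration under assumptions: `accepts_session` and the four example callbacks are hypothetical names, not code from this PR, and the real detection logic may differ.

```python
import inspect


def accepts_session(func) -> bool:
    """Hypothetical helper: True if `func` can receive `session=` by keyword."""
    for param in inspect.signature(func).parameters.values():
        # A **kwargs parameter swallows any keyword, including `session`.
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True
        # A parameter named `session` counts only if it can be passed by keyword.
        if param.name == "session" and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        ):
            return True
    return False


def legacy_udtf(*exprs):
    return exprs


def session_udtf(*exprs, session):
    return exprs, session


def kwargs_udtf(*exprs, **kwargs):
    return exprs, kwargs


def positional_only_udtf(session, /, *exprs):
    # `session` here is positional-only, so `session=` cannot reach it.
    return session, exprs
```

Under this sketch, `kwargs_udtf` opts in even if its author never intended to receive a session, and `positional_only_udtf` does not opt in despite naming a `session` parameter.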
The doc comment implied a foreign FFI session was a real input. No current path reaches a pure-Python UDTF with a non-SessionState session: the SQL planner and __call__ both hand a SessionState, and a ForeignSession would only arrive via FFI-export of the UDTF, which datafusion-python does not do. Reword to state the guard is defensive and rewrap the error string.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ntjohnson1
approved these changes
May 28, 2026
LGTM. The `accepts_session` bit feels a little like a weird workaround, but the motivation makes sense and the implementation is reasonable.
 impl PyTableFunction {
     #[new]
-    #[pyo3(signature=(name, func, session))]
+    #[pyo3(signature=(name, func, session, accepts_session=false))]
Since this is an internal class I wonder if we need this extra argument. It feels like potentially the presence of the session could already encode this, but maybe that's more confusing than just adding the extra argument.
Replaces signature sniffing with an explicit ``with_session=True`` kwarg on ``TableFunction`` / ``udtf``. Avoids name-based detection footguns (positional-only ``session`` params, accidental ``**kwargs`` opt-in, shadowing by unrelated params) and makes author intent visible at registration. Also documents the feature in the UDTF user guide.

Rust field renamed ``accepts_session`` -> ``inject_session_on_call`` to match the Python-side opt-in semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
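A minimal sketch of the explicit opt-in, assuming the session is forwarded as a `session` keyword argument. `call_udtf` and both callbacks are hypothetical stand-ins written for illustration; they are not the `TableFunction` / `udtf` implementation from this PR.

```python
def call_udtf(func, exprs, session, with_session=False):
    """Hypothetical dispatcher: forward `session` only when explicitly requested."""
    if with_session:
        return func(*exprs, session=session)
    return func(*exprs)


def plain(*exprs, **kwargs):
    # Accepts **kwargs for unrelated reasons; it should not receive a session.
    return exprs, kwargs


def wants_session(*exprs, session):
    return exprs, session


# Stand-in object; a real caller would pass a datafusion.SessionContext.
fake_session = object()

# No opt-in: the callback sees only the positional expression arguments.
plain_result = call_udtf(plain, (1, 2), fake_session)

# Explicit opt-in: the session arrives as a keyword argument.
session_result = call_udtf(wants_session, (1,), fake_session, with_session=True)
```

Because the flag is supplied at registration rather than inferred from the signature, `plain` is not opted in by its `**kwargs` parameter.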
Raise TypeError when with_session=True is combined with an FFI-exported table function (one exposing __datafusion_table_function__). The Rust FFI branch does not consult the flag, so it would silently be dropped; guard both TableFunction.__init__ and the udtf() convenience entry.

Qualify the doc claim that mutations through the injected session propagate to the caller: registry mutations do (shared Arc registries), but config changes do not (SessionConfig is cloned). Mirror the caveat in TableFunction.__init__ per the user-guide caveats convention.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
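The guard described above might look roughly like the following. `check_with_session`, `FfiExported`, and `pure_python` are hypothetical names, and the error text is made up for illustration; only the `__datafusion_table_function__` attribute name and the choice of `TypeError` come from the commit message.

```python
def check_with_session(func, with_session: bool) -> None:
    """Hypothetical guard: reject with_session=True for FFI-exported functions."""
    if with_session and hasattr(func, "__datafusion_table_function__"):
        # Illustrative message; the real wording is not shown in this PR page.
        raise TypeError(
            "with_session=True is not supported for FFI-exported table functions"
        )


class FfiExported:
    """Stand-in for an object exposing the FFI export hook."""

    def __datafusion_table_function__(self):
        raise NotImplementedError("placeholder for illustration only")


def pure_python(*exprs, session=None):
    return exprs


# Allowed: a pure-Python callback may opt in.
check_with_session(pure_python, True)

# Allowed: an FFI-exported function that does not opt in.
check_with_session(FfiExported(), False)

# Rejected: the flag would otherwise be dropped silently on the FFI branch.
try:
    check_with_session(FfiExported(), True)
    rejected = False
except TypeError:
    rejected = True
```

Failing at registration time keeps the flag from being accepted and then ignored.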
Which issue does this PR close?
Related to #1533
Rationale for this change
Upstream DataFusion deprecated `call()` on table functions so that we could support far more features. Instead we should use `call_with_args`. That is already being done in `main` using default table function arguments. This PR exposes the full functionality of the function signature.
What changes are included in this PR?
Are there any user-facing changes?
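Yes, this changes the arguments that pure Python table functions may receive: with `with_session=True`, the callback is also passed the calling session. Callbacks that do not opt in are called as before.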