python: add support for adhoc query as pyarrow table #5814

Open
monochromatti wants to merge 3 commits into feldera:main from monochromatti:arrow-ipc-sdk

@monochromatti monochromatti commented Mar 13, 2026

Ran tests locally against a running Feldera API.

From python/:

  • Full Python SDK suite (excluding tests/runtime_aggtest):
    • uv run python -m pytest tests/ --ignore=tests/runtime_aggtest -ra
    • Local result: 122 passed, 45 skipped
  • Targeted reruns:
    • uv run python -m pytest tests/platform/test_shared_pipeline.py::TestPipeline::test_adhoc_query_arrow -q
    • uv run python -m pytest tests/unit/test_query_as_arrow.py -q

Checklist

  • Unit tests added/updated
  • Integration tests added/updated
  • Documentation updated
  • Changelog updated

Breaking Changes?

Mark if you think the answer is yes for any of these components:

Describe Incompatible Changes

None.


Summary

This PR adds Arrow IPC query support to the Python SDK so ad-hoc query results can be consumed as streamed PyArrow record batches.

What changed

  • Added a new client API:
    • FelderaClient.query_as_arrow(pipeline_name, query) -> Generator[pyarrow.RecordBatch, None, None]
  • Added a pipeline convenience method:
    • Pipeline.query_arrow(query) -> Generator[pyarrow.RecordBatch, None, None]
  • Added optional Arrow dependency extra:
    • pip install "feldera[arrow]"
  • Updated Python README with Arrow installation guidance
  • Added unit and platform tests for Arrow IPC query behavior

Notes

  • The Arrow response is consumed from an HTTP stream (stream=True) and yielded batch-by-batch.
  • Users can materialize a pyarrow.Table when desired via pyarrow.Table.from_batches(...).
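A minimal sketch of the two consumption styles described in the notes above, assuming an existing `Pipeline` object named `pipeline` and that `query_arrow` yields `pyarrow.RecordBatch` objects as listed under "What changed". The helper names `count_rows` and `to_table` and the table name `my_table` are hypothetical and not part of the SDK.

```python
from typing import Any, Iterable


def count_rows(batches: Iterable[Any]) -> int:
    """Sum row counts one batch at a time, without collecting the batches."""
    # Each pyarrow.RecordBatch exposes its row count as `num_rows`.
    return sum(batch.num_rows for batch in batches)


def to_table(batches: Iterable[Any]):
    """Materialize all batches into a single pyarrow.Table."""
    # Imported lazily; assumes the optional "feldera[arrow]" extra is installed.
    import pyarrow as pa

    # With no batches and no explicit schema, pyarrow cannot infer a schema,
    # so callers may want to handle an empty result separately.
    return pa.Table.from_batches(list(batches))


# Hypothetical usage against a running pipeline (not executed here):
#   total = count_rows(pipeline.query_arrow("SELECT * FROM my_table"))
#   table = to_table(pipeline.query_arrow("SELECT * FROM my_table"))
```

Streaming suits large results because only the current batch needs to be in memory; materializing a table is convenient when the result is known to be small.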

@monochromatti monochromatti force-pushed the arrow-ipc-sdk branch 2 times, most recently from 4065f37 to edcaa7e March 13, 2026 06:54

@mythical-fred mythical-fred left a comment

LGTM — but see inline: there is an existing open PR covering the same feature.

@gz
Contributor

gz commented Mar 13, 2026

Hi @monochromatti, this looks good, thanks a lot for your contribution. @abhizer, can you review this?

@monochromatti
Author

I'd like input on whether to return Generator[pyarrow.RecordBatch, ...] or a pyarrow.Table directly. The latter is the current state of the PR, but after some thinking it feels like generating batches is more in style with similar existing functionality and better suited for big payloads.

Contributor

@abhizer abhizer left a comment

Thank you!

As a heads-up, the reason we didn't merge the prior PR is that the server intermittently sent bad data and we were unable to figure out why.

@abhizer
Contributor

abhizer commented Mar 13, 2026

I'd like input on whether to return Generator[pyarrow.RecordBatch, ...] or a pyarrow.Table directly

We normally return a generator, and it might be a good idea to keep this behavior consistent.

@mihaibudiu
Contributor

@monochromatti please re-request a review from @abhizer when this is ready again

@monochromatti monochromatti force-pushed the arrow-ipc-sdk branch 2 times, most recently from 379bfe8 to 5f06e6a March 24, 2026 12:16
@monochromatti monochromatti requested a review from abhizer March 24, 2026 12:18
Contributor

@abhizer abhizer left a comment

Thank you!

@abhizer abhizer changed the title from "arrow ipc sdk" to "python: add support for adhoc query as pyarrow table" Apr 2, 2026
@monochromatti
Author

Rebased on main to solve a uv.lock conflict


@mythical-fred mythical-fred left a comment

LGTM

@monochromatti
Author

Sorry I might be missing something, but the PR still requires an approval to run CI?

@abhizer
Contributor

abhizer commented Apr 4, 2026

Done!

@mythical-fred

The "Pre Merge Queue Tasks" CI failure looks transient — the failing step is the Rust build check, but this PR has no Rust changes. The same step failed and then passed for other PRs around the same time. Could someone re-trigger CI?

@mythical-fred

CI is still showing a failure on "Pre Merge Queue Tasks" from Apr 4 — looks like nobody re-triggered it yet. Could someone queue a fresh run? This is a Python-only PR and that step has been transiently failing for unrelated Rust check reasons.

@abhizer
Contributor

abhizer commented Apr 6, 2026

You might have to run "ruff format" for it to pass the pre merge queue.


@mythical-fred mythical-fred left a comment

LGTM

Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>
Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>
Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>
@monochromatti
Author

monochromatti commented Apr 8, 2026

Updated PR body and solved uv.lock conflict (exclude-newer timestamp). @abhizer

@monochromatti monochromatti requested a review from abhizer April 8, 2026 05:45
@abhizer
Contributor

abhizer commented Apr 8, 2026

Thank you!

5 participants