Add Arrow IPC format for ad-hoc queries. #3791

Merged
gz merged 1 commit into main from arrow-ipc-query-format
Mar 28, 2025

@gz gz commented Mar 27, 2025

JSON cannot represent some of the data types we support, such as MAP with non-string keys, so serialization to JSON fails (#3546).

This adds a new Arrow IPC query format to the backend and fda. With this format, queries also work for MAP types with non-string keys.

Also adds support for the Parquet format to fda, which was previously missing.
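
To illustrate the limitation described above, here is a minimal sketch of why a JSON encoder has to reject maps with non-string keys: JSON object keys must be strings. The `Value` type and `to_json` function are hypothetical and exist only for this example; they are not the backend's actual serializer.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified value model (not the real SQL type system).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Value {
    Int(i64),
    Str(String),
    Map(BTreeMap<Value, Value>),
}

/// Serialize a value to JSON text.
/// Returns an error for maps whose keys are not strings,
/// because a JSON object can only have string keys.
fn to_json(v: &Value) -> Result<String, String> {
    match v {
        Value::Int(i) => Ok(i.to_string()),
        // Debug formatting adds quotes; adequate for plain ASCII in this sketch.
        Value::Str(s) => Ok(format!("{:?}", s)),
        Value::Map(m) => {
            let mut parts = Vec::new();
            for (k, val) in m {
                let key = match k {
                    Value::Str(s) => format!("{:?}", s),
                    other => {
                        return Err(format!("map key must be a string, got {:?}", other))
                    }
                };
                parts.push(format!("{}:{}", key, to_json(val)?));
            }
            Ok(format!("{{{}}}", parts.join(",")))
        }
    }
}
```

A map keyed by `Value::Int` makes `to_json` return `Err`, while a map keyed by `Value::Str` serializes normally. A columnar format such as Arrow IPC carries the key type in its schema, so it does not have this restriction.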

gz commented Mar 27, 2025

@Karakatiza666 we should switch the web-console to use the arrow-ipc format instead of JSON to really fix #3564. I'll create a separate issue.

Copilot AI left a comment

Pull Request Overview

This PR introduces support for the Arrow IPC format for ad-hoc queries, along with adding Parquet output support for fda. Key changes include:

  • Adding a new ArrowIpc variant to the AdHocResultFormat enum and its Display implementation.
  • Extending fda to handle Arrow IPC and Parquet outputs in multiple commands.
  • Updating the CLI to include new output format options and adding the Arrow dependency.
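
The first bullet above can be sketched roughly as follows. Only the `ArrowIpc` variant and its `arrow_ipc` string are taken from this review; the other variants and their strings are assumptions made for illustration and may not match `crates/feldera-types/src/query.rs`.

```rust
use std::fmt;

/// Sketch of the result-format enum with the new variant.
/// `Text`, `Json`, and `Parquet` are assumed here, not confirmed by this PR page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AdHocResultFormat {
    Text,
    Json,
    Parquet,
    ArrowIpc,
}

impl fmt::Display for AdHocResultFormat {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Map each variant to the string used on the wire.
        let name = match self {
            AdHocResultFormat::Text => "text",       // assumed
            AdHocResultFormat::Json => "json",       // assumed
            AdHocResultFormat::Parquet => "parquet", // assumed
            AdHocResultFormat::ArrowIpc => "arrow_ipc",
        };
        f.write_str(name)
    }
}
```

With this in place, `AdHocResultFormat::ArrowIpc.to_string()` yields `arrow_ipc`, which is the representation the suppressed review comment refers to.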

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| crates/feldera-types/src/query.rs | Added new ArrowIpc variant with corresponding Display mapping. |
| crates/fda/src/shell.rs | Extended output format matching to include Arrow IPC and Parquet. |
| crates/fda/src/main.rs | Updated query/pipeline actions to support Arrow IPC and Parquet. |
| crates/fda/src/cli.rs | Added new CLI output format options with a Display impl. |
| crates/fda/Cargo.toml | Included the new Arrow dependency. |
| crates/adapters/src/adhoc/mod.rs | Integrated StreamWriter to stream Arrow IPC responses. |
Files not reviewed (1)
  • openapi.json: Language not supported
Comments suppressed due to low confidence (1)

crates/fda/src/shell.rs:153

  • [nitpick] There's an inconsistency in the string representation for the ArrowIpc format across the codebase. In query.rs the variant maps to 'arrow_ipc', so consider aligning the representation to avoid confusion.
OutputFormat::ArrowIpc => "arrow",
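
One possible way to address this nitpick, sketched under assumptions: let the CLI-side enum derive its string from the backend-side enum, so the name is defined in one place. `BackendFormat` is a hypothetical stand-in for the enum in query.rs, and the variant sets are reduced for brevity. Whether fda's `"arrow"` spelling is intentional is not stated in this review.

```rust
use std::fmt;

/// Hypothetical stand-in for the backend enum; only `arrow_ipc` is taken from the review.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BackendFormat {
    Json,
    Parquet,
    ArrowIpc,
}

impl BackendFormat {
    /// Single source of truth for the format names.
    fn as_str(self) -> &'static str {
        match self {
            BackendFormat::Json => "json",       // assumed
            BackendFormat::Parquet => "parquet", // assumed
            BackendFormat::ArrowIpc => "arrow_ipc",
        }
    }
}

/// CLI-side enum, reduced to three variants for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputFormat {
    Json,
    Parquet,
    ArrowIpc,
}

impl From<OutputFormat> for BackendFormat {
    fn from(f: OutputFormat) -> Self {
        match f {
            OutputFormat::Json => BackendFormat::Json,
            OutputFormat::Parquet => BackendFormat::Parquet,
            OutputFormat::ArrowIpc => BackendFormat::ArrowIpc,
        }
    }
}

impl fmt::Display for OutputFormat {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the backend name so the two enums cannot drift apart.
        f.write_str(BackendFormat::from(*self).as_str())
    }
}
```

The trade-off is that the CLI can no longer use a shorter alias in its own `Display` output; if a short alias is wanted, it would need to be handled at argument-parsing time instead.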

Comment thread crates/fda/src/cli.rs Outdated
@gz gz force-pushed the arrow-ipc-query-format branch from 8dcdf3c to 3117ee9 Compare March 27, 2025 18:25
Comment thread crates/fda/src/main.rs Outdated
Comment thread crates/fda/src/main.rs
Comment thread crates/adapters/src/adhoc/mod.rs
@mihaibudiu

Hey, how about the automatic review?

@gz gz force-pushed the arrow-ipc-query-format branch from 3117ee9 to 4bcd65b Compare March 27, 2025 18:45
JSON cannot represent some of the data types we support,
such as MAP with non-string keys, so serialization to JSON
fails (#3546).

This adds a new Arrow IPC query format to the backend and
fda. With this format, queries also work for MAP types
with non-string keys.

Also adds support for the Parquet format to fda, which was
previously missing.

Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>
@gz gz force-pushed the arrow-ipc-query-format branch from 4bcd65b to 105046f Compare March 27, 2025 18:49
@gz gz added this pull request to the merge queue Mar 27, 2025
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Mar 27, 2025
@gz gz added this pull request to the merge queue Mar 28, 2025
Merged via the queue into main with commit 382dd88 Mar 28, 2025
@gz gz deleted the arrow-ipc-query-format branch March 28, 2025 01:46
3 participants