Skip to content

Adhoc query fixes#6244

Open
gz wants to merge 5 commits into
mainfrom
adhoc-query-fixes
Open

Adhoc query fixes#6244
gz wants to merge 5 commits into
mainfrom
adhoc-query-fixes

Conversation

@gz
Copy link
Copy Markdown
Contributor

@gz gz commented May 15, 2026

Scenario fda v0.296.0 PR fda
SELECT 1/0 (text) exits 0, scripts blind exits 1, single ERROR line
SELECT * FROM not_materialized exits 0 exits 1
SELECT 1/0 (arrow_ipc) exits 1 already exits 1 (unchanged)
Parser error message sql parser error: ... at Line: 1, Column: 37 non-debug fmt error msg
INSERT;INSERT;SELECT server-side rejected before this branch allowed
SELECT 1; SELECT 2 server-side rejected with strange error rejected with clearer "only the final statement may return a result set" message
--format json invocation no warning stderr deprecation warning pointing at arrow_ipc

@gz gz requested a review from mihaibudiu May 15, 2026 05:58
Comment thread crates/adapters/src/adhoc.rs Outdated
Comment thread crates/adapters/src/adhoc.rs Outdated
Comment thread crates/adapters/src/adhoc.rs
Comment thread crates/adapters/src/adhoc.rs
Comment thread crates/adapters/src/adhoc.rs Outdated
:param query: The SQL query to be executed.
:return: A generator that yields each row of the result as a Python dictionary, deserialized from JSON.
"""
"""Internal JSON streaming used by both `query_as_json` and the
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you file an issue that we migrate all users to the new format?

Comment thread python/feldera/rest/feldera_client.py Outdated
self.pipeline.query(
"INSERT INTO t1 VALUES "
"(200,'2020-02-02','11111111-1111-1111-1111-111111111111');"
"INSERT INTO t1 VALUES "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess no DELETE yet?


Wide integers (``BIGINT``) and high-precision decimals are the
cases the deprecated JSON path mishandles; we assert the values
come back exactly. Types ``ROW``, ``VARIANT``, and ``INTERVAL``
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently MAP too

Comment thread python/tests/runtime/test_adhoc_queries.py
gz added 2 commits May 15, 2026 10:10
A query that fails during execution (e.g. division by zero, querying a
non-materialized table) cannot use HTTP status codes to signal failure
in streaming mode because the response headers are already sent.

WebSocket mode closes with `CloseCode::Error` plus a preceding text
frame carrying the error details, but the fda client only logged a
warning and left the exit code at 0, hiding the failure from scripts.

Fixes #4973
DataFusion's SQL parser wraps the underlying `sqlparser::ParserError`
in a `DataFusionError::Diagnostic(DataFusionError::SQL(...))`. The
default `Display` chain printed it through `Debug` and the user saw

  Error during query processing: SQL error: ParserError("Expected: end of
  statement, found: in at Line: 1, Column: 30").

The line/column was buried inside the quoted `Debug` form, and the
`SQL error: ParserError(...)` wrapper added noise.

Unwrap the `Diagnostic` and use the parser's `Display` impl, which
already appends `at Line: X, Column: Y`, producing

  Error during query processing: sql parser error: Expected: end of
  statement, found: in at Line: 1, Column: 30.

Fixes #2526
Copy link
Copy Markdown

@mythical-fred mythical-fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — thorough work closing six issues. One nit inline.

Comment thread crates/adapters/src/adhoc.rs
gz added 2 commits May 15, 2026 10:41
JSON is lossy: it coerces wide integers to f64, cannot represent SQL
`MAP` keys that are not strings, and silently drops result columns
that share a name. Arrow IPC does not have these limitations and is
already used internally by fda for tabular display.

Python:
- Add `pyarrow` as a runtime dependency.
- Add `FelderaClient.query_as_arrow_ipc` and the matching
  `Pipeline.query_arrow_ipc` helper that yield `pyarrow.RecordBatch`
  objects over an HTTP stream from the `/query` endpoint.
- Extract the JSON streaming loop into a private
  `_query_json_stream` so `Pipeline.query()` can keep its existing
  dict-based contract without emitting a DeprecationWarning to user
  code that has not migrated yet.
- Emit a `DeprecationWarning` from `FelderaClient.query_as_json` so
  external callers are nudged toward the Arrow IPC path.

fda:
- Promote the JSON deprecation notice to a stderr warning so it is
  visible without enabling `log`-level configuration.

Tests:
- Cover the duplicate-column case from
  #4218 with an Arrow IPC
  round-trip that requires both `x` columns to survive.
- Cover wide-integer fidelity (2**53 + 1) and the common primitive
  SQL types via Arrow IPC.
- Add an `fda --format arrow_ipc query` smoke run to test.bash.

Fixes #4219
Fixex #3923
Fixes #4218
…ueries

Previously `execute_sql_with_state` rejected any non-PREPARE
statement before the final statement, so submitting
`INSERT; INSERT; SELECT` or even two INSERTs in a row required
splitting them into separate requests.

Relax that restriction: an intermediate non-result-producing
statement (DML or a DataFusion `Statement` variant such as a
prepared EXECUTE) is now executed for its side effect, its row
count is discarded, and the final statement still owns the
streaming protocol's single result slot.

A result-producing statement (SELECT and other relational
operators) in a non-final position is still rejected.
Will need to add come up with a streaming protocol to support
this.

Fixes #2487
@gz gz force-pushed the adhoc-query-fixes branch from c2560de to 19c5e73 Compare May 15, 2026 17:42
@gz gz enabled auto-merge May 15, 2026 17:43
Signed-off-by: feldera-bot <feldera-bot@feldera.com>
@gz gz added this pull request to the merge queue May 15, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants