Comparing changes

Repository: databricks/databricks-sql-python
base: 9fe7356
compare: ee63b81

  • 17 commits
  • 37 files changed
  • 8 contributors

Commits on Mar 2, 2026

  1. Add query_tags parameter support for execute methods (#736)

    * Add statement level query tag support by introducing it as a parameter on execute* methods
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * Add query_tags support to executemany method
    
    - Added query_tags parameter to executemany() method
    - Query tags are applied to all queries in the batch
    - Updated example to demonstrate executemany usage with query_tags
    - All tests pass (122/122 client tests)
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * add example that doesn't have tag
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * fix presubmit errors
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * another lint
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * address review comments
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    ---------
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    jiabin-hu authored Mar 2, 2026 · 38097f2
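The statement-level tags described above are passed as a dict to the execute* methods. A minimal sketch of how such tags might be flattened for transmission; `format_query_tags` and the `key:value,key:value` wire format are illustrative assumptions, not the connector's confirmed internals:

```python
# Hypothetical sketch: serializing statement-level query tags. The
# comma-separated "key:value" format is an assumption for illustration.
def format_query_tags(tags):
    """Render a tag dict as a comma-separated key:value string."""
    return ",".join(f"{k}:{v}" for k, v in tags.items())

# Hypothetical usage against the connector's execute()/executemany():
#   cursor.execute("SELECT 1", query_tags={"team": "analytics"})
#   cursor.executemany(sql, rows, query_tags={"job": "nightly-load"})

print(format_query_tags({"team": "analytics", "env": "prod"}))
```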

Commits on Mar 9, 2026

  1. [QI-3367] Allow specifying query tags as a dict upon connection creation (#749)

    * Allow specifying query tags as a dict upon connection creation
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * fix comment
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    ---------
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    jiabin-hu authored Mar 9, 2026 · e916f71
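With tags now settable both at connection creation and per statement, one plausible precedence rule is that statement-level tags override connection-level defaults. The override direction and the helper name below are assumptions for illustration, not the connector's documented behavior:

```python
# Sketch of a plausible precedence rule: connection-level tags act as
# defaults, statement-level tags win on conflict. resolve_query_tags is a
# hypothetical helper, not part of the connector's API.
def resolve_query_tags(connection_tags, statement_tags):
    merged = dict(connection_tags or {})
    merged.update(statement_tags or {})
    return merged

print(resolve_query_tags({"env": "prod", "team": "data"}, {"team": "analytics"}))
```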

Commits on Mar 10, 2026

  1. Fix float inference to use DoubleParameter (64-bit) instead of FloatParameter (32-bit) (#742)
    
    * Fix float inference to use DoubleParameter (64-bit) instead of FloatParameter (32-bit)
    
    Signed-off-by: Shubhambhusate <bhusates6@gmail.com>
    
    * Add DoubleParameter with Primitive.DOUBLE to test_inference coverage
    
    ---------
    
    Signed-off-by: Shubhambhusate <bhusates6@gmail.com>
    Shubhambhusate authored Mar 10, 2026 · 12bfd5b
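The motivation for this fix can be shown with the standard library alone: a Python `float` is a 64-bit IEEE 754 double, so round-tripping it through a 32-bit float silently loses precision, which is why inference should map to DoubleParameter:

```python
import struct

# A Python float is a 64-bit double. Round-trip the same value through a
# 32-bit float ("f") and a 64-bit double ("d") to show the precision loss.
value = 1.1

as_float32 = struct.unpack("f", struct.pack("f", value))[0]
as_float64 = struct.unpack("d", struct.pack("d", value))[0]

print(as_float64 == value)   # True: no precision lost at 64 bits
print(as_float32 == value)   # False: 1.1 is not exactly representable in 32 bits
```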
  2. Updated the PyArrow concatenation of tables to use promote_options as default (#751)
    
    Updated pyarrow-concat
    jprakash-db authored Mar 10, 2026 · 36fb376

Commits on Mar 16, 2026

  1. Add statement-level query_tags support for SEA backend (#754)

    * Add statement-level query_tags support for SEA backend
    
    Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
    
    * Simplify None handling in query_tags serialization
    
    Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
    
    ---------
    
    Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
    sreekanth-db authored Mar 16, 2026 · ca4d7bc

Commits on Mar 27, 2026

  1. Harden CI/CD workflows: fix secret exposure, script injection, and pin all actions to SHA (#762)
    
    Updated the security
    jprakash-db authored Mar 27, 2026 · 330c445
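The two hardening patterns named in the title look roughly like the fragment below (the SHA, step names, and event field are illustrative, not taken from this repo's workflows): pin each action to a full commit SHA rather than a mutable tag, and pass untrusted event fields through `env` instead of interpolating them into the script body, which is the standard defense against script injection.

```yaml
# Illustrative fragment (hypothetical SHA and step names).
steps:
  - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683  # v4.2.2, pinned to SHA
  - name: Use PR title safely
    env:
      PR_TITLE: ${{ github.event.pull_request.title }}  # no inline ${{ }} inside run:
    run: echo "Title: $PR_TITLE"
```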

Commits on Mar 30, 2026

  1. 4793353 (commit title not shown)

Commits on Apr 13, 2026

  1. Replace third-party DCO action with custom script (#769)

    The tisonkun/actions-dco action has been unreliable. Replace it with an
    inline bash script (matching databricks-sql-go) that checks each commit
    for a Signed-off-by line, provides clear per-commit feedback, and scopes
    the trigger to opened/synchronize/reopened events on main.
    
    Co-authored-by: Isaac
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 13, 2026 · e056275
  2. Migrate CI to protected runners and JFrog PyPI proxy (#770)

    * Migrate CI to databricks-protected runners and route PyPI through JFrog
    
    Protected runners are required for Databricks OSS repos. Add a
    setup-jfrog composite action (OIDC-based, matching databricks-odbc) that
    sets PIP_INDEX_URL so all pip/poetry installs go through the JFrog PyPI
    proxy. Every workflow now runs on the databricks-protected-runner-group
    with id-token: write for the OIDC exchange.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Add Poetry JFrog source configuration to all workflows
    
    The previous commit only set PIP_INDEX_URL, but Poetry uses its own
    resolver and needs explicit source configuration. Add a
    "Configure Poetry for JFrog" step after poetry install in every job
    that sets up the JFrog repository and credentials, then adds it as
    the primary source for the project.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Fix step ordering: move JFrog setup after poetry install
    
    The snok/install-poetry action uses pip internally to install poetry.
    When PIP_INDEX_URL was set before this step, the installer tried to
    route through JFrog and failed with an SSL error. Move the JFrog OIDC
    token + PIP_INDEX_URL + poetry source configuration to run after
    Install Poetry but before poetry install.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Replace snok/install-poetry with pip install through JFrog
    
    The hardened runners block direct access to install.python-poetry.org,
    causing snok/install-poetry to fail with SSL errors. Replace it with
    `pip install poetry==2.2.1` which routes through the JFrog PyPI proxy.
    
    New step ordering: checkout → setup-python → Setup JFrog (OIDC +
    PIP_INDEX_URL) → pip install poetry → Configure Poetry for JFrog →
    poetry install.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Add poetry lock --no-update after source add to fix lock mismatch
    
    poetry source add modifies pyproject.toml, which makes poetry refuse
    to install from the existing lock file. Running poetry lock --no-update
    regenerates the lock file metadata without changing dependency versions.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Fix poetry lock flag and YAML indentation
    
    Poetry 2.x doesn't have --no-update flag, use poetry lock instead.
    Also fix indentation of poetry lock in the arrow test job.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Move JFrog setup before setup-python, matching sqlalchemy pattern
    
    Follow the proven pattern from databricks/databricks-sqlalchemy#59:
    checkout → Setup JFrog → setup-python → pip install poetry → poetry
    source add + poetry lock → poetry install.
    
    The hardened runners block pypi.org at the network level, so JFrog
    must be configured before actions/setup-python (which upgrades pip).
    Also simplified workflows by removing verbose section comments.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Extract setup-poetry composite action to remove duplication
    
    Create .github/actions/setup-poetry that bundles JFrog setup,
    setup-python, poetry install via pip, JFrog source config, cache,
    and dependency install into a single reusable action with inputs
    for python-version, install-args, cache-path, and cache-suffix.
    
    All workflows now call setup-poetry instead of repeating these steps,
    matching the pattern from databricks/databricks-sqlalchemy#59.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    ---------
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 13, 2026 · fbdcd32
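The final step ordering the commit series converged on can be sketched as a composite action like the following (step names, inputs, and the `poetry source add` invocation are illustrative reconstructions from the commit messages, not the actual action file):

```yaml
# Hypothetical sketch of the setup-poetry composite action's ordering:
# JFrog first, so every later pip call is already routed through the proxy.
runs:
  using: composite
  steps:
    - name: Setup JFrog (OIDC exchange, sets PIP_INDEX_URL)
      uses: ./.github/actions/setup-jfrog
    - uses: actions/setup-python@v5   # pinned to a SHA in the real workflows
      with:
        python-version: ${{ inputs.python-version }}
    - name: Install Poetry through the JFrog proxy
      run: pip install poetry==2.2.1
      shell: bash
    - name: Point Poetry at JFrog and refresh lock metadata
      run: |
        poetry source add --priority=primary jfrog "$PIP_INDEX_URL"
        poetry lock
      shell: bash
    - run: poetry install
      shell: bash
```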
  3. [PECOBLR-1928] Add AI coding agent detection to User-Agent header (#740)

    Add AI coding agent detection to User-Agent header
    
    Detect when the Python SQL connector is invoked by an AI coding agent
    (e.g. Claude Code, Cursor, Gemini CLI) by checking well-known
    environment variables, and append `agent/<product>` to the User-Agent
    string.
    
    This enables Databricks to understand how much driver usage originates
    from AI coding agents. Detection only succeeds when exactly one agent
    is detected to avoid ambiguous attribution.
    
    Mirrors the approach in databricks/cli#4287.
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
    vikrantpuppala and claude authored Apr 13, 2026 · 32c446b
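The "exactly one agent" rule above can be sketched as follows. The environment variable names here are illustrative guesses, not the connector's confirmed detection list:

```python
import os

# Sketch of exactly-one agent detection. The env var names below are
# illustrative assumptions, not the connector's actual list.
AGENT_ENV_VARS = {
    "CLAUDECODE": "claude-code",
    "CURSOR_TRACE_ID": "cursor",
    "GEMINI_CLI": "gemini-cli",
}

def detect_agent(env=None):
    """Return 'agent/<product>' iff exactly one known agent marker is set."""
    env = os.environ if env is None else env
    found = [product for var, product in AGENT_ENV_VARS.items() if env.get(var)]
    if len(found) == 1:
        return f"agent/{found[0]}"
    return None  # zero or multiple matches: attribution would be ambiguous

print(detect_agent({"CLAUDECODE": "1"}))
```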

Commits on Apr 14, 2026

  1. Optimize CI: consolidate workflows, fix caching, speed up e2e tests (47min → 15min) (#772)
    
    * Optimize CI: consolidate workflows, fix caching, speed up e2e tests
    
    Workflow consolidation:
    - Delete integration.yml and daily-telemetry-e2e.yml (redundant with
      coverage workflow which already runs all e2e tests)
    - Add push-to-main trigger to coverage workflow
    - Run all tests (including telemetry) in single pytest invocation with
      --dist=loadgroup to respect xdist_group markers for isolation
    
    Fix pyarrow cache:
    - Remove cache-path: .venv-pyarrow from pyarrow jobs. Poetry always
      creates .venv regardless of the cache-path input, so the cache was
      never saved ("Path does not exist" error). The cache-suffix already
      differentiates keys between variants.
    
    Fix 3.14 post-test DNS hang:
    - Add enable_telemetry=False to unit test DUMMY_CONNECTION_ARGS that
      use server_hostname="foo". This prevents FeatureFlagsContext from
      making real HTTP calls to fake hosts, eliminating ~8min hang from
      ThreadPoolExecutor threads timing out on DNS on protected runners.
    
    Improve e2e test parallelization:
    - Split TestPySQLLargeQueriesSuite into 3 separate classes
      (TestPySQLLargeWideResultSet, TestPySQLLargeNarrowResultSet,
      TestPySQLLongRunningQuery) so xdist distributes them across workers
      instead of all landing on one.
    
    Speed up slow tests:
    - Reduce large result set sizes from 300MB to 100MB (still validates
      large fetches, lz4, chunking, row integrity)
    - Start test_long_running_query at scale_factor=50 instead of 1 to
      skip ramp-up iterations that finish instantly
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Further optimize e2e: 4 workers, lower long-query threshold, split lz4
    
    - Use -n 4 instead of -n auto in coverage workflow. The e2e tests are
      network-bound (waiting on warehouse), not CPU-bound, so 4 workers on
      a 2-CPU runner is fine and doubles parallelism.
    - Lower test_long_running_query min_duration from 3 min to 1 min.
      The test validates long-running query completion — 1 minute is
      sufficient and saves ~4 min per variant.
    - Split lz4 on/off loop in test_query_with_large_wide_result_set into
      separate parametrized test cases so xdist can run them on different
      workers instead of sequentially in one test.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Address review: inline test methods, drop mixin pattern
    
    Per review feedback from jprakash-db:
    - Remove mixin classes (LargeWideResultSetMixin, etc) — inline the
      test methods directly into the test classes in test_driver.py
    - Remove backward-compat LargeQueriesMixin alias (nothing uses it)
    - Rename _LargeQueryRowHelper — replaced entirely by inlining
    - Convert large_queries_mixin.py to just a fetch_rows() helper function
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    ---------
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 14, 2026 · c46b3a0
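The isolation mechanism the consolidated workflow relies on is pytest-xdist's group marker: tests sharing an `xdist_group` name run on the same worker when pytest is invoked with `--dist=loadgroup`. A minimal sketch (the test body and group name are illustrative):

```python
import pytest

# Tests marked with the same xdist_group run on one worker under
# --dist=loadgroup, so telemetry tests stay isolated from the rest.
@pytest.mark.xdist_group("telemetry")
def test_telemetry_flush():
    assert True

# Invocation shape used by the coverage workflow (per the commit message):
#   pytest -n 4 --dist=loadgroup
```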

Commits on Apr 20, 2026

  1. Bump thrift to fix deprecation warning (#733)

    Signed-off-by: Korijn van Golen <k.vangolen@mapiq.com>
    Korijn authored Apr 20, 2026 · 9031863

Commits on Apr 21, 2026

  1. Fix dependency_manager: handle PEP 440 ~= compatible release syntax (#776)
    
    The _extract_versions_from_specifier function stripped a single `~`
    character from constraint strings, which corrupted PEP 440 compatible
    release syntax (`~=`) by leaving a stray `=`. For example,
    `thrift = "~=0.22.0"` produced the invalid constraint
    `thrift>==0.22.0,<=0.23.0`, breaking every PR's "Unit Tests (min deps)"
    job since #733 was merged.
    
    Add an explicit branch for `~=` that strips both characters before
    extracting the minimum version. The Poetry-style single `~` branch is
    preserved for backward compatibility.
    
    Co-authored-by: Isaac
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 21, 2026 · b088a35
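The shape of the fix: check for the two-character `~=` operator before the single-character Poetry `~`, so `~=0.22.0` yields the minimum version `0.22.0` instead of the corrupted `=0.22.0`. A minimal sketch (the function name is illustrative, not the repo's actual helper):

```python
# Sketch: handle PEP 440 compatible release ("~=") before the Poetry-style
# single "~", preserving the old branch for backward compatibility.
def extract_min_version(spec):
    spec = spec.strip()
    if spec.startswith("~="):        # PEP 440 compatible release
        return spec[2:].strip()
    if spec.startswith("~"):         # Poetry tilde constraint
        return spec[1:].strip()
    for prefix in (">=", "==", "^"):
        if spec.startswith(prefix):
            return spec[len(prefix):].strip()
    return spec

print(extract_min_version("~=0.22.0"))  # 0.22.0, not =0.22.0
```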
  2. [PECOBLR-2461] Add comprehensive MST transaction E2E tests (#775)

    * Add comprehensive MST transaction E2E tests
    
    Replaces the prior speculative test skeleton with 42 tests across 5
    categories:
    
    - TestMstCorrectness (18): commit/rollback/isolation/multi-table
      atomicity/repeatable reads/write conflict/parameterized DML/etc.
    - TestMstApi (6): DB-API-specific — autocommit, isolation level,
      error handling.
    - TestMstMetadata (6): cursor.columns/tables/schemas/catalogs inside
      a transaction, plus two freshness tests asserting Thrift metadata
      RPCs are non-transactional (they see concurrent DDL that the txn
      should not see).
    - TestMstBlockedSql (9): MSTCheckRule enforcement. Some SHOW/DESCRIBE
      commands throw + abort txn, others succeed silently on Python/Thrift
      (diverges from JDBC). Both behaviors are explicitly tested so
      regressions in either direction are caught.
    - TestMstExecuteVariants (2): executemany commit/rollback.
    
    Parallelisation:
    - Each test uses a unique Delta table derived from its test name so
      pytest-xdist workers don't collide on shared state.
    - Tests that spawn concurrent connections to the same table
      (repeatable reads, write conflict, freshness) use xdist_group so
      the concurrent connections within a single test don't conflict with
      other tests on different workers.
    
    Runtime: ~2 minutes on 4 workers (pytest -n 4 --dist=loadgroup),
    well within the existing e2e budget.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Fix TestMstBlockedSql: SHOW COLUMNS and DESCRIBE QUERY are blocked
    
    CI caught that the initial "not blocked" assertions were wrong — the
    server returns TRANSACTION_NOT_SUPPORTED.COMMAND for SHOW COLUMNS
    (ShowDeltaTableColumnsCommand) and DESCRIBE QUERY (DescribeQueryCommand)
    inside an active transaction.
    
    The server's error message explicitly lists the allowed commands:
    "Only SELECT / INSERT / MERGE / UPDATE / DELETE / DESCRIBE TABLE are
    supported." DESCRIBE TABLE (basic) remains the only DESCRIBE variant
    that is allowed.
    
    Earlier dogfood runs showed SHOW COLUMNS / DESCRIBE QUERY succeeding —
    likely because the dogfood warehouse DBR is older than CI. Aligning
    tests with the current/CI server behavior.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Address PR review comments
    
    - test_auto_start_after_commit: assert the rolled-back id=2 is NOT
      present (use _get_ids set equality instead of just row count).
    - test_auto_start_after_rollback: same pattern — assert the
      rolled-back id=1 is NOT present.
    - test_commit_without_active_txn_throws: match specific
      NO_ACTIVE_TRANSACTION server error code to ensure we're catching
      the right exception, not an unrelated one.
    
    Add _get_ids() helper for checking the exact set of persisted ids.
    
    Verified 42/42 pass against pecotesting in ~1:36 (4 workers).
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    ---------
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 21, 2026 · d872075
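The commit/rollback assertion pattern these tests exercise (asserting the exact set of persisted ids rather than a row count) can be illustrated with sqlite3 as a stand-in DB-API backend; the real tests run against a Databricks warehouse with multi-statement transactions:

```python
import sqlite3

# Stand-in illustration of the MST correctness pattern: commit persists,
# rollback does not, and the assertion checks set membership of ids.
conn = sqlite3.connect(":memory:")
conn.isolation_level = "DEFERRED"    # explicit transactions, manual commit
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER)")
conn.commit()

cur.execute("INSERT INTO t VALUES (1)")
conn.commit()                        # id=1 persists

cur.execute("INSERT INTO t VALUES (2)")
conn.rollback()                      # id=2 must NOT persist

cur.execute("SELECT id FROM t")
ids = {row[0] for row in cur.fetchall()}
print(ids)  # {1}: the rolled-back id=2 is absent
```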
  3. Add SPOG routing support for account-level vanity URLs (#767)

    * Add SPOG routing support for account-level vanity URLs
    
    SPOG replaces per-workspace hostnames with account-level URLs. When
    httpPath contains ?o=<workspaceId>, the connector now extracts the
    workspace ID and injects x-databricks-org-id as an HTTP header on all
    non-OAuth endpoints (SEA, telemetry, feature flags).
    
    Changes:
    - Fix warehouse ID regex to stop at query params ([^?&]+ instead of .+)
    - Extract ?o= from httpPath once during session init, store as _spog_headers
    - Propagate org-id header to telemetry client via extra_headers param
    - Propagate org-id header to feature flags client
    - Do NOT propagate to OAuth endpoints (they reject it with 400)
    
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    
    Co-authored-by: Isaac
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    
    * Add debug logging for SPOG x-databricks-org-id header extraction
    
    Mirrors the JDBC driver's logging pattern. Emits at DEBUG level in three
    code paths of _extract_spog_headers:
    
    1. http_path has a query string but no ?o= param — log and skip.
    2. x-databricks-org-id already set by the caller (via http_headers) —
       log and skip (don't override explicit user header).
    3. Injection happens — log the extracted workspace ID so customers
       diagnosing SPOG routing can confirm the header was added.
    
    Helps with customer support: when a customer reports "SPOG isn't
    routing correctly", they can enable DEBUG logging and immediately see
    whether the connector saw their ?o= value.
    
    Signed-off-by: Madhavendra Rathore
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    
    ---------
    
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    Signed-off-by: Madhavendra Rathore
    msrathore-db authored Apr 21, 2026 · ff921ed
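The extraction logic described above can be sketched with the standard library (the helper name is illustrative; the header name and the don't-override rule come from the commit message):

```python
from urllib.parse import urlparse, parse_qs

# Sketch of SPOG header extraction: pull the workspace ID from ?o= in
# http_path and build x-databricks-org-id, never overriding an explicit
# user-supplied header.
def extract_spog_headers(http_path, existing_headers=None):
    existing_headers = existing_headers or {}
    if "x-databricks-org-id" in existing_headers:
        return {}                     # caller set it explicitly: skip
    query = parse_qs(urlparse(http_path).query)
    org_ids = query.get("o")
    if not org_ids:
        return {}                     # no ?o= param: nothing to inject
    return {"x-databricks-org-id": org_ids[0]}

print(extract_spog_headers("/sql/1.0/warehouses/abc123?o=1234567890"))
```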

Commits on Apr 23, 2026

  1. Bump to version 4.2.6 (#777)

    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    msrathore-db authored Apr 23, 2026 · 2926daa
  2. [PECOBLR-2461] Fix test_show_columns_blocked: SHOW COLUMNS now allowed in MST (#778)
    
    The server's MSTCheckRule allowlist has been broadened to include
    SHOW COLUMNS (ShowDeltaTableColumnsCommand). Flip the test to assert
    SHOW COLUMNS succeeds inside an MST transaction, matching the pattern
    already used by test_describe_table_not_blocked.
    
    Other SHOW variants (SHOW SCHEMAS/TABLES/CATALOGS/FUNCTIONS),
    DESCRIBE QUERY, DESCRIBE TABLE EXTENDED, and information_schema remain
    blocked as expected.
    
    Co-authored-by: Isaac
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 23, 2026 · ee63b81