Comparing changes

Repository: databricks/databricks-sql-python
base: 9fe7356
compare: ee63b81

  • 17 commits
  • 37 files changed
  • 8 contributors

Commits on Mar 2, 2026

  1. Add query_tags parameter support for execute methods (#736)

    * Add statement level query tag support by introducing it as a parameter on execute* methods
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * Add query_tags support to executemany method
    
    - Added query_tags parameter to executemany() method
    - Query tags are applied to all queries in the batch
    - Updated example to demonstrate executemany usage with query_tags
    - All tests pass (122/122 client tests)
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * add example that doesn't have tag
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * fix presubmit errors
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * another lint
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * address review comments
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    ---------
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    jiabin-hu authored Mar 2, 2026 · 38097f2
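The statement-level tags described above are passed as a dict to the execute* methods. A minimal sketch of how such tags might be flattened for transmission; `format_query_tags` and the `key:value,key:value` wire format are illustrative assumptions, not the connector's confirmed internals:

```python
# Hypothetical sketch: serializing statement-level query tags. The
# comma-separated "key:value" format is an assumption for illustration.
def format_query_tags(tags):
    """Render a tag dict as a comma-separated key:value string."""
    return ",".join(f"{k}:{v}" for k, v in tags.items())

# Hypothetical usage against the connector's execute()/executemany():
#   cursor.execute("SELECT 1", query_tags={"team": "analytics"})
#   cursor.executemany(sql, rows, query_tags={"job": "nightly-load"})

print(format_query_tags({"team": "analytics", "env": "prod"}))
```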

Commits on Mar 9, 2026

  1. [QI-3367] Allow specifying query tags as a dict upon connection creation (#749)

    * Allow specifying query tags as a dict upon connection creation
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    * fix comment
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    
    ---------
    
    Signed-off-by: Jiabin Hu <jiabin.hu@databricks.com>
    jiabin-hu authored Mar 9, 2026 · e916f71
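With tags now settable both at connection creation and per statement, one plausible precedence rule is that statement-level tags override connection-level defaults. The override direction and the helper name below are assumptions for illustration, not the connector's documented behavior:

```python
# Sketch of a plausible precedence rule: connection-level tags act as
# defaults, statement-level tags win on conflict. resolve_query_tags is a
# hypothetical helper, not part of the connector's API.
def resolve_query_tags(connection_tags, statement_tags):
    merged = dict(connection_tags or {})
    merged.update(statement_tags or {})
    return merged

print(resolve_query_tags({"env": "prod", "team": "data"}, {"team": "analytics"}))
```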

Commits on Mar 10, 2026

  1. Fix float inference to use DoubleParameter (64-bit) instead of FloatParameter (32-bit) (#742)
    
    * Fix float inference to use DoubleParameter (64-bit) instead of FloatParameter (32-bit)
    
    Signed-off-by: Shubhambhusate <bhusates6@gmail.com>
    
    * Add DoubleParameter with Primitive.DOUBLE to test_inference coverage
    
    ---------
    
    Signed-off-by: Shubhambhusate <bhusates6@gmail.com>
    Shubhambhusate authored Mar 10, 2026 · 12bfd5b
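The motivation for this fix can be shown with the standard library alone: a Python `float` is a 64-bit IEEE 754 double, so round-tripping it through a 32-bit float silently loses precision, which is why inference should map to DoubleParameter:

```python
import struct

# A Python float is a 64-bit double. Round-trip the same value through a
# 32-bit float ("f") and a 64-bit double ("d") to show the precision loss.
value = 1.1

as_float32 = struct.unpack("f", struct.pack("f", value))[0]
as_float64 = struct.unpack("d", struct.pack("d", value))[0]

print(as_float64 == value)   # True: no precision lost at 64 bits
print(as_float32 == value)   # False: 1.1 is not exactly representable in 32 bits
```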
  2. Updated the PyArrow concatenation of tables to use promote_options as default (#751)
    
    Updated pyarrow-concat
    jprakash-db authored Mar 10, 2026 · 36fb376

Commits on Mar 16, 2026

  1. Add statement-level query_tags support for SEA backend (#754)

    * Add statement-level query_tags support for SEA backend
    
    Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
    
    * Simplify None handling in query_tags serialization
    
    Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
    
    ---------
    
    Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
    sreekanth-db authored Mar 16, 2026 · ca4d7bc

Commits on Mar 27, 2026

  1. Harden CI/CD workflows: fix secret exposure, script injection, and pin all actions to SHA (#762)
    
    Updated the security
    jprakash-db authored Mar 27, 2026 · 330c445
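The two hardening patterns named in the title look roughly like the fragment below (the SHA, step names, and event field are illustrative, not taken from this repo's workflows): pin each action to a full commit SHA rather than a mutable tag, and pass untrusted event fields through `env` instead of interpolating them into the script body, which is the standard defense against script injection.

```yaml
# Illustrative fragment (hypothetical SHA and step names).
steps:
  - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683  # v4.2.2, pinned to SHA
  - name: Use PR title safely
    env:
      PR_TITLE: ${{ github.event.pull_request.title }}  # no inline ${{ }} inside run:
    run: echo "Title: $PR_TITLE"
```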

Commits on Mar 30, 2026

  1. 4793353 (commit title not shown)

Commits on Apr 13, 2026

  1. Replace third-party DCO action with custom script (#769)

    The tisonkun/actions-dco action has been unreliable. Replace it with an
    inline bash script (matching databricks-sql-go) that checks each commit
    for a Signed-off-by line, provides clear per-commit feedback, and scopes
    the trigger to opened/synchronize/reopened events on main.
    
    Co-authored-by: Isaac
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 13, 2026 · e056275
  2. Migrate CI to protected runners and JFrog PyPI proxy (#770)

    * Migrate CI to databricks-protected runners and route PyPI through JFrog
    
    Protected runners are required for Databricks OSS repos. Add a
    setup-jfrog composite action (OIDC-based, matching databricks-odbc) that
    sets PIP_INDEX_URL so all pip/poetry installs go through the JFrog PyPI
    proxy. Every workflow now runs on the databricks-protected-runner-group
    with id-token: write for the OIDC exchange.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Add Poetry JFrog source configuration to all workflows
    
    The previous commit only set PIP_INDEX_URL, but Poetry uses its own
    resolver and needs explicit source configuration. Add a
    "Configure Poetry for JFrog" step after poetry install in every job
    that sets up the JFrog repository and credentials, then adds it as
    the primary source for the project.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Fix step ordering: move JFrog setup after poetry install
    
    The snok/install-poetry action uses pip internally to install poetry.
    When PIP_INDEX_URL was set before this step, the installer tried to
    route through JFrog and failed with an SSL error. Move the JFrog OIDC
    token + PIP_INDEX_URL + poetry source configuration to run after
    Install Poetry but before poetry install.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Replace snok/install-poetry with pip install through JFrog
    
    The hardened runners block direct access to install.python-poetry.org,
    causing snok/install-poetry to fail with SSL errors. Replace it with
    `pip install poetry==2.2.1` which routes through the JFrog PyPI proxy.
    
    New step ordering: checkout → setup-python → Setup JFrog (OIDC +
    PIP_INDEX_URL) → pip install poetry → Configure Poetry for JFrog →
    poetry install.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Add poetry lock --no-update after source add to fix lock mismatch
    
    poetry source add modifies pyproject.toml, which makes poetry refuse
    to install from the existing lock file. Running poetry lock --no-update
    regenerates the lock file metadata without changing dependency versions.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Fix poetry lock flag and YAML indentation
    
    Poetry 2.x doesn't have --no-update flag, use poetry lock instead.
    Also fix indentation of poetry lock in the arrow test job.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Move JFrog setup before setup-python, matching sqlalchemy pattern
    
    Follow the proven pattern from databricks/databricks-sqlalchemy#59:
    checkout → Setup JFrog → setup-python → pip install poetry → poetry
    source add + poetry lock → poetry install.
    
    The hardened runners block pypi.org at the network level, so JFrog
    must be configured before actions/setup-python (which upgrades pip).
    Also simplified workflows by removing verbose section comments.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Extract setup-poetry composite action to remove duplication
    
    Create .github/actions/setup-poetry that bundles JFrog setup,
    setup-python, poetry install via pip, JFrog source config, cache,
    and dependency install into a single reusable action with inputs
    for python-version, install-args, cache-path, and cache-suffix.
    
    All workflows now call setup-poetry instead of repeating these steps,
    matching the pattern from databricks/databricks-sqlalchemy#59.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    ---------
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 13, 2026 · fbdcd32
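The final step ordering the commit series converged on can be sketched as a composite action like the following (step names, inputs, and the `poetry source add` invocation are illustrative reconstructions from the commit messages, not the actual action file):

```yaml
# Hypothetical sketch of the setup-poetry composite action's ordering:
# JFrog first, so every later pip call is already routed through the proxy.
runs:
  using: composite
  steps:
    - name: Setup JFrog (OIDC exchange, sets PIP_INDEX_URL)
      uses: ./.github/actions/setup-jfrog
    - uses: actions/setup-python@v5   # pinned to a SHA in the real workflows
      with:
        python-version: ${{ inputs.python-version }}
    - name: Install Poetry through the JFrog proxy
      run: pip install poetry==2.2.1
      shell: bash
    - name: Point Poetry at JFrog and refresh lock metadata
      run: |
        poetry source add --priority=primary jfrog "$PIP_INDEX_URL"
        poetry lock
      shell: bash
    - run: poetry install
      shell: bash
```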
  3. [PECOBLR-1928] Add AI coding agent detection to User-Agent header (#740)

    Add AI coding agent detection to User-Agent header
    
    Detect when the Python SQL connector is invoked by an AI coding agent
    (e.g. Claude Code, Cursor, Gemini CLI) by checking well-known
    environment variables, and append `agent/<product>` to the User-Agent
    string.
    
    This enables Databricks to understand how much driver usage originates
    from AI coding agents. Detection only succeeds when exactly one agent
    is detected to avoid ambiguous attribution.
    
    Mirrors the approach in databricks/cli#4287.
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
    vikrantpuppala and claude authored Apr 13, 2026 · 32c446b
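The "exactly one agent" rule above can be sketched as follows. The environment variable names here are illustrative guesses, not the connector's confirmed detection list:

```python
import os

# Sketch of exactly-one agent detection. The env var names below are
# illustrative assumptions, not the connector's actual list.
AGENT_ENV_VARS = {
    "CLAUDECODE": "claude-code",
    "CURSOR_TRACE_ID": "cursor",
    "GEMINI_CLI": "gemini-cli",
}

def detect_agent(env=None):
    """Return 'agent/<product>' iff exactly one known agent marker is set."""
    env = os.environ if env is None else env
    found = [product for var, product in AGENT_ENV_VARS.items() if env.get(var)]
    if len(found) == 1:
        return f"agent/{found[0]}"
    return None  # zero or multiple matches: attribution would be ambiguous

print(detect_agent({"CLAUDECODE": "1"}))
```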

Commits on Apr 14, 2026

  1. Optimize CI: consolidate workflows, fix caching, speed up e2e tests (47min → 15min) (#772)
    
    * Optimize CI: consolidate workflows, fix caching, speed up e2e tests
    
    Workflow consolidation:
    - Delete integration.yml and daily-telemetry-e2e.yml (redundant with
      coverage workflow which already runs all e2e tests)
    - Add push-to-main trigger to coverage workflow
    - Run all tests (including telemetry) in single pytest invocation with
      --dist=loadgroup to respect xdist_group markers for isolation
    
    Fix pyarrow cache:
    - Remove cache-path: .venv-pyarrow from pyarrow jobs. Poetry always
      creates .venv regardless of the cache-path input, so the cache was
      never saved ("Path does not exist" error). The cache-suffix already
      differentiates keys between variants.
    
    Fix 3.14 post-test DNS hang:
    - Add enable_telemetry=False to unit test DUMMY_CONNECTION_ARGS that
      use server_hostname="foo". This prevents FeatureFlagsContext from
      making real HTTP calls to fake hosts, eliminating ~8min hang from
      ThreadPoolExecutor threads timing out on DNS on protected runners.
    
    Improve e2e test parallelization:
    - Split TestPySQLLargeQueriesSuite into 3 separate classes
      (TestPySQLLargeWideResultSet, TestPySQLLargeNarrowResultSet,
      TestPySQLLongRunningQuery) so xdist distributes them across workers
      instead of all landing on one.
    
    Speed up slow tests:
    - Reduce large result set sizes from 300MB to 100MB (still validates
      large fetches, lz4, chunking, row integrity)
    - Start test_long_running_query at scale_factor=50 instead of 1 to
      skip ramp-up iterations that finish instantly
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Further optimize e2e: 4 workers, lower long-query threshold, split lz4
    
    - Use -n 4 instead of -n auto in coverage workflow. The e2e tests are
      network-bound (waiting on warehouse), not CPU-bound, so 4 workers on
      a 2-CPU runner is fine and doubles parallelism.
    - Lower test_long_running_query min_duration from 3 min to 1 min.
      The test validates long-running query completion — 1 minute is
      sufficient and saves ~4 min per variant.
    - Split lz4 on/off loop in test_query_with_large_wide_result_set into
      separate parametrized test cases so xdist can run them on different
      workers instead of sequentially in one test.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Address review: inline test methods, drop mixin pattern
    
    Per review feedback from jprakash-db:
    - Remove mixin classes (LargeWideResultSetMixin, etc) — inline the
      test methods directly into the test classes in test_driver.py
    - Remove backward-compat LargeQueriesMixin alias (nothing uses it)
    - Rename _LargeQueryRowHelper — replaced entirely by inlining
    - Convert large_queries_mixin.py to just a fetch_rows() helper function
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    ---------
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 14, 2026 · c46b3a0
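The isolation mechanism the consolidated workflow relies on is pytest-xdist's group marker: tests sharing an `xdist_group` name run on the same worker when pytest is invoked with `--dist=loadgroup`. A minimal sketch (the test body and group name are illustrative):

```python
import pytest

# Tests marked with the same xdist_group run on one worker under
# --dist=loadgroup, so telemetry tests stay isolated from the rest.
@pytest.mark.xdist_group("telemetry")
def test_telemetry_flush():
    assert True

# Invocation shape used by the coverage workflow (per the commit message):
#   pytest -n 4 --dist=loadgroup
```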

Commits on Apr 20, 2026

  1. Bump thrift to fix deprecation warning (#733)

    Signed-off-by: Korijn van Golen <k.vangolen@mapiq.com>
    Korijn authored Apr 20, 2026 · 9031863

Commits on Apr 21, 2026

  1. Fix dependency_manager: handle PEP 440 ~= compatible release syntax (#776)
    
    The _extract_versions_from_specifier function stripped a single `~`
    character from constraint strings, which corrupted PEP 440 compatible
    release syntax (`~=`) by leaving a stray `=`. For example,
    `thrift = "~=0.22.0"` produced the invalid constraint
    `thrift>==0.22.0,<=0.23.0`, breaking every PR's "Unit Tests (min deps)"
    job since #733 was merged.
    
    Add an explicit branch for `~=` that strips both characters before
    extracting the minimum version. The Poetry-style single `~` branch is
    preserved for backward compatibility.
    
    Co-authored-by: Isaac
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 21, 2026 · b088a35
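The shape of the fix: check for the two-character `~=` operator before the single-character Poetry `~`, so `~=0.22.0` yields the minimum version `0.22.0` instead of the corrupted `=0.22.0`. A minimal sketch (the function name is illustrative, not the repo's actual helper):

```python
# Sketch: handle PEP 440 compatible release ("~=") before the Poetry-style
# single "~", preserving the old branch for backward compatibility.
def extract_min_version(spec):
    spec = spec.strip()
    if spec.startswith("~="):        # PEP 440 compatible release
        return spec[2:].strip()
    if spec.startswith("~"):         # Poetry tilde constraint
        return spec[1:].strip()
    for prefix in (">=", "==", "^"):
        if spec.startswith(prefix):
            return spec[len(prefix):].strip()
    return spec

print(extract_min_version("~=0.22.0"))  # 0.22.0, not =0.22.0
```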
  2. [PECOBLR-2461] Add comprehensive MST transaction E2E tests (#775)

    * Add comprehensive MST transaction E2E tests
    
    Replaces the prior speculative test skeleton with 42 tests across 5
    categories:
    
    - TestMstCorrectness (18): commit/rollback/isolation/multi-table
      atomicity/repeatable reads/write conflict/parameterized DML/etc.
    - TestMstApi (6): DB-API-specific — autocommit, isolation level,
      error handling.
    - TestMstMetadata (6): cursor.columns/tables/schemas/catalogs inside
      a transaction, plus two freshness tests asserting Thrift metadata
      RPCs are non-transactional (they see concurrent DDL that the txn
      should not see).
    - TestMstBlockedSql (9): MSTCheckRule enforcement. Some SHOW/DESCRIBE
      commands throw + abort txn, others succeed silently on Python/Thrift
      (diverges from JDBC). Both behaviors are explicitly tested so
      regressions in either direction are caught.
    - TestMstExecuteVariants (2): executemany commit/rollback.
    
    Parallelisation:
    - Each test uses a unique Delta table derived from its test name so
      pytest-xdist workers don't collide on shared state.
    - Tests that spawn concurrent connections to the same table
      (repeatable reads, write conflict, freshness) use xdist_group so
      the concurrent connections within a single test don't conflict with
      other tests on different workers.
    
    Runtime: ~2 minutes on 4 workers (pytest -n 4 --dist=loadgroup),
    well within the existing e2e budget.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Fix TestMstBlockedSql: SHOW COLUMNS and DESCRIBE QUERY are blocked
    
    CI caught that the initial "not blocked" assertions were wrong — the
    server returns TRANSACTION_NOT_SUPPORTED.COMMAND for SHOW COLUMNS
    (ShowDeltaTableColumnsCommand) and DESCRIBE QUERY (DescribeQueryCommand)
    inside an active transaction.
    
    The server's error message explicitly lists the allowed commands:
    "Only SELECT / INSERT / MERGE / UPDATE / DELETE / DESCRIBE TABLE are
    supported." DESCRIBE TABLE (basic) remains the only DESCRIBE variant
    that is allowed.
    
    Earlier dogfood runs showed SHOW COLUMNS / DESCRIBE QUERY succeeding —
    likely because the dogfood warehouse DBR is older than CI. Aligning
    tests with the current/CI server behavior.
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    * Address PR review comments
    
    - test_auto_start_after_commit: assert the rolled-back id=2 is NOT
      present (use _get_ids set equality instead of just row count).
    - test_auto_start_after_rollback: same pattern — assert the
      rolled-back id=1 is NOT present.
    - test_commit_without_active_txn_throws: match specific
      NO_ACTIVE_TRANSACTION server error code to ensure we're catching
      the right exception, not an unrelated one.
    
    Add _get_ids() helper for checking the exact set of persisted ids.
    
    Verified 42/42 pass against pecotesting in ~1:36 (4 workers).
    
    Co-authored-by: Isaac
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    
    ---------
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 21, 2026 · d872075
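The commit/rollback assertion pattern these tests exercise (asserting the exact set of persisted ids rather than a row count) can be illustrated with sqlite3 as a stand-in DB-API backend; the real tests run against a Databricks warehouse with multi-statement transactions:

```python
import sqlite3

# Stand-in illustration of the MST correctness pattern: commit persists,
# rollback does not, and the assertion checks set membership of ids.
conn = sqlite3.connect(":memory:")
conn.isolation_level = "DEFERRED"    # explicit transactions, manual commit
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER)")
conn.commit()

cur.execute("INSERT INTO t VALUES (1)")
conn.commit()                        # id=1 persists

cur.execute("INSERT INTO t VALUES (2)")
conn.rollback()                      # id=2 must NOT persist

cur.execute("SELECT id FROM t")
ids = {row[0] for row in cur.fetchall()}
print(ids)  # {1}: the rolled-back id=2 is absent
```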
  3. Add SPOG routing support for account-level vanity URLs (#767)

    * Add SPOG routing support for account-level vanity URLs
    
    SPOG replaces per-workspace hostnames with account-level URLs. When
    httpPath contains ?o=<workspaceId>, the connector now extracts the
    workspace ID and injects x-databricks-org-id as an HTTP header on all
    non-OAuth endpoints (SEA, telemetry, feature flags).
    
    Changes:
    - Fix warehouse ID regex to stop at query params ([^?&]+ instead of .+)
    - Extract ?o= from httpPath once during session init, store as _spog_headers
    - Propagate org-id header to telemetry client via extra_headers param
    - Propagate org-id header to feature flags client
    - Do NOT propagate to OAuth endpoints (they reject it with 400)
    
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    
    Co-authored-by: Isaac
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    
    * Add debug logging for SPOG x-databricks-org-id header extraction
    
    Mirrors the JDBC driver's logging pattern. Emits at DEBUG level in three
    code paths of _extract_spog_headers:
    
    1. http_path has a query string but no ?o= param — log and skip.
    2. x-databricks-org-id already set by the caller (via http_headers) —
       log and skip (don't override explicit user header).
    3. Injection happens — log the extracted workspace ID so customers
       diagnosing SPOG routing can confirm the header was added.
    
    Helps with customer support: when a customer reports "SPOG isn't
    routing correctly", they can enable DEBUG logging and immediately see
    whether the connector saw their ?o= value.
    
    Signed-off-by: Madhavendra Rathore
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    
    ---------
    
    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    Signed-off-by: Madhavendra Rathore
    msrathore-db authored Apr 21, 2026 · ff921ed
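The extraction logic described above can be sketched with the standard library (the helper name is illustrative; the header name and the don't-override rule come from the commit message):

```python
from urllib.parse import urlparse, parse_qs

# Sketch of SPOG header extraction: pull the workspace ID from ?o= in
# http_path and build x-databricks-org-id, never overriding an explicit
# user-supplied header.
def extract_spog_headers(http_path, existing_headers=None):
    existing_headers = existing_headers or {}
    if "x-databricks-org-id" in existing_headers:
        return {}                     # caller set it explicitly: skip
    query = parse_qs(urlparse(http_path).query)
    org_ids = query.get("o")
    if not org_ids:
        return {}                     # no ?o= param: nothing to inject
    return {"x-databricks-org-id": org_ids[0]}

print(extract_spog_headers("/sql/1.0/warehouses/abc123?o=1234567890"))
```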

Commits on Apr 23, 2026

  1. Bump to version 4.2.6 (#777)

    Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
    msrathore-db authored Apr 23, 2026 · 2926daa
  2. [PECOBLR-2461] Fix test_show_columns_blocked: SHOW COLUMNS now allowed in MST (#778)
    
    The server's MSTCheckRule allowlist has been broadened to include
    SHOW COLUMNS (ShowDeltaTableColumnsCommand). Flip the test to assert
    SHOW COLUMNS succeeds inside an MST transaction, matching the pattern
    already used by test_describe_table_not_blocked.
    
    Other SHOW variants (SHOW SCHEMAS/TABLES/CATALOGS/FUNCTIONS),
    DESCRIBE QUERY, DESCRIBE TABLE EXTENDED, and information_schema remain
    blocked as expected.
    
    Co-authored-by: Isaac
    
    Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
    vikrantpuppala authored Apr 23, 2026 · ee63b81