feat: Implement Databricks Unity Catalog offline store integration #6515

Open: falloficaruss wants to merge 3 commits into feast-dev:master from falloficaruss:feat/databricks-uc-provider

@falloficaruss

What this PR does / why we need it:

  • Implemented DatabricksUCOfflineStoreConfig (inheriting from SparkOfflineStoreConfig) to specify workspace, token, cluster_id, default_catalog, and default_schema.
  • Implemented get_databricks_session helper that securely handles active sessions or connects remotely using Databricks Connect V2 configurations.
  • Implemented DatabricksUCOfflineStore which inherits from SparkOfflineStore, overriding operations to guarantee Databricks session initialization before executing queries.
  • Registered "databricks_uc" in the first-class OFFLINE_STORE_CLASS_FOR_TYPE mapping.
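
The session-handling behaviour described above (reuse an active session, otherwise connect remotely) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `UCConfigSketch` and `get_session_sketch` are hypothetical names, only the five field names come from the description, and the check for missing connection fields is an assumption of this sketch. In a real implementation the `connect_remote` callable would presumably wrap Databricks Connect's `DatabricksSession.builder.remote(...)`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class UCConfigSketch:
    """Hypothetical stand-in for DatabricksUCOfflineStoreConfig.

    Only the field names are taken from the PR description; the real
    class inherits from SparkOfflineStoreConfig.
    """

    workspace: Optional[str] = None
    token: Optional[str] = None
    cluster_id: Optional[str] = None
    default_catalog: Optional[str] = None
    default_schema: Optional[str] = None


def get_session_sketch(
    config: UCConfigSketch,
    get_active: Callable[[], Optional[Any]],
    connect_remote: Callable[[str, str, str], Any],
) -> Any:
    """Return the active session if there is one, else connect remotely."""
    session = get_active()
    if session is not None:
        return session

    # Assumption of this sketch: fail early when connection details are absent.
    missing = [
        name
        for name in ("workspace", "token", "cluster_id")
        if not getattr(config, name)
    ]
    if missing:
        raise ValueError(f"cannot connect remotely, missing: {', '.join(missing)}")

    return connect_remote(config.workspace, config.token, config.cluster_id)
```

Passing the two session factories in as callables keeps the sketch runnable without Spark or Databricks Connect installed, and makes the fallback order easy to test.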

Which issue(s) this PR fixes:

Partially fixes #6499

Checks

  • I've made sure the tests are passing.
  • My commits are signed off (git commit -s)
  • My PR title follows conventional commits format

Testing Strategy

  • Unit tests
  • Integration tests
  • Manual tests
  • Testing is not required for this change

Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
@falloficaruss requested a review from a team as a code owner on June 14, 2026, 10:15
@franciscojavierarceo

Member

@codex review

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3a4edcdb4b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

return spark_session


class DatabricksUCOfflineStore(SparkOfflineStore):

P2: Initialize Databricks sessions during source validation

When users run feast apply or feast plan with source validation enabled, this subclass still inherits OfflineStore.validate_data_source and get_table_column_names_and_types_from_data_source. Validation therefore goes through SparkSource.validate() and starts a plain Spark session via get_spark_session_or_start_new_with_repoconfig instead of using this Databricks helper. In a local Databricks Connect setup with no active Databricks session, UC tables are validated against a local, non-UC Spark session and fail before any of the wrapped retrieval methods run. Override these validation and schema methods so that they initialize the Databricks session too.

…tion methods

Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>

2 participants