title: SQL Server Big Data Clusters runtime for Apache Spark Guide
titleSuffix: SQL Server Big Data Clusters
description: SQL Server Big Data Clusters runtime for Apache Spark Guide
author: DaniBunny
ms.author: dacoelho
ms.reviewer: wiassaf
ms.date: 12/14/2021
ms.topic: conceptual
ms.prod: sql
ms.technology: big-data-cluster

SQL Server Big Data Clusters runtime for Apache Spark Guide

[!INCLUDESQL Server 2019]

Introducing the SQL Server Big Data Clusters runtime for Apache Spark

The SQL Server Big Data Clusters runtime for Apache Spark is a standardized specification for Apache Spark that enables seamless interoperability between distributions. This Spark runtime is a consistent, versioned block of programming language distributions, engine optimizations, core libraries, and packages.

Every product that uses this runtime specification contains the same versions of Apache Spark Core, PySpark, Scala Spark, SparkR, sparklyr, and .NET for Spark.

The packages and libraries distributed with the runtime are also the same. One of the primary goals of the specification is to give Data Engineers and Data Scientists a first-class experience by providing a continuously curated and updated list of packages and connectors out of the box.

Benefits of the SQL Server Big Data Clusters runtime for Apache Spark:

  1. Spark engine optimizations and features available on all products and services
  2. Established release cadence
  3. Seamless interoperability between Spark products and services
  4. Curated packages for Data Engineers and Data Scientists
  5. Consistent package management story

Release cadence and naming standards

The SQL Server Big Data Clusters runtime for Apache Spark specification defines the following runtime naming standard:

"PRODUCT_NAME.SPARK_MAJOR_VERSION.CALENDAR_YEAR.RELEASE#"

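For example, in "BDC.3.2021.1" the product name is BDC, the Spark major version is 3, the calendar year is 2021, and the release number is 1.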

RELEASE# is a sequential semantic number. It is not bound to months or any other standard. Once a runtime release is created, it is immutable. Each release of SQL Server Big Data Clusters ships with one version of the runtime.
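
The naming standard above can be checked mechanically. The following Python sketch splits a runtime name into its four components. It is an illustration only, not part of the product: the names `RuntimeName` and `parse_runtime_name` are hypothetical, and the pattern assumes that the product name is alphanumeric, the calendar year has four digits, and RELEASE# is a plain non-negative integer. The frozen dataclass mirrors the statement that a runtime release is immutable once created.

```python
import re
from dataclasses import dataclass

# Assumed shape of PRODUCT_NAME.SPARK_MAJOR_VERSION.CALENDAR_YEAR.RELEASE#.
# The character classes are assumptions; the specification only gives the
# field order and the example "BDC.3.2021.1".
_RUNTIME_NAME = re.compile(
    r"(?P<product>[A-Za-z][A-Za-z0-9_-]*)\."
    r"(?P<spark_major>\d+)\."
    r"(?P<year>\d{4})\."
    r"(?P<release>\d+)"
)


@dataclass(frozen=True)  # frozen: a parsed name cannot be modified afterwards
class RuntimeName:
    product: str
    spark_major: int
    year: int
    release: int

    def __str__(self) -> str:
        return f"{self.product}.{self.spark_major}.{self.year}.{self.release}"


def parse_runtime_name(name: str) -> RuntimeName:
    """Parse a runtime name such as 'BDC.3.2021.1' into its components."""
    match = _RUNTIME_NAME.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a valid runtime name: {name!r}")
    return RuntimeName(
        product=match["product"],
        spark_major=int(match["spark_major"]),
        year=int(match["year"]),
        release=int(match["release"]),
    )


if __name__ == "__main__":
    runtime = parse_runtime_name("BDC.3.2021.1")
    print(runtime.product, runtime.spark_major, runtime.year, runtime.release)
```

Because RELEASE# is not bound to months, the sketch treats it as an opaque counter and does not try to derive a date from it.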

What's in the current runtime release?

The SQL Server Big Data Clusters platform release notes have the runtime name and complete contents of the release.

Next steps

For more information, see [Introducing [!INCLUDEbig-data-clusters-nover]](big-data-cluster-overview.md).