Commit cdf800f

authored
Merge pull request #9314 from rothja/installoverview2
New deployment get started for big data clusters
2 parents 4802fd7 + 6951a9f commit cdf800f

3 files changed

Lines changed: 100 additions & 21 deletions

Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
---
2+
title: Get started
3+
titleSuffix: SQL Server 2019 big data clusters
4+
description: Learn the steps and resources for deploying SQL Server 2019 big data clusters (preview).
5+
author: rothja
6+
ms.author: jroth
7+
manager: craigg
8+
ms.date: 03/18/2019
9+
ms.topic: conceptual
10+
ms.prod: sql
11+
ms.technology: big-data-cluster
12+
---
13+
14+
# Get started with SQL Server 2019 big data clusters
15+
16+
This article provides an overview of how to deploy a [SQL Server 2019 big data cluster (preview)](big-data-cluster-overview.md). It introduces the concepts and provides a framework for understanding the other deployment articles in this section. Your specific deployment steps vary based on the platforms you choose for the client and server.
17+
18+
## <a id="tools"></a> Client tools
19+
20+
Big data clusters require a specific set of client tools. Before you deploy a big data cluster to Kubernetes, you should install the following tools:
21+
22+
| Tool | Description |
23+
|---|---|
24+
| **mssqlctl** | Deploys and manages big data clusters. |
25+
| **kubectl** | Manages the underlying Kubernetes cluster from the command line. |
26+
| **Azure Data Studio** | Graphical interface for using the big data cluster. |
27+
| **SQL Server 2019 extension** | Azure Data Studio extension that enables big data cluster features. |
28+
29+
Other tools are required for different scenarios. Each article explains the prerequisite tools for its specific task. For a full list of tools and installation links, see [Install SQL Server 2019 big data tools](deploy-big-data-tools.md).
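
Before you start a deployment, it can help to confirm that the command-line tools from the table are available on your `PATH`. The following Python snippet is a minimal sketch of such a check. It assumes the executables are named `mssqlctl` and `kubectl`, as in the table, and it does not check Azure Data Studio or its extension because those are graphical tools. The function name `find_missing_tools` is hypothetical and used only for illustration.

```python
import shutil

# Command-line tools named in the client tools table. Azure Data Studio and
# its extension are graphical and are not checked here.
REQUIRED_CLI_TOOLS = ("mssqlctl", "kubectl")


def find_missing_tools(tools=REQUIRED_CLI_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

The `which` parameter exists only so that the lookup can be replaced in a test. In normal use, call `find_missing_tools()` with no arguments.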
30+
31+
## Kubernetes
32+
33+
Big data clusters are deployed as a series of interrelated containers that are managed in [Kubernetes](https://kubernetes.io/docs/home). You can host Kubernetes in a variety of ways. Even if you already have an existing Kubernetes environment, you should review the related requirements for big data clusters.
34+
35+
- **Azure Kubernetes Service (AKS)**: AKS allows you to deploy a managed Kubernetes cluster in Azure. You only manage and maintain the agent nodes. With AKS, you don't have to provision your own hardware for the cluster. You can also use a big data cluster [deployment script](quickstart-big-data-cluster-deploy.md) to create the AKS cluster and deploy the big data cluster in one step. For more information about using AKS with big data clusters, see [Configure Azure Kubernetes Service for SQL Server 2019 big data cluster (preview) deployments](deploy-on-aks.md).
36+
37+
- **Multiple machines**: You can also deploy Kubernetes to multiple Linux machines, which could be physical servers or virtual machines. The [kubeadm](https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/) tool can be used to create the Kubernetes cluster. This method works well if you already have existing infrastructure that you want to use for your big data cluster. For more information about using **kubeadm** deployments with big data clusters, see [Configure Kubernetes on multiple machines for SQL Server 2019 big data cluster (preview) deployments](deploy-with-kubeadm.md).
38+
39+
- **Minikube**: Minikube allows you to run Kubernetes locally on a single server. It is a good option if you are trying out big data clusters or need to use them in a testing or development scenario. For more information about using Minikube, see the [Minikube documentation](https://kubernetes.io/docs/setup/minikube/). For specific requirements for using Minikube with big data clusters, see [Configure minikube for SQL Server 2019 big data cluster deployments](deploy-on-minikube.md).
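
Whichever hosting option you choose, it is worth confirming that **kubectl** can reach the Kubernetes cluster before you deploy a big data cluster to it. The following Python snippet is a rough sketch of one way to do that. It assumes that `kubectl` is on your `PATH` and that its current context points at the target cluster. The helper name `cluster_is_reachable` is hypothetical, and the check only verifies connectivity. It does not validate the big data cluster requirements described in the linked articles.

```python
import subprocess

# `kubectl get nodes` lists the nodes of the cluster in the current context.
NODES_COMMAND = ["kubectl", "get", "nodes"]


def cluster_is_reachable(run=subprocess.run):
    """Return True when kubectl can list nodes in the current context."""
    try:
        result = run(NODES_COMMAND, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # kubectl is not installed, cannot be started, or did not respond.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if cluster_is_reachable():
        print("kubectl can reach the cluster.")
    else:
        print("kubectl could not reach the cluster.")
```

The `run` parameter is there so that the command runner can be replaced in a test without calling a real cluster.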
40+
41+
## Deployment scripts
42+
43+
Deployment scripts can help deploy both Kubernetes and big data clusters in a single step. They also often provide default values for the required environment variables. For an example of a deployment script for a big data cluster on Azure Kubernetes Service (AKS), see [Deploy a SQL Server 2019 big data cluster with a deployment script (AKS)](quickstart-big-data-cluster-deploy.md).
44+
45+
You can customize any deployment script by creating your own version that configures the big data cluster environment variables differently.
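
The pattern of script defaults that you can override through environment variables can be sketched in Python as follows. This is an illustration only, not the logic of the published deployment script. The variable names `EXAMPLE_CLUSTER_NAME` and `EXAMPLE_NODE_COUNT` are hypothetical placeholders, not real big data cluster settings. See the deployment articles for the actual environment variables.

```python
import os

# Hypothetical defaults for illustration; these are not real big data
# cluster environment variables.
EXAMPLE_DEFAULTS = {
    "EXAMPLE_CLUSTER_NAME": "demo-cluster",
    "EXAMPLE_NODE_COUNT": "3",
}


def resolve_settings(defaults, environ=None):
    """Overlay environment values on script defaults; the environment wins."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default) for name, default in defaults.items()}


if __name__ == "__main__":
    for name, value in resolve_settings(EXAMPLE_DEFAULTS).items():
        print(f"{name}={value}")
```

Only names listed in the defaults are read, so unrelated variables in the environment are ignored. A customized script would change the defaults or add new entries.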
46+
47+
## Deploy a big data cluster
48+
49+
To deploy Kubernetes and a big data cluster to AKS with a single script, see the following example:
50+
51+
- [Deploy a SQL Server 2019 big data cluster with a deployment script (AKS)](quickstart-big-data-cluster-deploy.md)
52+
53+
For detailed guidance on deploying big data clusters with AKS, kubeadm, and Minikube, see the following article:
54+
55+
- [How to deploy SQL Server big data clusters on Kubernetes](deployment-guidance.md)
56+
57+
## Next steps
58+
59+
After you successfully deploy a big data cluster, [connect to the cluster](connect-to-big-data-cluster.md) and consider [loading sample data](tutorial-load-sample-data.md) for use with several walkthroughs.

docs/big-data-cluster/tutorial-load-sample-data.md

Lines changed: 15 additions & 1 deletion
@@ -110,4 +110,18 @@ The following steps describe how to use a Linux client to load the sample data i
110110

111111
## Next steps
112112

113-
After the bootstrap script runs, your big data cluster has the sample databases and HDFS data. To start exploring this data and big data clusters, see the [Tutorials](tutorial-query-hdfs-storage-pool.md) in this section.
113+
After the bootstrap script runs, your big data cluster has the sample databases and HDFS data. The following tutorials use the sample data to demonstrate big data cluster capabilities:
114+
115+
Data virtualization:
116+
117+
- [Tutorial: Query HDFS in a SQL Server big data cluster](tutorial-query-hdfs-storage-pool.md)
118+
- [Tutorial: Query Oracle from a SQL Server big data cluster](tutorial-query-oracle.md)
119+
120+
Data ingestion:
121+
122+
- [Tutorial: Ingest data into a SQL Server data pool with Transact-SQL](tutorial-data-pool-ingest-sql.md)
123+
- [Tutorial: Ingest data into a SQL Server data pool with Spark jobs](tutorial-data-pool-ingest-spark.md)
124+
125+
Notebooks:
126+
127+
- [Tutorial: Run a sample notebook on a SQL Server 2019 big data cluster](tutorial-notebook-spark.md)

docs/toc/toc.yml

Lines changed: 26 additions & 20 deletions
@@ -51,6 +51,12 @@
5151
href: ../big-data-cluster/release-notes-big-data-cluster.md
5252
- name: Install
5353
items:
54+
- name: Get Started
55+
items:
56+
- name: Deploy a big data cluster
57+
href: ../big-data-cluster/deploy-get-started.md
58+
- name: Load sample data
59+
href: ../big-data-cluster/tutorial-load-sample-data.md
5460
- name: Install Tools
5561
items:
5662
- name: Install big data tools
@@ -71,20 +77,6 @@
7177
href: ../big-data-cluster/deploy-with-kubeadm.md
7278
- name: Minikube
7379
href: ../big-data-cluster/deploy-on-minikube.md
74-
- name: Tutorials
75-
items:
76-
- name: 1_Load sample data
77-
href: ../big-data-cluster/tutorial-load-sample-data.md
78-
- name: 2_Query HDFS data
79-
href: ../big-data-cluster/tutorial-query-hdfs-storage-pool.md
80-
- name: 3_Query Oracle data
81-
href: ../big-data-cluster/tutorial-query-oracle.md
82-
- name: 4_Load data pool with SQL
83-
href: ../big-data-cluster/tutorial-data-pool-ingest-sql.md
84-
- name: 5_Load data pool with Spark
85-
href: ../big-data-cluster/tutorial-data-pool-ingest-spark.md
86-
- name: 6_Run a Spark notebook
87-
href: ../big-data-cluster/tutorial-notebook-spark.md
8880
- name: Concepts
8981
items:
9082
- name: Controller
@@ -105,16 +97,28 @@
10597
items:
10698
- name: Data virtualization
10799
items:
108-
- name: Virtualize relational data
109-
href: ../relational-databases/polybase/data-virtualization.md
110-
- name: Virtualize CSV data
111-
href: ../relational-databases/polybase/data-virtualization-csv.md
112-
- name: HDFS tiering
113-
href: ../big-data-cluster/hdfs-tiering.md
100+
- name: Relational data
101+
items:
102+
- name: Virtualize relational data
103+
href: ../relational-databases/polybase/data-virtualization.md
104+
- name: Query Oracle data
105+
href: ../big-data-cluster/tutorial-query-oracle.md
106+
- name: HDFS data
107+
items:
108+
- name: Virtualize CSV data
109+
href: ../relational-databases/polybase/data-virtualization-csv.md
110+
- name: Query HDFS data
111+
href: ../big-data-cluster/tutorial-query-hdfs-storage-pool.md
112+
- name: HDFS tiering
113+
href: ../big-data-cluster/hdfs-tiering.md
114114
- name: Data ingestion
115115
items:
116116
- name: Load data with curl
117117
href: ../big-data-cluster/data-ingestion-curl.md
118+
- name: Load data with SQL
119+
href: ../big-data-cluster/tutorial-data-pool-ingest-sql.md
120+
- name: Load data with Spark
121+
href: ../big-data-cluster/tutorial-data-pool-ingest-spark.md
118122
- name: Restore a database
119123
href: ../big-data-cluster/data-ingestion-restore-database.md
120124
- name: Deploy and consume apps
@@ -149,6 +153,8 @@
149153
items:
150154
- name: Notebooks overview
151155
href: ../big-data-cluster/notebooks-guidance.md
156+
- name: Notebook tutorial
157+
href: ../big-data-cluster/tutorial-notebook-spark.md
152158
- name: Manage existing notebooks
153159
href: ../big-data-cluster/notebooks-how-to-manage.md
154160
- name: Spark jobs
