Skip to content

Commit cf19f41

Browse files
committed
pushing latest changes
1 parent cbcbf18 commit cf19f41

9 files changed

Lines changed: 143 additions & 96 deletions
134 KB
Loading
133 KB
Loading

docs/relational-databases/polybase/polybase-configure-mongodb.md

Lines changed: 61 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -25,19 +25,17 @@ If you haven't installed PolyBase, see [PolyBase installation](polybase-installa
2525

2626
## Configure an External Table
2727

28-
To query the data from an MongoDB data source, you must create external tables to reference the external data. This section provides sample code to create these external tables.
29-
28+
To query the data from an MongoDB data source, you must create external tables to reference the external data. This section provides sample code to create these external tables.
29+
3030
We recommend creating statistics on external table columns, especially the ones used for joins, filters and aggregates, for optimal query performance.
3131

3232
These objects will create in this section:
3333

34-
- CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)
35-
- CREATE EXTERNAL DATA SOURCE (Transact-SQL)
36-
- CREATE EXTERNAL FILE FORMAT (Transact-SQL)
37-
- CREATE EXTERNAL TABLE (Transact-SQL)
34+
- CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)
35+
- CREATE EXTERNAL DATA SOURCE (Transact-SQL)
36+
- CREATE EXTERNAL TABLE (Transact-SQL)
3837
- CREATE STATISTICS (Transact-SQL)
3938

40-
4139
1. Create a master key on the database. This is required to encrypt the credential secret.
4240

4341
```sql
@@ -58,16 +56,16 @@ These objects will create in this section:
5856
1. Create an external data source with [CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md). Specify external data source location and credentials for the MongoDB data source.
5957

6058
```sql
61-
/* LOCATION: Server DNS name or IP address.
62-
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
63-
* CREDENTIAL: the database scoped credential, created above.
64-
*/
65-
CREATE EXTERNAL DATA SOURCE MongoDBInstance
66-
WITH (
67-
LOCATION = '<vendor>://<server>[:<port>]',
68-
-- PUSHDOWN = ON | OFF,
69-
, CREDENTIAL = MongoDBCredentials
70-
);
59+
/* LOCATION: Location string should be of format '<vendor>://<server>[:<port>]'.
60+
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
61+
* CREDENTIAL: the database scoped credential, created above.
62+
*/
63+
CREATE EXTERNAL DATA SOURCE MongoInstance
64+
WITH (
65+
LOCATION = mongodb://MongoServer,
66+
-- PUSHDOWN = ON | OFF,
67+
CREDENTIAL = TeradataCredentials
68+
);
7169
```
7270

7371
1. Create schemas for external data
@@ -78,7 +76,7 @@ These objects will create in this section:
7876
```
7977

8078
1. Create external tables that represents data stored in external MongoDB system [CREATE EXTERNAL TABLE](../../t-sql/statements/create-external-table-transact-sql.md).
81-
79+
8280
```sql
8381
/* LOCATION: MongoDB table/view in '<database_name>.<schema_name>.<object_name>' format
8482
* DATA_SOURCE: the external data source, created above.
@@ -103,9 +101,52 @@ These objects will create in this section:
103101
CREATE STATISTICS OrdersOrderKeyStatistics ON MongoDB.orders(O_ORDERKEY) WITH FULLSCAN;
104102
```
105103

106-
##Flattening
104+
## Flattening
105+
flattening is enable for nested and repeated data from MongoDB document collections. User is required to enable create an external table and explicitly specify a relational schema over MongoDB document collections that may have nested and/or repeated data. We will enable auto-schema detection over mongo document collections in future milestones.
106+
JSON nested/repeated data types will be flattened as follows
107+
108+
* Object: unordered key/value collection enclosed in curly braces (nested)
109+
110+
- We will create a table column for each object key
111+
112+
* Column Name: objectname_keyname
113+
114+
* Array: ordered values, separated by commas, enclosed in square brackets (repeated)
115+
116+
- We will add a new table row for each array item
117+
118+
- We will create a column per array to store the array item index
119+
120+
* Column Name: arrayname_index
121+
122+
* Data Type: bigint
123+
124+
There are several potential issues with this technique, two of them being:
125+
126+
* an empty repeated field will effectively mask the data contained in the flat fields of the same record
127+
128+
* the presence of multiple repeated fields can result in an explosion of the number of produced rows
129+
130+
As an example, we evaluate the MongoDB sample dataset restaurant collection stored in non-relational JSON format. Each restaurant has a nested address field and an array of grades it was assigned on different days. The figure below illustrates a typical restaurant with nested address and nested-repeated grades.
131+
132+
![MongoDB flattening](../../relational-databases/polybase/media/mongo-flattening.png "MongoDB restaurant flattening")
133+
134+
Object address will be flattened as below:
135+
136+
* Nested field restaurant.address.building becomes restaurant.address_building
137+
* Nested field restaurant.address.coord becomes restaurant.address_coord
138+
* Nested field restaurant.address.street becomes restaurant.address_street
139+
* Nested field restaurant.address.zipcode becomes restaurant.address_zipcode
140+
141+
Array grades will be flattened as below:
142+
| grades_date | grades_grade | games_score |
143+
| ------------- | ------------------------- | -------------- |
144+
|1393804800000 |A |2|
145+
|1378857600000|A |6|
146+
|135898560000 |A |10|
147+
|1322006400000|A |9|
148+
|1299715200000 |B |14|
107149

108-
109150
## Next steps
110151

111152
To learn more about PolyBase, see [Overview of SQL Server PolyBase](polybase-guide.md).

docs/relational-databases/polybase/polybase-configure-oracle.md

Lines changed: 11 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,6 @@ These objects will create in this section:
3333

3434
- CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)
3535
- CREATE EXTERNAL DATA SOURCE (Transact-SQL)
36-
- CREATE EXTERNAL FILE FORMAT (Transact-SQL)
3736
- CREATE EXTERNAL TABLE (Transact-SQL)
3837
- CREATE STATISTICS (Transact-SQL)
3938

@@ -58,16 +57,17 @@ These objects will create in this section:
5857
1. Create an external data source with [CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md). Specify external data source location and credentials for the Oracle data source.
5958

6059
```sql
61-
/* LOCATION: Server DNS name or IP address.
62-
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
63-
* CREDENTIAL: the database scoped credential, created above.
64-
*/
65-
CREATE EXTERNAL DATA SOURCE OracleInstance
66-
WITH (
67-
LOCATION = '<vendor>://<server>[:<port>]',
68-
-- PUSHDOWN = ON | OFF,
69-
CREDENTIAL = OracleCredentials
70-
);
60+
/* LOCATION: Location string should be of format '<vendor>://<server>[:<port>]'.
61+
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
62+
* CREDENTIAL: the database scoped credential, created above.
63+
*/
64+
CREATE EXTERNAL DATA SOURCE OracleInstance
65+
WITH (
66+
LOCATION = oracle://OracleServer,
67+
-- PUSHDOWN = ON | OFF,
68+
CREDENTIAL = TeradataCredentials
69+
);
70+
7171
```
7272

7373
1. Create schemas for external data

docs/relational-databases/polybase/polybase-configure-sql-server.md

Lines changed: 11 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,6 @@ These objects will create in this section:
3333

3434
- CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)
3535
- CREATE EXTERNAL DATA SOURCE (Transact-SQL)
36-
- CREATE EXTERNAL FILE FORMAT (Transact-SQL)
3736
- CREATE EXTERNAL TABLE (Transact-SQL)
3837
- CREATE STATISTICS (Transact-SQL)
3938
.
@@ -58,16 +57,17 @@ These objects will create in this section:
5857
1. Create an external data source with [CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md).Specify external data source location and credentials for SQL Server.
5958

6059
```sql
61-
/* LOCATION: Server DNS name or IP address.
62-
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
63-
* CREDENTIAL: the database scoped credential, created above.
64-
*/
65-
CREATE EXTERNAL DATA SOURCE SqlServerInstance
66-
WITH (
67-
LOCATION = '<vendor>://<server>[:<port>]',
68-
-- PUSHDOWN = ON | OFF,
69-
CREDENTIAL = SqlServerCredentials
70-
);
60+
/* LOCATION: Location string should be of format '<vendor>://<server>[:<port>]'.
61+
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
62+
* CREDENTIAL: the database scoped credential, created above.
63+
*/
64+
CREATE EXTERNAL DATA SOURCE SQLServerInstance
65+
WITH (
66+
LOCATION = sqlserver://SqlServer,
67+
-- PUSHDOWN = ON | OFF,
68+
CREDENTIAL = TeradataCredentials
69+
);
70+
7171
```
7272

7373
1. Create schemas for external data

docs/relational-databases/polybase/polybase-configure-teradata.md

Lines changed: 11 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,6 @@ These objects will create in this section:
3535

3636
- CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)
3737
- CREATE EXTERNAL DATA SOURCE (Transact-SQL)
38-
- CREATE EXTERNAL FILE FORMAT (Transact-SQL)
3938
- CREATE EXTERNAL TABLE (Transact-SQL)
4039
- CREATE STATISTICS (Transact-SQL)
4140

@@ -60,16 +59,17 @@ These objects will create in this section:
6059
1. Create an external data source with [CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md).Specify external data source location and credentials for Teradata.
6160

6261
```sql
63-
/* LOCATION: Server DNS name or IP address.
64-
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
65-
* CREDENTIAL: the database scoped credential, created above.
66-
*/
67-
CREATE EXTERNAL DATA SOURCE TeradataInstance
68-
WITH (
69-
LOCATION = '<vendor>://<server>[:<port>]',
70-
-- PUSHDOWN = ON | OFF,
71-
CREDENTIAL = TeradataCredentials
72-
);
62+
/* LOCATION: Location string should be of format '<vendor>://<server>[:<port>]'.
63+
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
64+
* CREDENTIAL: the database scoped credential, created above.
65+
*/
66+
CREATE EXTERNAL DATA SOURCE TeradataInstance
67+
WITH (
68+
LOCATION = teradata://TeradataServer,
69+
-- PUSHDOWN = ON | OFF,
70+
CREDENTIAL = TeradataCredentials
71+
);
72+
7373
```
7474

7575
1. Create schemas for external data

docs/relational-databases/polybase/polybase-installation.md

Lines changed: 11 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -64,6 +64,8 @@ Once you have installed PolyBase as either stand alone or in a scale-out group,
6464
2. Click **Installation**, then click **New Standalone SQL Server installation or add features**.
6565

6666
3. On the feature selection page, select **PolyBase Query Service for External Data**.
67+
68+
![PolyBase services](../../relational-databases/polybase/media/install-wizard.png "PolyBase services")
6769

6870
4. On the Server Configuration Page, configure the **SQL Server PolyBase Engine Service** and SQL Server PolyBase Data Movement Service to run under the same account.
6971

@@ -85,7 +87,9 @@ Once you have installed PolyBase as either stand alone or in a scale-out group,
8587
<!--SQL Server 2019-->
8688
::: moniker range=">= sql-server-ver15"
8789

88-
After installation, you must [enable the PolyBase feature](#enable).
90+
> **IMPORTANT!**
91+
>
92+
> After installation, you must [enable the PolyBase feature](#enable).
8993
9094
::: moniker-end
9195

@@ -152,11 +156,15 @@ Setup.exe /Q /ACTION=INSTALL /IACCEPTSQLSERVERLICENSETERMS /FEATURES=SQLEngine,P
152156
::: moniker range=">= sql-server-ver15"
153157
## <a id="enable"></a> Enable PolyBase
154158

155-
Starting with SQL Server 2019 CTP 2.0, you must enable PolyBase after installation using the following Transact-SQL command:
159+
Once you are done with the installation, Polybase must be enabled to access it's features. connect to SQL Server 2019 CTP 2.0, you must enable PolyBase after installation using the following Transact-SQL command:
160+
156161

157162
```sql
158-
sp_configure @configname = 'polybase enabled', @configvalue = 1;
163+
exec sp_configure @configname = 'polybase enabled', @configvalue = 1;
164+
RECONFIGURE;
159165
```
166+
The instance then needs to be **restarted**
167+
160168

161169
::: moniker-end
162170

0 commit comments

Comments
 (0)