Commit 32ed9f7

Merge pull request MicrosoftDocs#2 from MicrosoftDocs/master
Resynch master
2 parents c3b4c67 + 067cbe3 commit 32ed9f7

72 files changed

Lines changed: 1844 additions & 2015 deletions

File tree

.openpublishing.redirection.json

Lines changed: 40 additions & 0 deletions
@@ -1285,10 +1285,50 @@
 "redirect_url": "/sql/ado/reference/adox-api/adox-object-model",
 "redirect_document_id": true
 },
+{
+"source_path": "docs/database-engine/configure-windows/what-s-new-in-sql-server-vnext-database-engine.md",
+"redirect_url": "/sql/database-engine/configure-windows/what-s-new-in-sql-server-2017-database-engine",
+"redirect_document_id": true
+},
+{
+"source_path": "docs/sql-server/sql-server-vnext-release-notes.md",
+"redirect_url": "/sql/sql-server/sql-server-2017-release-notes",
+"redirect_document_id": true
+},
+{
+"source_path": "docs/sql-server/what-s-new-in-sql-server-vnext.md",
+"redirect_url": "/sql/sql-server/what-s-new-in-sql-server-2017",
+"redirect_document_id": true
+},
+{
+"source_path": "docs/integration-services/what-s-new-in-integration-services-in-sql-server-vnext.md",
+"redirect_url": "/sql/integration-services/what-s-new-in-integration-services-in-sql-server-2017",
+"redirect_document_id": true
+},
+{
+"source_path": "docs/analysis-services/what-s-new-in-sql-server-analysis-services-vnext.md",
+"redirect_url": "/sql/analysis-services/what-s-new-in-sql-server-analysis-services-2017",
+"redirect_document_id": true
+},
 {
 "source_path": "docs/whitepapers/introducing-microsoft-technologies-for-data-storage-movement-and-transformation.md",
 "redirect_url": "sql/whitepapers/microsoft-white-papers",
 "redirect_document_id": true
+},
+{
+"source_path": "docs/analysis-services/troubleshooting-analysis-services.md",
+"redirect_url": "/sql/analysis-services/analysis-services",
+"redirect_document_id": true
+},
+{
+"source_path": "docs/analysis-services/troubleshoot-process-data-ssas-tabular.md",
+"redirect_url": "/sql/analysis-services/analysis-services",
+"redirect_document_id": true
+},
+{
+"source_path": "docs/analysis-services/troubleshoot-a-power-pivot-for-sharepoint-installation.md",
+"redirect_url": "/sql/analysis-services/analysis-services",
+"redirect_document_id": true
 }
 ]
 }

docs/ado/TOC.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
-# [ADO Development Guide +](./guide/index.md)
-# [ADO Reference +](./reference/index.md)
+# [ADO Development Guide +](./guide/ado-programmer-s-guide.md)
+# [ADO Reference +](./reference/ado-glossary.md)

 # [Microsoft ActiveX Data Objects (ADO)](microsoft-activex-data-objects-ado.md)
 # [ADO Glossary](ado-glossary.md)
Lines changed: 86 additions & 90 deletions
@@ -1,7 +1,7 @@
 ---
-title: "Perform Chunking Analysis using rxDataStep (Data Science Deep Dive) | Microsoft Docs"
+title: "Perform Chunking Analysis using rxDataStep| Microsoft Docs"
 ms.custom: ""
-ms.date: "10/03/2016"
+ms.date: "05/03/2017"
 ms.prod: "sql-server-2016"
 ms.reviewer: ""
 ms.suite: ""
@@ -19,115 +19,111 @@ author: "jeannt"
 ms.author: "jeannt"
 manager: "jhubbard"
 ---
-# Lesson 3-3 - Perform Chunking Analysis using rxDataStep
-The *rxDataStep* function can be used to process data in chunks, rather than requiring that the entire dataset be loaded into memory and processed at one time, as in traditional R. The way it works is that you read the data in chunks and use R functions to process each chunk of data in turn, and then write the summary results for each chunk to a common [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] data source.
-
-In this lesson, you'll practice this technique by using the *table* function in R to compute a contingency table.
-
-> [!TIP]
-> This example is meant for instructional purposes only. If you need to tabulate real-world data sets, we recommend that you use the *rxCrossTabs* or *rxCube* functions built into **RevoScaleR**, which are optimized for this sort of operation.
-
-## Partition Data by Values
-
-1. First, create a custom R function named *ProcessChunk* that calls the *table* function on each chunk of data.
-
-```R
-ProcessChunk <- function( dataList) {
-# Convert the input list to a data frame and compute contingency table
-chunkTable <- table(as.data.frame(dataList))
-
-# Convert table output to a data frame with a single row
-varNames <- names(chunkTable)
-varValues <- as.vector(chunkTable)
-dim(varValues) <- c(1, length(varNames))
-chunkDF <- as.data.frame(varValues)
-names(chunkDF) <- varNames
-
-# Return the data frame, which has a single row
-return( chunkDF )
-}
-```
-
-
-2. Set the compute context to the server.
+# Perform Chunking Analysis using rxDataStep
+
+The **rxDataStep** function can be used to process data in chunks, rather than requiring that the entire dataset be loaded into memory and processed at one time, as in traditional R. The way it works is that you read the data in chunks and use R functions to process each chunk of data in turn, and then write the summary results for each chunk to a common [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] data source.
+
+In this lesson, you'll practice this technique by using the `table` function in R, to compute a contingency table.
+
+> [!TIP]
+> This example is meant for instructional purposes only. If you need to tabulate real-world data sets, we recommend that you use the **rxCrossTabs** or **rxCube** functions in **RevoScaleR**, which are optimized for this sort of operation.
+
+## Partition Data by Values
+
+1. First, create a custom R function that calls the *table* function on each chunk of data, and name it `ProcessChunk`.
+
+```R
+ProcessChunk <- function( dataList) {
+# Convert the input list to a data frame and compute contingency table
+chunkTable <- table(as.data.frame(dataList))
+
+# Convert table output to a data frame with a single row
+varNames <- names(chunkTable)
+varValues <- as.vector(chunkTable)
+dim(varValues) <- c(1, length(varNames))
+chunkDF <- as.data.frame(varValues)
+names(chunkDF) <- varNames
+
+# Return the data frame, which has a single row
+return( chunkDF )
+}
+```
+
+2. Set the compute context to the server.

-```R
-rxSetComputeContext( sqlCompute )
-```
+```R
+rxSetComputeContext( sqlCompute )
+```

-3. You'll define a SQL Server data source to hold the data you're processing. Start by assigning a SQL query to a variable.
+3. You'll define a SQL Server data source to hold the data you're processing. Start by assigning a SQL query to a variable.

-```R
-dayQuery <- "SELECT DayOfWeek FROM AirDemoSmallTest"
-```
+```R
+dayQuery <- "SELECT DayOfWeek FROM AirDemoSmallTest"
+```

-4. Plug that variable into the *sqlQuery* argument of a new [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] data source.
-
-```R
-inDataSource <- RxSqlServerData(sqlQuery = dayQuery,
-connectionString = sqlConnString,
-rowsPerRead = 50000,
-colInfo = list(DayOfWeek = list(type = "factor",
-levels = as.character(1:7))))
-```
+4. Plug that variable into the *sqlQuery* argument of a new [!INCLUDE[ssNoVersion](../../includes/ssnoversion-md.md)] data source.
+
+```R
+inDataSource <- RxSqlServerData(sqlQuery = dayQuery,
+connectionString = sqlConnString,
+rowsPerRead = 50000,
+colInfo = list(DayOfWeek = list(type = "factor",
+levels = as.character(1:7))))
+```
 If you ran *rxGetVarInfo* on this data source, you'd see that it contains just the single column: *Var 1: DayOfWeek, Type: factor, no factor levels available*

-5. Before applying this factor variable to the source data, create a separate table to hold the intermediate results. Again, you just use the *RxSqlServerData* function to define the data, and delete any existing tables of the same name.
-
-```R
-iroDataSource = RxSqlServerData(table = "iroResults", connectionString = sqlConnString)
-# Check whether the table already exists.
-if (rxSqlServerTableExists(table = "iroResults", connectionString = sqlConnString)) { rxSqlServerDropTable( table = "iroResults", connectionString = sqlConnString) }
-```
+5. Before applying this factor variable to the source data, create a separate table to hold the intermediate results. Again, you just use the RxSqlServerData function to define the data, and delete any existing tables of the same name.

-7. Now you'll call the custom function *ProcessChunk* function to transform the data as it is read, by using it as the *transformFunc* argument to the *rxDataStep* function.
+```R
+iroDataSource = RxSqlServerData(table = "iroResults", connectionString = sqlConnString)
+# Check whether the table already exists.
+if (rxSqlServerTableExists(table = "iroResults", connectionString = sqlConnString)) { rxSqlServerDropTable( table = "iroResults", connectionString = sqlConnString) }
+```

-```R
-rxDataStep( inData = inDataSource, outFile = iroDataSource, transformFunc = ProcessChunk, overwrite = TRUE)
-```
+7. Now you'll call the custom function `ProcessChunk` to transform the data as it is read, by using it as the *transformFunc* argument to the rxDataStep function.

-8. To view intermediate results of *ProcessChunk*, assign the results of *rxImport* to a variable, and then output the results to the console.
+```R
+rxDataStep( inData = inDataSource, outFile = iroDataSource, transformFunc = ProcessChunk, overwrite = TRUE)
+```

-```R
-iroResults <- rxImport(iroDataSource)
+8. To view intermediate results of `ProcessChunk`, assign the results of rxImport to a variable, and then output the results to the console.

-iroResults
-```
+```R
+iroResults <- rxImport(iroDataSource)
+iroResults
+```

 **Partial results**

 | | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
 | --- | --- | --- | --- | --- | --- | --- | --- |
 | 1 | 8228 | 8924 | 6916 | 6932 | 6944 | 5602 | 6454 |
 | 2 | 8321 | 5351 | 7329 | 7411 | 7409 | 6487 | 7692 |
-
-9. To compute the final results across all chunks, sum the columns, and display the results in the console.
-
-```R
-finalResults <- colSums(iroResults)
-
-finalResults
-```
+
+9. To compute the final results across all chunks, sum the columns, and display the results in the console.
+
+```R
+finalResults <- colSums(iroResults)
+finalResults
+```
+
 **Results**
 1 | 2 | 3 | 4 | 5 | 6 | 7
 --- | --- | --- | --- | --- | --- | ---
 97975 | 77725 | 78875 | 81304 | 82987 | 86159 | 94975
+
+10. To remove the intermediate results table, make another call to rxSqlServerDropTable.

-10. To remove the intermediate results table, make another call to *rxSqlServerDropTable*.
-
-```R
-rxSqlServerDropTable( table = "iroResults", connectionString = sqlConnString)
-```
-
-## Next Step
-[Lesson 4: Analyze Data in Local Compute Context &#40;Data Science Deep Dive&#41;](../../advanced-analytics/r-services/lesson-4-analyze-data-in-local-compute-context-data-science-deep-dive.md)
-
-## Previous Step
-[Create New SQL Server Table using rxDataStep &#40;Data Science Deep Dive&#41;](../../advanced-analytics/r-services/lesson-3-2-create-new-sql-server-table-using-rxdatastep.md)
-
-## See Also
-[Data Science Deep Dive: Using the RevoScaleR Packages](../../advanced-analytics/r-services/data-science-deep-dive-using-the-revoscaler-packages.md)
-
-
-
+```R
+rxSqlServerDropTable( table = "iroResults", connectionString = sqlConnString)
+```
+
+## Next Step
+
+[Analyze Data in Local Compute Context;](../../advanced-analytics/tutorials/deepdive-analyze-data-in-local-compute-context.md)
+
+## Previous Step
+
+[Create New SQL Server Table using rxDataStep](../../advanced-analytics/tutorials/deepdive-create-new-sql-server-table-using-rxdatastep.md)
+

docs/analysis-services/TOC.md

Lines changed: 1 addition & 4 deletions
@@ -119,7 +119,4 @@

 # Reference
 ## [PowerShell](powershell/index.md)
-## [Server properties](server-properties/index.md)
-## [Troubleshooting Analysis Services](troubleshooting-analysis-services.md)
-### [Troubleshoot process data](troubleshoot-process-data-ssas-tabular.md)
-### [Troubleshoot a Power Pivot for SharePoint installation](troubleshoot-a-power-pivot-for-sharepoint-installation.md)
+## [Server properties](server-properties/index.md)
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+---
+redirect_url: /sql/analysis-services/multidimensional-models/mdx/multidimensional-model-data-access-analysis-services-multidimensional-data
+---

docs/analysis-services/troubleshoot-a-power-pivot-for-sharepoint-installation.md

Lines changed: 0 additions & 42 deletions
This file was deleted.
