Commit 33b3dea

Acrolinx and other edits
1 parent 965824e commit 33b3dea

2 files changed

Lines changed: 6 additions & 6 deletions

docs/machine-learning/r/r-and-data-optimization-r-services.md

Lines changed: 5 additions & 5 deletions
@@ -14,7 +14,7 @@ monikerRange: ">=sql-server-2016||>=sql-server-linux-ver15||=sqlallproducts-allv
 # Performance tuning and data optimization for R
 [!INCLUDE [SQL Server 2016 and later](../../includes/applies-to-version/sqlserver2016.md)]
 
-This article discusses performance optimizations for R or Python scripts that run in SQL Server. It also describes methods that you can use to update your R code, both to boost performance and to avoid known issues.
+This article discusses performance optimizations for R or Python scripts that run in SQL Server. You can use these methods to update your R code, both to boost performance and to avoid known issues.
 
 ## Choosing a compute context
 
@@ -76,13 +76,13 @@ There are two ways to achieve parallelization with R in SQL Server:
 
 + **Use \@parallel.** When using the `sp_execute_external_script` stored procedure to run an R script, set the `@parallel` parameter to `1`. This is the best method if your R script does **not** use RevoScaleR functions, which have other mechanisms for processing. If your script uses RevoScaleR functions (generally prefixed with "rx"), parallel processing is performed automatically and you do not need to explicitly set `@parallel` to `1`.
 
-If the R script can be parallelized, and if the SQL query can be parallelized, then the database engine creates multiple parallel processes. The maximum number of processes that can be created is equal to the **max degree of parallelism** (MAXDOP) setting for the instance. All processes then run the same script, but receive only a portion of the data.
+If the R script can be parallelized, and if the SQL query can be parallelized, then the database engine creates multiple parallel processes. The maximum number of processes that can be created is equal to the **maximum degree of parallelism** (MAXDOP) setting for the instance. All processes then run the same script, but receive only a portion of the data.
 
 Thus, this method is not useful with scripts that must see all the data, such as when training a model. However, it is useful when performing tasks such as batch prediction in parallel. For more information on using parallelism with `sp_execute_external_script`, see the **Advanced tips: parallel processing** section of [Using R Code in Transact-SQL](../tutorials/quickstart-r-create-script.md).
 
 + **Use numTasks =1.** When using **rx** functions in a SQL Server compute context, set the value of the _numTasks_ parameter to the number of processes that you would like to create. The number of processes created can never be more than **MAXDOP**; however, the actual number of processes created is determined by the database engine and may be less than you requested.
 
-If the R script can be parallelized, and if the SQL query can be parallelized, then SQL Server creates multiple parallel processes when running the rx functions. The actual number of processes that are created depends on a variety of factors such as resource governance, current usage of resources, other sessions, and the query execution plan for the query used with the R script.
+If the R script can be parallelized, and if the SQL query can be parallelized, then SQL Server creates multiple parallel processes when running the rx functions. The actual number of processes that are created depends on a variety of factors. These include resource governance, current usage of resources, other sessions, and the query execution plan for the query used with the R script.
 
 ## Query parallelization
 
@@ -144,7 +144,7 @@ Many RevoScaleR algorithms support parameters to control how the trained model i
 
 When `cube` is set to `TRUE`, the algorithm uses a partitioned inverse, which might be faster and use less memory. If the formula has a large number of variables, the performance gain can be significant.
 
-For additional guidance on optimization of RevoScaleR, see these articles:
+For more information on optimization of RevoScaleR, see these articles:
 
 + Support article: [Performance tuning options for rxDForest and rxDTree](https://support.microsoft.com/kb/3104235)
 
@@ -168,6 +168,6 @@ We also recommend that you look into the new **MicrosoftML** package, which prov
 
 ## Next steps
 
-+ For resources you can use to improve the performance of your R code, see [Use R code profiling functions to improve performance](using-r-code-profiling-functions.md).
++ For R functions you can use to improve the performance of your R code, see [Use R code profiling functions to improve performance](using-r-code-profiling-functions.md).
 
 + For more complete information about performance tuning on SQL Server, see [Performance Center for SQL Server Database Engine and Azure SQL Database](/sql/relational-databases/performance/performance-center-for-sql-server-database-engine-and-azure-sql-database).
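
Note on the `@@ -76,13 +76,13 @@` hunk: the paragraph it touches says that the number of parallel processes is capped by MAXDOP and that every process runs the same script on only a portion of the data. The sketch below is a toy model of that statement in plain Python, not SQL Server code. The function name `split_for_parallel_run` and the round-robin split are assumptions made for illustration; the text does not say how the database engine actually partitions rows, and it notes that the engine may create fewer processes than the cap allows.

```python
def split_for_parallel_run(rows, requested_processes, maxdop):
    """Toy model only: hypothetical helper, not a SQL Server or RevoScaleR API."""
    if requested_processes < 1 or maxdop < 1:
        raise ValueError("process counts must be at least 1")
    # Upper bound from the text: never more processes than MAXDOP.
    # The real engine may pick fewer, depending on load and the query plan.
    process_count = min(requested_processes, maxdop)
    # Assumption: a round-robin split, so each process sees only part of the data.
    return [rows[i::process_count] for i in range(process_count)]


# Ask for 8 processes on an instance where MAXDOP is 4.
portions = split_for_parallel_run(list(range(10)), requested_processes=8, maxdop=4)
for worker, portion in enumerate(portions):
    print(f"process {worker}: {portion}")
```

Because no single process sees every row, the model also shows why the paragraph rules this approach out for training a model while recommending it for batch prediction.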

docs/machine-learning/r/using-r-code-profiling-functions.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ monikerRange: ">=sql-server-2016||>=sql-server-linux-ver15||=sqlallproducts-allv
 # Use R code profiling functions to improve performance
 [!INCLUDE [SQL Server 2016 and later](../../includes/applies-to-version/sqlserver2016.md)]
 
-This article describes performance tools provided by R packages to get information about internal function calls.
+This article describes performance tools provided by R packages to get information about internal function calls. You can use this information to improve the performance of your code.
 
 > [!TIP]
 > This article provides basic resources to get you started. For expert guidance, we recommend the *Performance* section in ["Advanced R" by Hadley Wickham](http://adv-r.had.co.nz).
