Commit 6051552

added step
1 parent 37e138b commit 6051552

4 files changed

Lines changed: 15 additions & 4 deletions

docs/advanced-analytics/tutorials/sqldev-in-database-r-for-sql-developers.md

Lines changed: 3 additions & 2 deletions
@@ -13,9 +13,10 @@ manager: cgronlun
 # Tutorial: In-Database R analytics for SQL developers
 [!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md-winonly](../../includes/appliesto-ss-xxxx-xxxx-xxx-md-winonly.md)]
 
-In this tutorial for SQL programmers, you gain hands-on experience using the R language to build and deploy a machine learning solution by wrapping R code in stored procedures.
+In this tutorial for SQL programmers, learn about R integration by building and deploying an R-based machine learning solution using a [NYCTaxi_sample](demo-data-nyctaxi-in-sql.md) database on SQL Server.
+
+This tutorial introduces you to R functions used in a data modeling workflow. Steps include data exploration, building and training a binary classification model, and model deployment. You'll use sample data from the New York City Taxi and Limosine Commission, and the model you will build predicts whether a trip is likely to result in a tip based on the time of day, distance travelled, and pick-up location. All of the R code used in this tutorial is wrapped in stored procedures that you create and run in Management Studio.
 
-Using sample data from the New York City Taxi and Limosine Commission, you will build a binary classification model that predicts whether a particular trip is likely to get a tip or not, based on columns such as the time of day, distance, and pick-up location.
 
 > [!NOTE]
 >

docs/advanced-analytics/tutorials/sqldev-py5-train-and-save-a-model-using-t-sql.md

Lines changed: 10 additions & 0 deletions
@@ -48,6 +48,16 @@ You load the modules and call the necessary functions to create and train the mo
 GO
 ```
 
+## Add a name column in nyc_taxi_models
+
+Scripts in this tutorial store a model name as a label for generated models. The model name is used in queries to select a revoscalepy or SciKit model.
+
+1. In Management Studio, open the **nyc_taxi_models** table.
+
+2. Right-click **Columns** and click **New Column**. Set the column name to *name*, with a type **nchar(250)**, and allow nulls.
+
+![Name column for storing model names](media/sqldev-python-newcolumn.png)
+
 ## Build a logistic regression model
 
 After the data has been prepared, you can use it to train a model. You do this by calling a stored procedure that runs some Python code, taking as input the training data table. For this tutorial, you create two models, both binary classification models:
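
Note on the added section: it describes creating the `name` column through the Management Studio UI. As a rough illustration of what that column is for, the sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server, an assumption made only so the example runs anywhere. The single `model` column, the placeholder bytes, and the two model names are hypothetical and are not taken from the tutorial's actual schema.

```python
import sqlite3


def demo_model_lookup():
    """Add a nullable name column, label two stored models, fetch one by name."""
    conn = sqlite3.connect(":memory:")

    # Hypothetical starting point: a table that only holds serialized models.
    conn.execute("CREATE TABLE nyc_taxi_models (model BLOB NOT NULL)")

    # Counterpart of step 2 in the diff: a new, nullable column called name.
    # SQLite accepts the nchar(250) type name and treats the column as text.
    conn.execute("ALTER TABLE nyc_taxi_models ADD COLUMN name nchar(250)")

    # Store two placeholder "models", each labelled with a name.
    conn.executemany(
        "INSERT INTO nyc_taxi_models (model, name) VALUES (?, ?)",
        [(b"scikit-bytes", "SciKit_model"), (b"rx-bytes", "revoscalepy_model")],
    )

    # The name is what later queries filter on to pick one model.
    row = conn.execute(
        "SELECT model FROM nyc_taxi_models WHERE name = ?", ("SciKit_model",)
    ).fetchone()
    conn.close()
    return row[0]


if __name__ == "__main__":
    print(demo_model_lookup())
```

On SQL Server itself, step 2 could presumably also be scripted with an `ALTER TABLE ... ADD name nchar(250) NULL` statement instead of the table designer, although the commit only documents the UI route.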

docs/advanced-analytics/tutorials/sqldev-py6-operationalize-the-model.md

Lines changed: 2 additions & 2 deletions
@@ -59,7 +59,6 @@ import numpy;
 from sklearn import metrics
 
 mod = pickle.loads(lmodel2)
-
 X = InputDataSet[["passenger_count", "trip_distance", "trip_time_in_secs", "direct_distance"]]
 y = numpy.ravel(InputDataSet[["tipped"]])
 
@@ -109,12 +108,13 @@ X = InputDataSet[["passenger_count", "trip_distance", "trip_time_in_secs", "dire
 y = numpy.ravel(InputDataSet[["tipped"]])
 
 probArray = rx_predict(mod, X)
-prob_list = prob_array["tipped_Pred"].values
+probList = probArray["tipped_Pred"].values
 
 probArray = numpy.asarray(probList)
 fpr, tpr, thresholds = metrics.roc_curve(y, probArray)
 aucResult = metrics.auc(fpr, tpr)
 print ("AUC on testing data is: " + str(aucResult))
+
 OutputDataSet = pandas.DataFrame(data = probList, columns = ["predictions"])
 ',
 @input_data_1 = @inquery,
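
Note on the second hunk: the removed line read from `prob_array` and assigned to `prob_list`, while the visible surrounding lines only define `probArray` and read `probList`, so the rename looks like a fix for an undefined-name error. The script then scores the predictions with `metrics.roc_curve` and `metrics.auc`. To show what that AUC number measures without needing scikit-learn, here is a minimal standard-library sketch. It swaps in the pairwise-ranking formulation of AUC rather than the trapezoidal ROC integration used in the diff; the function name and the sample data are illustrative only.

```python
def auc_from_scores(labels, scores):
    """AUC as the chance that a positive example outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    tipped = [0, 0, 1, 1]                # hypothetical ground-truth labels
    predicted = [0.1, 0.4, 0.35, 0.8]    # hypothetical predicted probabilities
    print("AUC on toy data is: " + str(auc_from_scores(tipped, predicted)))
```

The double loop is quadratic in the number of rows, so it only suits small samples. For data the size of the taxi table, the `roc_curve` and `auc` calls in the diff are the practical choice.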
