Commit 7127434

Merge branch 'master' of https://github.com/MicrosoftDocs/sql-docs-pr into davidph-datatype-factor
2 parents b68cde0 + 052c562 commit 7127434

10 files changed

Lines changed: 90 additions & 104 deletions

docs/machine-learning/data-exploration/python-dataframe-pandas.md

Lines changed: 5 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -4,13 +4,14 @@ titleSuffix: SQL machine learning
44
description: Learn how to read data from a SQL database and insert it into a pandas dataframe using Python.
55
author: cawrites
66
ms.author: chadam
7-
ms.date: 07/01/2020
7+
ms.date: 07/14/2020
88
ms.topic: how-to
99
ms.prod: sql
1010
ms.technology: machine-learning
1111
monikerRange: ">=sql-server-2017||>=sql-server-linux-ver15||=azuresqldb-mi-current||=azuresqldb-current||=sqlallproducts-allversions"
1212
---
1313
# Insert data from a SQL table into a Python pandas dataframe
14+
[!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md](../../includes/applies-to-version/sql-asdb-asdbmi-asa.md)]
1415

1516
This article describes how to insert data from a SQL database into a `pandas` dataframe using the `pyodbc` package in Python. The dataframe can be used for further data exploration. For more information, see the [pyodbc documentation](../../connect/python/pyodbc/python-sql-driver-pyodbc.md).
1617

@@ -39,13 +40,13 @@ The sample database used in this article has been saved to a **.bak** database b
3940
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
4041
1. Follow the directions in [Restore a database from a backup file](../../azure-data-studio/tutorial-backup-restore-sql-server.md#restore-a-database-from-a-backup-file) in Azure Data Studio, using these details:
4142
- Import from the **AdventureWorks.bak** file you downloaded.
42-
- Name the target database "AdventureWorks."
43+
- Name the target database "AdventureWorks".
4344
::: moniker-end
4445
::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions"
4546
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
4647
1. Follow the directions in [Restore a database to a Managed Instance](/azure/sql-database/sql-database-managed-instance-get-started-restore) in SQL Server Management Studio, using these details:
4748
- Import from the **AdventureWorks.bak** file you downloaded.
48-
- Name the target database "AdventureWorks."
49+
- Name the target database "AdventureWorks".
4950
::: moniker-end
5051

5152
You can verify that the restored database exists by querying the **Person.CountryRegion** table:
@@ -71,7 +72,7 @@ To install these packages:
7172

7273
## Insert SQL data into dataframe
7374

74-
Use the following script to select data from Person.CountryRegion table and insert it into a dataframe. Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to SQL Server.
75+
Use the following script to select data from the Person.CountryRegion table and insert it into a dataframe. Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to SQL.
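As a rough illustration of how those four variables fit together, the sketch below assembles an ODBC connection string in the same `DRIVER=...;SERVER=...` shape that the script passes to `pyodbc.connect`. The helper name `build_connection_string` and the placeholder values are assumptions made for this example; they are not part of the article.

```python
def build_connection_string(server, database, username, password,
                            driver="{SQL Server}"):
    """Join the connection settings into a single ODBC connection string."""
    parts = [
        ("DRIVER", driver),
        ("SERVER", server),
        ("DATABASE", database),
        ("UID", username),
        ("PWD", password),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


# Placeholder values (assumptions); replace them with your own settings.
conn_str = build_connection_string(
    server="yourservername",
    database="AdventureWorks",
    username="yourusername",
    password="yourpassword",
)
print(conn_str)
```

The resulting string could then be passed to `pyodbc.connect(conn_str)`. Keeping the pieces in one helper makes it harder to drop a semicolon when a value changes.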
7576

7677
To create a new notebook:
7778
1. In Azure Data Studio, select **File**, select **New Notebook**.
@@ -80,7 +81,6 @@ To create a new notebook:
8081

8182
```python
8283
import pyodbc
83-
import pandas
8484
import pandas as pd
8585
# Some other example server values are
8686
# server = 'localhost\sqlexpress' # for a named instance
@@ -129,5 +129,3 @@ CountryRegionCode Name
129129
24 BT Bhutan
130130
25 BO Bolivia
131131
```
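The same read pattern can be tried without a SQL Server instance. The sketch below is a stand-in under stated assumptions: it uses Python's built-in `sqlite3` module instead of `pyodbc`, and an in-memory table named `CountryRegion` (SQLite has no `Person` schema) seeded with two rows taken from the sample output above. The function name `load_country_regions` is hypothetical.

```python
import sqlite3

import pandas as pd


def load_country_regions(conn):
    """Run a SELECT and return the result set as a pandas dataframe."""
    query = "SELECT CountryRegionCode, Name FROM CountryRegion ORDER BY Name"
    return pd.read_sql_query(query, conn)


# In-memory SQLite database standing in for the AdventureWorks connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CountryRegion (CountryRegionCode TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO CountryRegion VALUES (?, ?)",
    [("BT", "Bhutan"), ("BO", "Bolivia")],
)

df = load_country_regions(conn)
print(df.head())
conn.close()
```

With a real `pyodbc` connection, only the connection object and the schema-qualified table name would need to change; the call that builds the dataframe stays the same.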
132-
133-

docs/machine-learning/data-exploration/python-dataframe-sql-server.md

Lines changed: 14 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,18 @@
11
---
2-
title: Insert Python dataFrame into SQL Server
3-
description: How to insert data from a dataframe into SQL Server
2+
title: Insert Python dataframe into SQL table
3+
description: How to insert data from a dataframe into a SQL table
44
author: cawrites
55
ms.author: chadam
6-
ms.date: 07/01/2020
6+
ms.date: 07/14/2020
77
ms.topic: conceptual
88
ms.prod: sql
99
ms.technology: machine-learning
1010
monikerRange: ">=sql-server-2017||>=sql-server-linux-ver15||=azuresqldb-mi-current||=azuresqldb-current||=sqlallproducts-allversions"
1111
---
12-
# Insert Python dataFrame into SQL Server
13-
[!INCLUDE[appliesto-ss-asdbmi-xxxx-xxx-md](../../includes/appliesto-ss-asdbmi-xxxx-xxx-md.md)]
12+
# Insert Python dataframe into SQL table
13+
[!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md](../../includes/applies-to-version/sql-asdb-asdbmi-asa.md)]
1414

15-
This article describes how to insert data into a SQL database from a `pandas` dataframe using the `pyodbc` package in Python. For more information, see the [pyodbc documentation](../../connect/python/pyodbc/python-sql-driver-pyodbc.md). By establishing a connection with SQL Server using Python `pandas`, data can be sent directly to a SQL table.
15+
This article describes how to insert data into a SQL database from a `pandas` dataframe using the `pyodbc` package in Python. For more information, see the [pyodbc documentation](../../connect/python/pyodbc/python-sql-driver-pyodbc.md). By establishing a connection with SQL using Python `pandas`, data can be sent directly to a SQL table.
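The overall flow described here (build a dataframe, then send its rows to a table through a cursor-style API) can be sketched without a SQL Server instance. This is a minimal stand-in, not the article's script: it assumes Python's built-in `sqlite3` module in place of `pyodbc`, a table simply named `DepartmentTest` (SQLite has no `HumanResources` schema), and two illustrative rows instead of the CSV file. Both `sqlite3` and `pyodbc` accept `?` parameter markers, which is why the insert statement has the same shape.

```python
import sqlite3

import pandas as pd

# Sample rows for illustration; the article loads them from a CSV file.
df = pd.DataFrame(
    {
        "DepartmentID": [1, 16],
        "Name": ["Engineering", "Executive"],
        "GroupName": [
            "Research and Development",
            "Executive General and Administration",
        ],
    }
)

# Stand-in for the pyodbc connection (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DepartmentTest (DepartmentID INTEGER, Name TEXT, GroupName TEXT)"
)

# Convert each row to plain Python types before binding it to the query.
rows = [
    (int(row.DepartmentID), str(row.Name), str(row.GroupName))
    for row in df.itertuples(index=False)
]
conn.executemany(
    "INSERT INTO DepartmentTest (DepartmentID, Name, GroupName) VALUES (?, ?, ?)",
    rows,
)
conn.commit()

# Read the rows back to confirm that the insert worked.
inserted = conn.execute(
    "SELECT DepartmentID, Name, GroupName FROM DepartmentTest ORDER BY DepartmentID"
).fetchall()
print(inserted)
conn.close()
```

Parameter markers keep the values out of the SQL text itself, which avoids quoting problems when a department name contains an apostrophe.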
1616

1717
## Prerequisites:
1818

@@ -40,13 +40,13 @@ The sample database used in this article has been saved to a **.bak** database b
4040
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
4141
1. Follow the directions in [Restore a database from a backup file](../../azure-data-studio/tutorial-backup-restore-sql-server.md#restore-a-database-from-a-backup-file) in Azure Data Studio, using these details:
4242
- Import from the **AdventureWorks.bak** file you downloaded.
43-
- Name the target database "AdventureWorks."
43+
- Name the target database "AdventureWorks".
4444
::: moniker-end
4545
::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions"
4646
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
4747
1. Follow the directions in [Restore a database to a Managed Instance](/azure/sql-database/sql-database-managed-instance-get-started-restore) in SQL Server Management Studio, using these details:
4848
- Import from the **AdventureWorks.bak** file you downloaded.
49-
- Name the target database "AdventureWorks."
49+
- Name the target database "AdventureWorks".
5050
::: moniker-end
5151

5252
You can verify that the restored database exists by querying the **HumanResources.Department** table:
@@ -108,25 +108,23 @@ DepartmentID,Name,GroupName,
108108
16,Executive,Executive General and Administration
109109
```
110110

111-
## Connect to SQL Server using Python
111+
## Connect to SQL using Python
112112

113-
1. Edit the connection string variables 'server','database','username' and 'password' to connect to SQL Server.
113+
1. Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to the SQL database.
114114

115-
2. Edit path for CSV file.
115+
2. Edit the path for the CSV file.
116116

117117
## Load dataframe from CSV file
118118

119-
Use the Python `pandas` package to create a dataframe and load the CSV file. Connect to SQL Server to load dataframe into the new SQL table, HumanResources.DepartmentTest.
120-
Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to SQL Server.
119+
Use the Python `pandas` package to create a dataframe and load the CSV file. Connect to SQL to load the dataframe into the new SQL table, HumanResources.DepartmentTest.
121120

122121
To create a new notebook:
123122
1. In Azure Data Studio, select **File**, select **New Notebook**.
124123
2. In the notebook, select kernel **Python3**, select the **+code**.
125124
3. Paste code in notebook, select **Run All**.
126125

127126
```Python
128-
import pyodbc
129-
import pandas
127+
import pyodbc
130128
import pandas as pd
131129
# insert data from csv file into dataframe.
132130
# working directory for csv file: type "pwd" in Azure Data Studio or Linux
@@ -137,7 +135,7 @@ df = pd.read_csv("c:\\user\\username\\department.csv")
137135
# server = 'myserver,port' # to specify an alternate port
138136
server = 'yourservername'
139137
database = 'AdventureWorks'
140-
username = 'sa'
138+
username = 'username'
141139
password = 'yourpassword'
142140
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
143141
cursor = cnxn.cursor()

docs/machine-learning/data-exploration/python-plot-histogram.md

Lines changed: 14 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -3,17 +3,17 @@ title: Plot a histogram for data exploration with Python
33
description: Learn how to create a histogram to visualize data
44
author: cawrites
55
ms.author: chadam
6-
ms.date: 07/01/2020
6+
ms.date: 07/14/2020
77
ms.topic: conceptual
88
ms.prod: sql
99
ms.technology: machine-learning
1010
monikerRange: ">=sql-server-2017||>=sql-server-linux-ver15||=azuresqldb-mi-current||=azuresqldb-current||=sqlallproducts-allversions"
1111
---
1212

1313
# Plot histograms in Python
14-
[!INCLUDE[appliesto-ss-asdbmi-xxxx-xxx-md](../../includes/appliesto-ss-asdbmi-xxxx-xxx-md.md)]
14+
[!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md](../../includes/applies-to-version/sql-asdb-asdbmi-asa.md)]
1515

16-
This article describes how to plot data using the Python package [Matplotlib](https://matplotlib.org/). A histogram displays data intervals that have consecutive, non-overlapping values.
16+
This article describes how to plot data using the pandas [`DataFrame.hist()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.hist.html) method. A SQL database is the data source, and the histogram displays data intervals that have consecutive, non-overlapping values.
1717

1818
## Prerequisites:
1919

@@ -37,21 +37,21 @@ This article describes how to plot data using the Python package [Matplotlib](ht
3737

3838
The sample database used in this article has been saved to a **.bak** database backup file for you to download and use.
3939
::: moniker range=">=sql-server-ver15||>=sql-server-linux-ver15||=azuresqldb-current||=sqlallproducts-allversions"
40-
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
40+
1. Follow the instructions in [AdventureWorksDW sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the AdventureWorksDW (data warehouse) version of the file and restore it as a database. This database will be used as a data source.
4141
1. Follow the directions in [Restore a database from a backup file](../../azure-data-studio/tutorial-backup-restore-sql-server.md#restore-a-database-from-a-backup-file) in Azure Data Studio, using these details:
42-
- Import from the **AdventureWorks.bak** file - you downloaded.
43-
- Name the target database "AdventureWorks."
42+
- Import from the **AdventureWorksDW.bak** file you downloaded.
43+
- Name the target database "AdventureWorksDW".
4444
::: moniker-end
4545
::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions"
46-
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
46+
1. Follow the instructions in [AdventureWorksDW sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the AdventureWorksDW (data warehouse) version of the file and restore it as a database. This database will be used as a data source.
4747
1. Follow the directions in [Restore a database to a Managed Instance](/azure/sql-database/sql-database-managed-instance-get-started-restore) in SQL Server Management Studio, using these details:
48-
- Import from the **AdventureWorks.bak** file - you downloaded.
49-
- Name the target database "AdventureWorks."
48+
- Import from the **AdventureWorksDW.bak** file you downloaded.
49+
- Name the target database "AdventureWorksDW".
5050
::: moniker-end
5151

5252
You can verify that the restored database exists by querying the **dbo.FactInternetSales** table:
5353
```sql
54-
USE AdventureWorks;
54+
USE AdventureWorksDW;
5555
SELECT * FROM dbo.FactInternetSales;
5656
```
5757

@@ -61,7 +61,6 @@ Install the following Python packages using [Azure Data Studio notebook with a P
6161

6262
* pyodbc
6363
* pandas
64-
* matplotlib
6564

6665
To install these packages:
6766
1. In your Azure Data Studio notebook, select **Manage Packages**.
@@ -73,7 +72,7 @@ As an alternative, you can open a **Command Prompt**, change to the installation
7372
## Plot histogram
7473

7574
The distributed data displayed in the histogram is based on a SQL query from AdventureWorksDW. The histogram visualizes data and the frequency of data values.
76-
Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to SQL Server.
75+
Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to the SQL database.
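To make "frequency of data values" concrete, the sketch below counts how many values fall into each of a fixed number of equal-width, non-overlapping intervals, which is the kind of binning a call such as `df.hist(bins=10)` visualizes. It is a plain Python illustration under stated assumptions: the function name `histogram_counts` is hypothetical, the ages are made-up sample values rather than query results, and the last interval is treated as closed so that the maximum value is counted. Note that pandas' `DataFrame.hist` delegates the actual drawing to matplotlib, so that package still needs to be available when plotting.

```python
def histogram_counts(values, bins=10):
    """Return (edges, counts) for equal-width, non-overlapping intervals."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: widen the range so the bin width is not zero.
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in values:
        # The maximum value belongs to the last interval, not a new one.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Made-up customer ages for illustration only.
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 70]
edges, counts = histogram_counts(ages, bins=5)
print(edges)
print(counts)
```

Each count corresponds to the height of one bar, and every value lands in exactly one interval, so the counts always add up to the number of input values.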
7776

7877
To create a new notebook:
7978
1. In Azure Data Studio, select **File**, select **New Notebook**.
@@ -82,15 +81,12 @@ To create a new notebook:
8281

8382
```python
8483
import pyodbc
85-
import pandas
86-
import pandas as pd
87-
import matplotlib
88-
import matplotlib.pyplot as plt
84+
import pandas as plt
8985
# Some other example server values are
9086
# server = 'localhost\sqlexpress' # for a named instance
9187
# server = 'myserver,port' # to specify an alternate port
9288
server = 'servername'
93-
database = 'AdventureWorks'
89+
database = 'AdventureWorksDW'
9490
username = 'yourusername'
9591
password = 'yourpassword'
9692
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
@@ -102,6 +98,6 @@ df.hist(bins=10)
10298

10399
The display shows the age distribution of customers in the FactInternetSales table.
104100

105-
![Matplotlib Histogram](./media/python-histogram.png)
101+
![Pandas Histogram](./media/python-histogram.png)
106102

107103

docs/machine-learning/python/ref-py-microsoftml.md

Lines changed: 6 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,25 +1,22 @@
11
---
22
title: microsoftml Python package
3-
description: Introduces the Microsoft machine learning algorithms and models for Python, as related to SQL Server machine learning workloads.
3+
description: microsoftml is a Python package from Microsoft that provides high-performance machine learning algorithms. It includes functions for training and transformations, scoring, text and image analysis, and feature extraction for deriving values from existing data. The package is included in SQL Server Machine Learning Services.
44
ms.prod: sql
55
ms.technology: machine-learning-services
6-
7-
ms.date: 11/06/2019
6+
ms.date: 07/14/2020
87
ms.topic: how-to
98
author: dphansen
109
ms.author: davidph
1110
monikerRange: ">=sql-server-2017||>=sql-server-linux-ver15||=sqlallproducts-allversions"
1211
---
13-
# microsoftml (Python module in SQL Server)
12+
# microsoftml (Python package in SQL Server Machine Learning Services)
1413
[!INCLUDE [SQL Server](../../includes/applies-to-version/sqlserver.md)]
1514

16-
**microsoftml** is a Python35-compatible module from Microsoft providing high-performance machine learning algorithms. It includes functions for training and transformations, scoring, text and image analysis, and feature extraction for deriving values from existing data.
17-
18-
The machine learning APIs were developed by Microsoft for internal machine learning applications and have been refined over the years to support high performance on big data, using multicore processing and fast data streaming. This package originated as a Python equivalent of an R version, [MicrosoftML](../r/ref-r-microsoftml.md), that has similar functions.
15+
**microsoftml** is a Python package from Microsoft that provides high-performance machine learning algorithms. It includes functions for training and transformations, scoring, text and image analysis, and feature extraction for deriving values from existing data. The package is included in [SQL Server Machine Learning Services](../sql-server-machine-learning-services.md) and supports high performance on big data, using multicore processing and fast data streaming.
1916

2017
## Full reference documentation
2118

22-
The **microsoftml** library is distributed in multiple Microsoft products, but usage is the same whether you get the library in SQL Server or another product. Because the functions are the same, [documentation for individual microsoftml functions](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/microsoftml-package) is published to just one location under the [Python reference](https://docs.microsoft.com/machine-learning-server/python-reference/introducing-python-package-reference) for Microsoft Machine Learning Server. Should any product-specific behaviors exist, discrepancies will be noted in the function help page.
19+
The **microsoftml** package is distributed in multiple Microsoft products, but usage is the same whether you get the package in SQL Server or another product. Because the functions are the same, [documentation for individual microsoftml functions](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/microsoftml-package) is published to just one location under the [Python reference](https://docs.microsoft.com/machine-learning-server/python-reference/introducing-python-package-reference) for Microsoft Machine Learning Server. Should any product-specific behaviors exist, discrepancies will be noted in the function help page.
2320

2421
## Versions and platforms
2522

@@ -37,7 +34,7 @@ The **microsoftml** module is based on Python 3.5 and available only when you in
3734
Algorithms in **microsoftml** depend on [revoscalepy](ref-py-revoscalepy.md) for:
3835

3936
+ Data source objects. Data consumed by **microsoftml** functions are created using **revoscalepy** functions.
40-
+ Remote computing (shifting function execution to a remote SQL Server instance). The **revoscalepy** library provides functions for creating and activating a remote compute context for SQL server.
37+
+ Remote computing (shifting function execution to a remote SQL Server instance). The **revoscalepy** package provides functions for creating and activating a remote compute context for SQL Server.
4138

4239
In most cases, you will load the packages together whenever you are using **microsoftml**.
4340
