Skip to content

Commit cf12e84

Browse files
authored
Merge pull request #15955 from cawrites/new-python-content
New python content
2 parents 9722eb9 + 83074f0 commit cf12e84

5 files changed

Lines changed: 412 additions & 0 deletions

File tree

32.6 KB
Loading
Lines changed: 133 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,133 @@
1+
---
2+
title: Insert data from a SQL table into a Python pandas dataframe
3+
titleSuffix: SQL machine learning
4+
description: Learn how to read data from a SQL database and insert it into a pandas dataframe using Python.
5+
author: cawrites
6+
ms.author: chadam
7+
ms.date: 07/01/2020
8+
ms.topic: how-to
9+
ms.prod: sql
10+
ms.technology: machine-learning
11+
monikerRange: ">=sql-server-2017||>=sql-server-linux-ver15||=azuresqldb-mi-current||=azuresqldb-current||=sqlallproducts-allversions"
12+
---
13+
# Insert data from a SQL table into a Python pandas dataframe
14+
15+
This article describes how to insert data from a SQL database a `pandas` dataframe using the `pyodbc` package in Python. The dataframe can be used for further data exploration. For more information, see the [pyodbc documentation](../../connect/python/pyodbc/python-sql-driver-pyodbc.md).
16+
17+
## Prerequisites:
18+
19+
::: moniker range=">=sql-server-2017||>=sql-server-linux-ver15||=sqlallproducts-allversions"
20+
* SQL Server. For how to install, see [SQL Server for Windows](../../database-engine/install-windows/install-sql-server.md) or [for Linux](../../linux/sql-server-linux-overview.md).
21+
::: moniker-end
22+
23+
::: moniker range="=azuresqldb-current||=sqlallproducts-allversions"
24+
* Azure SQL Database. For how to sign up, see [Azure SQL Database](https://docs.microsoft.com/azure/sql-database/sql-database-get-started-portal)
25+
::: moniker-end
26+
27+
::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions"
28+
* Azure SQL Managed Instance. For how to sign up, see [Azure SQL Managed Instance](https://docs.microsoft.com/azure/azure-sql/managed-instance/instance-create-quickstart).
29+
30+
* [SQL Server Management Studio](../../ssms/download-sql-server-management-studio-ssms.md) for restoring the sample database to Azure SQL Managed Instance.
31+
::: moniker-end
32+
33+
* Azure Data Studio. For how to install, see [Azure Data Studio](../../azure-data-studio/what-is.md).
34+
35+
## Restore the sample database
36+
37+
The sample database used in this article has been saved to a **.bak** database backup file for you to download and use.
38+
::: moniker range=">=sql-server-ver15||>=sql-server-linux-ver15||=azuresqldb-current||=sqlallproducts-allversions"
39+
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
40+
1. Follow the directions in [Restore a database from a backup file](../../azure-data-studio/tutorial-backup-restore-sql-server.md#restore-a-database-from-a-backup-file) in Azure Data Studio, using these details:
41+
- Import from the **AdventureWorks.bak** file - you downloaded.
42+
- Name the target database "AdventureWorks."
43+
::: moniker-end
44+
::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions"
45+
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
46+
1. Follow the directions in [Restore a database to a Managed Instance](/azure/sql-database/sql-database-managed-instance-get-started-restore) in SQL Server Management Studio, using these details:
47+
- Import from the **AdventureWorks.bak** file - you downloaded.
48+
- Name the target database "AdventureWorks."
49+
::: moniker-end
50+
51+
You can verify that the restored database exists by querying the **Person.CountryRegion** table:
52+
53+
```sql
54+
USE AdventureWorks;
55+
SELECT * FROM Person.CountryRegion;
56+
```
57+
58+
## Install Python packages
59+
60+
Install the following Python packages using [Azure Data Studio notebook with a Python kernel](../../azure-data-studio/notebooks-tutorial-python-kernel.md).
61+
62+
* pyodbc
63+
* pandas
64+
65+
To install these packages:
66+
1. In your Azure Data Studio notebook, select **Manage Packages**.
67+
2. In the **Manage Packages** pane, select the **Add new** tab.
68+
3. For each of the following packages, enter the package name, click **Search**, then click **Install**.
69+
70+
As an alternative, you can open a **Command Prompt**, change to the installation path for the version of Python you use in Azure Data Studio (for example, `cd %LocalAppData%\Programs\Python\Python37-32`), then run `pip install` for each package.
71+
72+
## Insert SQL data into dataframe
73+
74+
Use the following script to select data from Person.CountryRegion table and insert it into a dataframe. Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to SQL Server.
75+
76+
To create a new notebook:
77+
1. In Azure Data Studio, select **File**, select **New Notebook**.
78+
2. In the notebook, select kernel **Python3**, select the **+code**.
79+
3. Paste code in notebook, select **Run All**.
80+
81+
```python
82+
import pyodbc
83+
import pandas
84+
import pandas as pd
85+
# Some other example server values are
86+
# server = 'localhost\sqlexpress' # for a named instance
87+
# server = 'myserver,port' # to specify an alternate port
88+
server = 'servername'
89+
database = 'AdventureWorks'
90+
username = 'yourusername'
91+
password = 'databasename'
92+
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
93+
cursor = cnxn.cursor()
94+
# select 26 rows from SQL table to insert in dataframe.
95+
query = "SELECT [CountryRegionCode], [Name] FROM Person.CountryRegion;"
96+
df = pd.read_sql(query, cnxn)
97+
print(df.head(26))
98+
```
99+
100+
**Output**
101+
The `print` command in the preceding script displays the rows of data from the **pandas** dataframe `df`.
102+
103+
```text
104+
CountryRegionCode Name
105+
0 AF Afghanistan
106+
1 AL Albania
107+
2 DZ Algeria
108+
3 AS American Samoa
109+
4 AD Andorra
110+
5 AO Angola
111+
6 AI Anguilla
112+
7 AQ Antarctica
113+
8 AG Antigua and Barbuda
114+
9 AR Argentina
115+
10 AM Armenia
116+
11 AW Aruba
117+
12 AU Australia
118+
13 AT Austria
119+
14 AZ Azerbaijan
120+
15 BS Bahamas, The
121+
16 BH Bahrain
122+
17 BD Bangladesh
123+
18 BB Barbados
124+
19 BY Belarus
125+
20 BE Belgium
126+
21 BZ Belize
127+
22 BJ Benin
128+
23 BM Bermuda
129+
24 BT Bhutan
130+
25 BO Bolivia
131+
```
132+
133+
Lines changed: 164 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,164 @@
1+
---
2+
title: Insert Python dataFrame into SQL Server
3+
description: How to insert data from a dataframe into SQL Server
4+
author: cawrites
5+
ms.author: chadam
6+
ms.date: 07/01/2020
7+
ms.topic: conceptual
8+
ms.prod: sql
9+
ms.technology: machine-learning
10+
monikerRange: ">=sql-server-2017||>=sql-server-linux-ver15||=azuresqldb-mi-current||=azuresqldb-current||=sqlallproducts-allversions"
11+
---
12+
# Insert Python dataFrame into SQL Server
13+
[!INCLUDE[appliesto-ss-asdbmi-xxxx-xxx-md](../../includes/appliesto-ss-asdbmi-xxxx-xxx-md.md)]
14+
15+
This article describes how to insert data into a SQL database from a `pandas` dataframe using the `pyodbc` package in Python. For more information, see the [pyodbc documentation](../../connect/python/pyodbc/python-sql-driver-pyodbc.md). By establishing a connection with SQL Server using Python `pandas`, data can be sent directly to a SQL table.
16+
17+
## Prerequisites:
18+
19+
::: moniker range=">=sql-server-2017||>=sql-server-linux-ver15||=sqlallproducts-allversions"
20+
* SQL Server. For how to install, see [SQL Server for Windows](../../database-engine/install-windows/install-sql-server.md) or [for Linux](../../linux/sql-server-linux-overview.md).
21+
::: moniker-end
22+
23+
::: moniker range="=azuresqldb-current||=sqlallproducts-allversions"
24+
* Azure SQL Database. For how to sign up, see [Azure SQL Database](https://docs.microsoft.com/azure/sql-database/sql-database-get-started-portal)
25+
::: moniker-end
26+
27+
::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions"
28+
* Azure SQL Managed Instance. For how to sign up, see [Azure SQL Managed Instance](https://docs.microsoft.com/azure/azure-sql/managed-instance/instance-create-quickstart).
29+
30+
* [SQL Server Management Studio](../../ssms/download-sql-server-management-studio-ssms.md) for restoring the sample database to Azure SQL Managed Instance.
31+
::: moniker-end
32+
33+
* Azure Data Studio. For how to install, see [Azure Data Studio](../../azure-data-studio/what-is.md).
34+
35+
36+
## Restore the sample database
37+
38+
The sample database used in this article has been saved to a **.bak** database backup file for you to download and use.
39+
::: moniker range=">=sql-server-ver15||>=sql-server-linux-ver15||=azuresqldb-current||=sqlallproducts-allversions"
40+
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
41+
1. Follow the directions in [Restore a database from a backup file](../../azure-data-studio/tutorial-backup-restore-sql-server.md#restore-a-database-from-a-backup-file) in Azure Data Studio, using these details:
42+
- Import from the **AdventureWorks.bak** file - you downloaded.
43+
- Name the target database "AdventureWorks."
44+
::: moniker-end
45+
::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions"
46+
1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource.
47+
1. Follow the directions in [Restore a database to a Managed Instance](/azure/sql-database/sql-database-managed-instance-get-started-restore) in SQL Server Management Studio, using these details:
48+
- Import from the **AdventureWorks.bak** file - you downloaded.
49+
- Name the target database "AdventureWorks."
50+
::: moniker-end
51+
52+
You can verify that the restored database exists by querying the **HumanResources.Department** table:
53+
54+
```sql
55+
USE AdventureWorks;
56+
SELECT * FROM HumanResources.Department;
57+
```
58+
59+
## Install Python packages
60+
61+
Install the following Python packages using [Azure Data Studio notebook with a Python kernel](../../azure-data-studio/notebooks-tutorial-python-kernel.md)
62+
63+
* pyodbc
64+
* pandas
65+
66+
To install these packages:
67+
1. In your Azure Data Studio notebook, select **Manage Packages**.
68+
2. In the **Manage Packages** pane, select the **Add new** tab.
69+
3. For each of the following packages, enter the package name, click **Search**, then click **Install**.
70+
71+
As an alternative, you can open a **Command Prompt**, change to the installation path for the version of Python you use in Azure Data Studio (for example, `cd %LocalAppData%\Programs\Python\Python37-32`), then run `pip install` for each package.
72+
73+
## Connect to SQL Server using Azure Data Studio
74+
75+
[Connect using Azure Data Studio](../../azure-data-studio/quickstart-sql-server.md).
76+
77+
1. Connect to Adventureworks database to create the new table, HumanResources.DepartmentTest. The SQL table will be used for the dataframe insertion.
78+
79+
```sql
80+
CREATE TABLE [HumanResources].[DepartmentTest](
81+
[DepartmentID] [smallint] NOT NULL,
82+
[Name] [dbo].[Name] NOT NULL,
83+
[GroupName] [dbo].[Name] NOT NULL
84+
)
85+
GO
86+
```
87+
88+
## Create CVS file
89+
Copy text and save file as department.cvs for dataframe.
90+
91+
```text
92+
DepartmentID,Name,GroupName,
93+
1,Engineering,Research and Development,
94+
2,Tool Design,Research and Development,
95+
3,Sales,Sales and Marketing,
96+
4,Marketing,Sales and Marketing,
97+
5,Purchasing,Inventory Management,
98+
6,Research and Development,Research and Development,
99+
7,Production,Manufacturing,
100+
8,Production Control,Manufacturing,
101+
9,Human Resources,Executive General and Administration,
102+
10,Finance,Executive General and Administration,
103+
11,Information Services,Executive General and Administration,
104+
12,Document Control,Quality Assurance,
105+
13,Quality Assurance,Quality Assurance,
106+
14,Facilities and Maintenance,Executive General and Administration,
107+
15,Shipping and Receiving,Inventory Management,
108+
16,Executive,Executive General and Administration
109+
```
110+
111+
## Connect to SQL Server using Python
112+
113+
1. Edit the connection string variables 'server','database','username' and 'password' to connect to SQL Server.
114+
115+
2. Edit path for CSV file.
116+
117+
## Load dataframe from CSV file
118+
119+
Use the Python `pandas` package to create a dataframe and load the CSV file. Connect to SQL Server to load dataframe into the new SQL table, HumanResources.DepartmentTest.
120+
Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to SQL Server.
121+
122+
To create a new notebook:
123+
1. In Azure Data Studio, select **File**, select **New Notebook**.
124+
2. In the notebook, select kernel **Python3**, select the **+code**.
125+
3. Paste code in notebook, select **Run All**.
126+
127+
```Python
128+
import pyodbc
129+
import pandas
130+
import pandas as pd
131+
# insert data from csv file into dataframe.
132+
# working directory for csv file: type "pwd" in Azure Data Studio or Linux
133+
# working directory in Windows c:\users\username
134+
df = pd.read_csv("c:\\user\\username\department.csv")
135+
# Some other example server values are
136+
# server = 'localhost\sqlexpress' # for a named instance
137+
# server = 'myserver,port' # to specify an alternate port
138+
server = 'yourservername'
139+
database = 'AdventureWorks'
140+
username = 'sa'
141+
password = 'yourpassword'
142+
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
143+
cursor = cnxn.cursor()
144+
# Insert Dataframe into SQL Server:
145+
for index, row in df.iterrows():
146+
cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
147+
cnxn.commit()
148+
cursor.close()
149+
```
150+
151+
## Confirm row count in SQL
152+
153+
Execute the SQL statement to confirm the table was successfully loaded with data from the dataframe.
154+
155+
```sql
156+
SELECT count(*) from HumanResources.DepartmentTest;
157+
```
158+
159+
Results
160+
161+
```bash
162+
(No column name)
163+
16
164+
```

0 commit comments

Comments
 (0)