|
| 1 | +--- |
| 2 | +title: Insert Python dataFrame into SQL Server |
| 3 | +description: How to insert data from a dataframe into SQL Server |
| 4 | +author: cawrites |
| 5 | +ms.author: chadam |
| 6 | +ms.date: 07/01/2020 |
| 7 | +ms.topic: conceptual |
| 8 | +ms.prod: sql |
| 9 | +ms.technology: machine-learning |
| 10 | +monikerRange: ">=sql-server-2017||>=sql-server-linux-ver15||=azuresqldb-mi-current||=azuresqldb-current||=sqlallproducts-allversions" |
| 11 | +--- |
| 12 | +# Insert Python dataFrame into SQL Server |
| 13 | +[!INCLUDE[appliesto-ss-asdbmi-xxxx-xxx-md](../../includes/appliesto-ss-asdbmi-xxxx-xxx-md.md)] |
| 14 | + |
| 15 | +This article describes how to insert data into a SQL database from a `pandas` dataframe using the `pyodbc` package in Python. For more information, see the [pyodbc documentation](../../connect/python/pyodbc/python-sql-driver-pyodbc.md). By establishing a connection with SQL Server using Python `pandas`, data can be sent directly to a SQL table. |
| 16 | + |
| 17 | +## Prerequisites: |
| 18 | + |
| 19 | +::: moniker range=">=sql-server-2017||>=sql-server-linux-ver15||=sqlallproducts-allversions" |
| 20 | +* SQL Server. For how to install, see [SQL Server for Windows](../../database-engine/install-windows/install-sql-server.md) or [for Linux](../../linux/sql-server-linux-overview.md). |
| 21 | +::: moniker-end |
| 22 | + |
| 23 | +::: moniker range="=azuresqldb-current||=sqlallproducts-allversions" |
| 24 | +* Azure SQL Database. For how to sign up, see [Azure SQL Database](https://docs.microsoft.com/azure/sql-database/sql-database-get-started-portal) |
| 25 | +::: moniker-end |
| 26 | + |
| 27 | +::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions" |
| 28 | +* Azure SQL Managed Instance. For how to sign up, see [Azure SQL Managed Instance](https://docs.microsoft.com/azure/azure-sql/managed-instance/instance-create-quickstart). |
| 29 | + |
| 30 | +* [SQL Server Management Studio](../../ssms/download-sql-server-management-studio-ssms.md) for restoring the sample database to Azure SQL Managed Instance. |
| 31 | +::: moniker-end |
| 32 | + |
| 33 | +* Azure Data Studio. For how to install, see [Azure Data Studio](../../azure-data-studio/what-is.md). |
| 34 | + |
| 35 | + |
| 36 | +## Restore the sample database |
| 37 | + |
| 38 | +The sample database used in this article has been saved to a **.bak** database backup file for you to download and use. |
| 39 | +::: moniker range=">=sql-server-ver15||>=sql-server-linux-ver15||=azuresqldb-current||=sqlallproducts-allversions" |
| 40 | +1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource. |
| 41 | +1. Follow the directions in [Restore a database from a backup file](../../azure-data-studio/tutorial-backup-restore-sql-server.md#restore-a-database-from-a-backup-file) in Azure Data Studio, using these details: |
| 42 | + - Import from the **AdventureWorks.bak** file - you downloaded. |
| 43 | + - Name the target database "AdventureWorks." |
| 44 | +::: moniker-end |
| 45 | +::: moniker range="=azuresqldb-mi-current||=sqlallproducts-allversions" |
| 46 | +1. Follow the instructions in [AdventureWorks sample databases](../../samples/adventureworks-install-configure.md#download-bak-files) to download the correct OLTP version of the AdventureWorks file and restore it as a database. This database will be used as a datasource. |
| 47 | +1. Follow the directions in [Restore a database to a Managed Instance](/azure/sql-database/sql-database-managed-instance-get-started-restore) in SQL Server Management Studio, using these details: |
| 48 | + - Import from the **AdventureWorks.bak** file - you downloaded. |
| 49 | + - Name the target database "AdventureWorks." |
| 50 | +::: moniker-end |
| 51 | + |
| 52 | +You can verify that the restored database exists by querying the **HumanResources.Department** table: |
| 53 | + |
| 54 | +```sql |
| 55 | +USE AdventureWorks; |
| 56 | +SELECT * FROM HumanResources.Department; |
| 57 | +``` |
| 58 | + |
| 59 | +## Install Python packages |
| 60 | + |
| 61 | +Install the following Python packages using [Azure Data Studio notebook with a Python kernel](../../azure-data-studio/notebooks-tutorial-python-kernel.md) |
| 62 | + |
| 63 | + * pyodbc |
| 64 | + * pandas |
| 65 | + |
| 66 | + To install these packages: |
| 67 | + 1. In your Azure Data Studio notebook, select **Manage Packages**. |
| 68 | + 2. In the **Manage Packages** pane, select the **Add new** tab. |
| 69 | + 3. For each of the following packages, enter the package name, click **Search**, then click **Install**. |
| 70 | + |
| 71 | + As an alternative, you can open a **Command Prompt**, change to the installation path for the version of Python you use in Azure Data Studio (for example, `cd %LocalAppData%\Programs\Python\Python37-32`), then run `pip install` for each package. |
| 72 | + |
| 73 | +## Connect to SQL Server using Azure Data Studio |
| 74 | + |
| 75 | +[Connect using Azure Data Studio](../../azure-data-studio/quickstart-sql-server.md). |
| 76 | + |
| 77 | +1. Connect to Adventureworks database to create the new table, HumanResources.DepartmentTest. The SQL table will be used for the dataframe insertion. |
| 78 | + |
| 79 | +```sql |
| 80 | +CREATE TABLE [HumanResources].[DepartmentTest]( |
| 81 | +[DepartmentID] [smallint] NOT NULL, |
| 82 | +[Name] [dbo].[Name] NOT NULL, |
| 83 | +[GroupName] [dbo].[Name] NOT NULL |
| 84 | +) |
| 85 | +GO |
| 86 | +``` |
| 87 | + |
| 88 | +## Create CVS file |
| 89 | +Copy text and save file as department.cvs for dataframe. |
| 90 | + |
| 91 | +```text |
| 92 | +DepartmentID,Name,GroupName, |
| 93 | +1,Engineering,Research and Development, |
| 94 | +2,Tool Design,Research and Development, |
| 95 | +3,Sales,Sales and Marketing, |
| 96 | +4,Marketing,Sales and Marketing, |
| 97 | +5,Purchasing,Inventory Management, |
| 98 | +6,Research and Development,Research and Development, |
| 99 | +7,Production,Manufacturing, |
| 100 | +8,Production Control,Manufacturing, |
| 101 | +9,Human Resources,Executive General and Administration, |
| 102 | +10,Finance,Executive General and Administration, |
| 103 | +11,Information Services,Executive General and Administration, |
| 104 | +12,Document Control,Quality Assurance, |
| 105 | +13,Quality Assurance,Quality Assurance, |
| 106 | +14,Facilities and Maintenance,Executive General and Administration, |
| 107 | +15,Shipping and Receiving,Inventory Management, |
| 108 | +16,Executive,Executive General and Administration |
| 109 | +``` |
| 110 | + |
| 111 | +## Connect to SQL Server using Python |
| 112 | + |
| 113 | +1. Edit the connection string variables 'server','database','username' and 'password' to connect to SQL Server. |
| 114 | + |
| 115 | +2. Edit path for CSV file. |
| 116 | + |
| 117 | +## Load dataframe from CSV file |
| 118 | + |
| 119 | +Use the Python `pandas` package to create a dataframe and load the CSV file. Connect to SQL Server to load dataframe into the new SQL table, HumanResources.DepartmentTest. |
| 120 | +Edit the connection string variables 'server', 'database', 'username', and 'password' to connect to SQL Server. |
| 121 | + |
| 122 | +To create a new notebook: |
| 123 | +1. In Azure Data Studio, select **File**, select **New Notebook**. |
| 124 | +2. In the notebook, select kernel **Python3**, select the **+code**. |
| 125 | +3. Paste code in notebook, select **Run All**. |
| 126 | + |
| 127 | + ```Python |
| 128 | +import pyodbc |
| 129 | +import pandas |
| 130 | +import pandas as pd |
| 131 | +# insert data from csv file into dataframe. |
| 132 | +# working directory for csv file: type "pwd" in Azure Data Studio or Linux |
| 133 | +# working directory in Windows c:\users\username |
| 134 | +df = pd.read_csv("c:\\user\\username\department.csv") |
| 135 | +# Some other example server values are |
| 136 | +# server = 'localhost\sqlexpress' # for a named instance |
| 137 | +# server = 'myserver,port' # to specify an alternate port |
| 138 | +server = 'yourservername' |
| 139 | +database = 'AdventureWorks' |
| 140 | +username = 'sa' |
| 141 | +password = 'yourpassword' |
| 142 | +cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password) |
| 143 | +cursor = cnxn.cursor() |
| 144 | +# Insert Dataframe into SQL Server: |
| 145 | +for index, row in df.iterrows(): |
| 146 | + cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName) |
| 147 | +cnxn.commit() |
| 148 | +cursor.close() |
| 149 | +``` |
| 150 | + |
| 151 | +## Confirm row count in SQL |
| 152 | + |
| 153 | +Execute the SQL statement to confirm the table was successfully loaded with data from the dataframe. |
| 154 | + |
| 155 | +```sql |
| 156 | +SELECT count(*) from HumanResources.DepartmentTest; |
| 157 | +``` |
| 158 | + |
| 159 | +Results |
| 160 | + |
| 161 | +```bash |
| 162 | +(No column name) |
| 163 | +16 |
| 164 | +``` |
0 commit comments