Pandas is a popular Python library for manipulating large datasets. Most of the manipulation is done through the built-in functions the pandas package provides. Did you know you can insert a DataFrame into a database using the pandas library? Yes, you can. In this tutorial, you will learn how to implement the pandas to_sql() method in Python, step by step.
Syntax of the Pandas to_sql() method
Before going to the coding demonstration, let's look at the syntax of the pandas.DataFrame.to_sql() method. It is given below.
DataFrame.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)
Explanation of the parameters
name: The name of the target table.
con: The database connection.
schema: Specify the schema (if the database supports this). If None, the default schema is used.
if_exists: Accepts 'fail', 'replace', or 'append'. Controls what happens if the table is already present in the database.
index: Write the DataFrame index as a column. Uses index_label as the column name in the table.
These are the most commonly used parameters. The other parameters are explained in the Pandas Official Documentation.
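To make the if_exists parameter concrete, here is a minimal sketch using an in-memory SQLite database (the table name "prices" and the sample data are just examples for illustration):

```python
import pandas as pd
from sqlite3 import connect

# In-memory SQLite database, so the example leaves no file behind.
conn = connect(":memory:")

df = pd.DataFrame({"ticker": ["FB", "AAPL"], "close": [210.0, 150.0]})

# The first write creates the table (2 rows).
df.to_sql("prices", conn, index=False)

# if_exists='append' adds rows to the existing table.
df.to_sql("prices", conn, if_exists="append", index=False)
print(len(pd.read_sql("SELECT * FROM prices", conn)))  # 4

# if_exists='replace' drops and recreates the table.
df.to_sql("prices", conn, if_exists="replace", index=False)
print(len(pd.read_sql("SELECT * FROM prices", conn)))  # 2

# if_exists='fail' (the default) raises a ValueError if the table exists.
```
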
Steps to Implement Pandas to_sql()
In this section, you will know all the steps required for implementing the to_sql() method. Just follow all the steps given here for deep understanding.
Step 1: Import all the required libraries
The first step is to import all the necessary libraries required for this example. In my example, I am using only the pandas and sqlite3 packages. The sqlite3 package comes bundled with Python's standard library. It is a database package that lets you create a local database and tables. I am using it for demonstration only; you can use your own database framework instead.
import pandas as pd
from sqlite3 import connect
Step 2: Create a Database
The next step is to create a database. You will use connect("your_database_name.db") to connect to a database (the file is created if it does not already exist). Let's create the database "stocks.db". Add the following line of code.
conn = connect("stocks.db")
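As a side note, sqlite3 also accepts the special name ":memory:", which creates a temporary database in RAM instead of a file. This is handy for experimenting without leaving a stocks.db file behind. A quick sanity check that such a connection works:

```python
from sqlite3 import connect

# In-memory database: useful for testing, disappears when the connection closes.
conn = connect(":memory:")

# Run a trivial query to confirm the connection works.
row = conn.execute("SELECT 1").fetchone()
print(row)  # (1,)
```
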
Step 3: Read the data
The next step is to read the dataset. In my example, I am using Facebook stock time-series data. It is a CSV file, and to read a CSV file in pandas you use the pandas.read_csv() method. You can download recent price data from Yahoo Finance, or use your own dataset. Let's read the dataset.
data = pd.read_csv("FB.csv")
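If you don't have an FB.csv file handy, you can follow along with a small stand-in DataFrame instead. The column names below are assumptions based on Yahoo Finance's usual CSV layout, and the values are made up for illustration:

```python
import pandas as pd

# Stand-in for FB.csv; column names follow Yahoo Finance's usual CSV layout.
data = pd.DataFrame({
    "Date": ["2022-01-03", "2022-01-04"],
    "Open": [338.3, 339.9],
    "Close": [336.5, 336.3],
    "Volume": [14537900, 15998000],
})
print(data.shape)  # (2, 4)
```
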
Step 4: Insert the dataset
Now the final step is to insert all the rows and columns of the dataset into a table inside the database. To do so I will use the dataframe.to_sql() pandas method. Run the following code to insert the data into a table named "meta".
data.to_sql("meta", conn)
That’s all you have to do to implement the pandas.dataframe.to_sql() method.
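For larger datasets, you may not want the DataFrame index stored as an extra column, and you may want the rows inserted in batches rather than all at once. A minimal sketch of both options (the table name and sample data here are just examples, using an in-memory database):

```python
import pandas as pd
from sqlite3 import connect

conn = connect(":memory:")
data = pd.DataFrame({"Close": [336.5, 336.3, 330.1]})

# index=False skips writing the DataFrame index as a column;
# chunksize inserts the rows in batches of the given size.
data.to_sql("meta", conn, index=False, chunksize=2)

print(pd.read_sql("SELECT COUNT(*) AS n FROM meta", conn)["n"][0])  # 3
```
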
You can also query the table using the pandas.read_sql() method. You just pass a SQL query and the database connection as arguments. For example, to query the closing price of the Facebook stock, I pass a SQL query for the Close column along with the database connection.
Run the full code given below and see the output.
import pandas as pd
from sqlite3 import connect

conn = connect("stocks.db")
data = pd.read_csv("FB.csv")
data.to_sql("meta", conn)
df = pd.read_sql("SELECT Close FROM meta", conn)
print(df)
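When the query depends on a value, pandas.read_sql() also accepts a params argument, which is safer than building the SQL string by hand. A sketch under the same table layout as above (the threshold value is arbitrary; sqlite3 uses ? as its placeholder):

```python
import pandas as pd
from sqlite3 import connect

conn = connect(":memory:")
pd.DataFrame({"Close": [100.0, 200.0, 300.0]}).to_sql("meta", conn, index=False)

# The '?' placeholder is filled in from params, avoiding SQL injection issues.
df = pd.read_sql("SELECT Close FROM meta WHERE Close > ?", conn, params=(150.0,))
print(df)
```
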
These are the steps to implement the pandas to_sql() method. I have used a simple dataset for demonstration purposes, but you can apply the same steps to your own dataset. I hope you have liked this tutorial. If you have any queries, you can contact us for more help.