Insert a Pandas DataFrame into MongoDB: In 4 Steps Only

Pandas is one of the most widely used Python packages for data manipulation. It lets you read, write, convert, and manipulate CSV files and data frames easily. It also handles missing values, merging and joining of CSV files, time series analysis, etc. A common question is how to insert a pandas DataFrame into MongoDB, and this tutorial is entirely about that. You will learn to download data from a URL and store it in a MongoDB database in four steps.

Step 1: Import the necessary libraries

I am using pandas for data manipulation, Quandl for downloading Sensex index data (you can choose any dataset), and PyMongo for connecting to MongoDB and inserting the data.

import quandl
import pandas as pd
import pymongo
from pymongo import MongoClient

Step 2: Connect the MongoDB

Create a client with MongoClient, then select the database and the collection.

# Making a Connection with MongoClient
client = MongoClient("mongodb://localhost:27018/")
# database
db = client["stocks_database"]
# collection
company = db["Company"]

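Here I first connect to the MongoDB server running locally on port 27018 (MongoDB's default port is 27017, so change the number to match your setup) and then select a database named stocks_database. A collection in MongoDB plays the role of a table in a relational database, and here it is named Company. MongoDB creates the database and the collection lazily, when the first document is inserted.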

Step 3: Get the Stocks Data from Quandl

Quandl has a large collection of financial datasets, and I am getting the Sensex index data from there. Note that Quandl requires an API key for access, so you first have to sign up there and get your own key. I am leaving it blank here for security reasons; put your own key in its place. The start_date argument is the date from which you want the records.

data = quandl.get("BSE/SENSEX", authtoken="", start_date="2019-01-01")
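One way to avoid pasting the key into the script is to read it from an environment variable. This is only a sketch: the variable name QUANDL_API_KEY and the helper get_quandl_key are assumptions chosen for illustration, not something Quandl requires.

```python
import os


def get_quandl_key(env_var="QUANDL_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Quandl API key")
    return key


# Possible usage with the call from this step:
# data = quandl.get("BSE/SENSEX", authtoken=get_quandl_key(), start_date="2019-01-01")
```

With this in place the key stays out of the source file and out of version control.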

Step 4: Insert the Data into Database

Now that you have the data, you can insert it into the MongoDB database. Before that, you have to convert the data frame into a list of dictionaries, because MongoDB stores JSON-like documents. Also note that the Date column is set as the index of the DataFrame, so you have to reset the index before converting; otherwise the dates would be left out of the records. Use the following code.

data.reset_index(inplace=True)
data_dict = data.to_dict("records")
company.insert_one({"index":"Sensex","data":data_dict})

You have successfully inserted the data into the database.

[Figure: Sensex data inserted into MongoDB]
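To see what reset_index() and to_dict("records") actually produce, here is a small sketch on a made-up two-row frame (the numbers are invented and only stand in for the Quandl download). It also shows an alternative layout, one document per row, which may be worth considering for large tables because a single MongoDB document is limited to 16 MB.

```python
import pandas as pd

# Made-up values standing in for the Quandl data; only the shape matters here.
sample = pd.DataFrame(
    {"Open": [100.0, 101.5], "Close": [101.0, 102.5]},
    index=pd.DatetimeIndex(["2019-01-01", "2019-01-02"], name="Date"),
)

# reset_index() turns the Date index into an ordinary column,
# so it ends up inside every record.
records = sample.reset_index().to_dict("records")

# Alternative layout: one document per row, each tagged with the index name.
docs = [{"index": "Sensex", **row} for row in records]

# With the collection from Step 2, this would insert one document per row:
# company.insert_many(docs)
```

Each entry of records is a plain dictionary with the keys Date, Open, and Close, which is the shape MongoDB expects for a document or a nested array element.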

How to Load Data from MongoDB into a Pandas DataFrame?

In the above steps, you have successfully dumped the data into the database. Now you can also fetch the stored data from the database and load it directly into a DataFrame. Just use the code below.

data_from_db = company.find_one({"index":"Sensex"})
df = pd.DataFrame(data_from_db["data"])
df.set_index("Date",inplace=True)
print(df)

In the above code, I first fetch the document from MongoDB using find_one() and then convert its "data" field into a DataFrame using pandas. After that, I set "Date" as the index and print the result. You can see the output below.

[Figure: DataFrame of Sensex data loaded from the database]
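Note that find_one() returns None when no document matches the query, in which case the code above fails with a TypeError. The sketch below wraps the conversion in a small helper that checks for this; the name document_to_frame and the hand-written stored dictionary are illustrative assumptions standing in for a real query result.

```python
import pandas as pd


def document_to_frame(doc, index_col="Date"):
    """Convert a stored document into a DataFrame (hypothetical helper)."""
    if doc is None:
        # find_one() returns None when nothing matches the filter.
        raise LookupError("no matching document found")
    frame = pd.DataFrame(doc["data"])
    return frame.set_index(index_col)


# Hand-written stand-in for company.find_one({"index": "Sensex"}); values are made up.
stored = {
    "index": "Sensex",
    "data": [
        {"Date": "2019-01-01", "Close": 101.0},
        {"Date": "2019-01-02", "Close": 102.5},
    ],
}
df = document_to_frame(stored)
```

With a live database you would pass the result of company.find_one({"index": "Sensex"}) to the helper instead of the stand-in dictionary.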

I hope you have understood how to insert a pandas DataFrame into MongoDB, as well as how to read it back and convert it into a DataFrame. If you have any queries, contact us for more information. To know more about pandas, refer to the official pandas documentation.