Pandas is one of the most widely used Python packages for data manipulation. You can read, write, convert, and manipulate CSV files and data frames easily. It also handles missing values, merging and joining two CSV files, time series analysis, and more. One interesting question is how to insert a pandas DataFrame into MongoDB, and this tutorial is entirely about it. You will learn, step by step, how to fetch a dataset from an online source and store it in a MongoDB database.
Before moving to the demonstration part, make sure that you have installed Pandas in PyCharm. I am using PyCharm for coding throughout this tutorial, so I recommend it, although the code itself runs in any Python environment.
Step 1: Import the necessary libraries
I am using the Pandas library for data manipulation, Quandl for downloading Sensex index data (you can choose any data), and PyMongo for connecting to MongoDB and reading and writing documents.
import quandl
import pandas as pd
import pymongo
from pymongo import MongoClient
Step 2: Connect to MongoDB
Make a connection to the MongoDB server using MongoClient.
# Making a Connection with MongoClient
client = MongoClient("mongodb://localhost:27018/")
# database
db = client["stocks_database"]
# collection
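company = db["Company"]
Here I first connect to the MongoDB server running locally on port 27018 (MongoDB's default port is 27017, so adjust the URI to match your own setup) and then select a database named stocks_database. A collection in MongoDB is roughly the equivalent of a table in a relational database, and here I name it Company. MongoDB creates the database and the collection the first time data is inserted into them.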
Step 3: Get the Stocks Data from Quandl
Quandl has a large collection of financial datasets, and I am getting the Sensex index data from there. Please note that Quandl requires an API key for access, so you first have to sign up there and get your own API key. Here I am leaving the authtoken argument blank for security reasons; you have to put in your own key. The start_date argument is the date from which you want the records.
You can also use your own data source. I have chosen stock data for demonstration purposes only.
data = quandl.get("BSE/SENSEX", authtoken="", start_date="2019-01-01")
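Rather than pasting the key into the script, one option is to read it from an environment variable. The following is a minimal sketch, not part of the original steps: the variable name QUANDL_API_KEY and the helper names read_quandl_key and download_sensex are my own choices for illustration.

```python
import os


def read_quandl_key(env_var="QUANDL_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice; use whatever name you like.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Quandl API key")
    return key


def download_sensex(start_date="2019-01-01"):
    """Download the Sensex data with the same quandl.get call as in Step 3."""
    # Imported here so that read_quandl_key works even without the package.
    import quandl

    return quandl.get("BSE/SENSEX", authtoken=read_quandl_key(), start_date=start_date)
```

With the variable set in your shell or in the PyCharm run configuration, data = download_sensex() would replace the line above, and the key never appears in the source file.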
Step 4: Insert the Data into the Database
Now that you have the data, you can insert it into the MongoDB database. Before that, you have to convert the data frame into a list of dictionaries, because MongoDB stores data as JSON-like documents. The other thing you should note is that the Date column is set as the index of the DataFrame, so you have to reset the index before converting; otherwise the dates are not included in the records. Use the following code.
data.reset_index(inplace=True)
data_dict = data.to_dict("records")
company.insert_one({"index":"Sensex","data":data_dict})
You have successfully inserted the data into the database.
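The code above stores the whole DataFrame inside a single document. MongoDB limits a single document to 16 MB, so for larger frames a common alternative is to store one document per row. Here is a minimal sketch of that idea, under my own assumptions: the helper name frame_to_row_documents is hypothetical, and the two-row sample uses made-up numbers in place of the real Sensex data.

```python
import pandas as pd


def frame_to_row_documents(frame, index_name):
    """Turn each DataFrame row into its own MongoDB-ready dictionary."""
    # Move the Date index into a regular column, as in Step 4.
    flat = frame.reset_index()
    records = flat.to_dict("records")
    # Tag every row so it can be filtered later with {"index": index_name}.
    return [{"index": index_name, **row} for row in records]


# Made-up sample values standing in for the downloaded data.
sample = pd.DataFrame(
    {"Open": [100.0, 101.5], "Close": [101.5, 99.0]},
    index=pd.DatetimeIndex(["2019-01-01", "2019-01-02"], name="Date"),
)

docs = frame_to_row_documents(sample, "Sensex")
print(docs[0])
```

You could then store the rows with company.insert_many(docs) and read them back with company.find({"index": "Sensex"}). The trade-off is that each row becomes a separate document, so loading the data means collecting many documents instead of one.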
How to Load data from MongoDB to pandas dataframe?
In the above steps, you have successfully dumped the data into the database. Now you can also get that stored data back from the database and load it into a DataFrame. Just use the code below.
data_from_db = company.find_one({"index":"Sensex"})
df = pd.DataFrame(data_from_db["data"])
df.set_index("Date",inplace=True)
print(df)
In the above code, I first fetch the stored document from MongoDB using find_one() and then convert its "data" field into a DataFrame using pandas. After that, I set "Date" as the index and print the DataFrame to the screen.
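One detail worth guarding against is that find_one() returns None when no document matches the filter. The sketch below wraps the loading steps in a small helper that checks for this. The name document_to_frame is hypothetical, and the stored dictionary is a hand-written stand-in (with made-up numbers) for what the query would return, so that the example runs without a database.

```python
from datetime import datetime

import pandas as pd

# Hand-written stand-in for the result of company.find_one({"index": "Sensex"}).
stored = {
    "index": "Sensex",
    "data": [
        {"Date": datetime(2019, 1, 1), "Open": 100.0, "Close": 101.5},
        {"Date": datetime(2019, 1, 2), "Open": 101.5, "Close": 99.0},
    ],
}


def document_to_frame(document):
    """Rebuild the DataFrame from a stored document, indexed by Date."""
    if document is None:
        # find_one() gives None when nothing matches the filter.
        raise LookupError("no matching document found")
    frame = pd.DataFrame(document["data"])
    # Make sure the Date column is a proper datetime column before indexing.
    frame["Date"] = pd.to_datetime(frame["Date"])
    return frame.set_index("Date").sort_index()


df = document_to_frame(stored)
print(df)
```

In the real script you would pass the result of company.find_one({"index": "Sensex"}) to the helper instead of the stand-in dictionary.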
Export data to CSV from MongoDB using python
You can also export the data you have read from MongoDB. To do so, use the DataFrame.to_csv() method. Starting from the same code as above, you just have to add one line, which exports the data to CSV after reading it from MongoDB.
data_from_db = company.find_one({"index":"Sensex"})
df = pd.DataFrame(data_from_db["data"])
df.set_index("Date",inplace=True)
df.to_csv("Sensex.csv")
Here "Sensex.csv" is the filename for the CSV file you want to export. Please note that the file is written to the current working directory unless you specify a path.
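To write the file somewhere other than the working directory, you can pass a full path instead of a bare filename. The following sketch writes the CSV into a temporary folder and reads it back to check the result. The folder, the variable names, and the sample numbers are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up sample values standing in for the data loaded from MongoDB.
df = pd.DataFrame(
    {"Open": [100.0, 101.5], "Close": [101.5, 99.0]},
    index=pd.DatetimeIndex(["2019-01-01", "2019-01-02"], name="Date"),
)

# A temporary folder keeps the sketch self-contained; use your own folder here.
out_dir = Path(tempfile.mkdtemp())
csv_path = out_dir / "Sensex.csv"
df.to_csv(csv_path)

# Read the file back, restoring Date as a datetime index.
restored = pd.read_csv(csv_path, index_col="Date", parse_dates=True)
print(restored)
```

Reading the file back with index_col="Date" and parse_dates=True is a quick way to confirm that the dates survived the export.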
Export data to JSON from MongoDB
You can also fetch data from the MongoDB database and then convert it to JSON using the DataFrame.to_json() method. Just call to_json() on the DataFrame you built from the fetched data, and it will return the data as a JSON string. You can then send that JSON through your API endpoints.
I hope you have understood how to insert a pandas DataFrame into MongoDB, as well as how to read it back and convert it into a DataFrame. If you have any queries, contact us for more information. To know more about pandas, refer to the official pandas documentation.
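As a rough illustration of the JSON export, the sketch below converts a small DataFrame into a JSON string with one object per row. The sample numbers are made up, and the choice of orient="records" with date_format="iso" is my own assumption about a convenient layout, not something the steps above require.

```python
import json

import pandas as pd

# Made-up sample values standing in for the data loaded from MongoDB.
df = pd.DataFrame(
    {"Open": [100.0, 101.5], "Close": [101.5, 99.0]},
    index=pd.DatetimeIndex(["2019-01-01", "2019-01-02"], name="Date"),
)

# reset_index() keeps Date as a field in every row;
# date_format="iso" writes the dates as readable ISO 8601 strings.
json_text = df.reset_index().to_json(orient="records", date_format="iso")

# Parse the string again to inspect the structure.
rows = json.loads(json_text)
print(rows[0])
```

Each element of rows is a plain dictionary with Date, Open, and Close keys, which is usually the shape an API response needs.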