
Pandas Profiling: How to Generate a Report of a Dataframe

Pandas is a great Python library for extracting and manipulating datasets. For a quick analysis of a dataset, you can call the dataframe's describe() method, but it returns only basic summary statistics. For a more detailed analysis of a pandas dataframe, the pandas-profiling package provides ProfileReport(). In this tutorial, you will learn how to use pandas profiling step by step.
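To make the comparison concrete, here is a minimal sketch of what describe() gives you on its own. The small frame mirrors the sample values used in Step 2, and the variable names numeric_summary and full_summary are just illustrative.

```python
import pandas as pd

# Small frame with the same values as the sample data in Step 2.
df = pd.DataFrame({
    "gender": ["male", "male", "male", "female"],
    "age": [20, 27, 35, 16],
})

# By default, describe() summarises numeric columns only
# (count, mean, std, min, quartiles, max).
numeric_summary = df.describe()
print(numeric_summary)

# include="all" also adds count/unique/top/freq for non-numeric columns.
full_summary = df.describe(include="all")
print(full_summary)
```

This is handy for a first look, but it says nothing about correlations, duplicates, or distributions, which is where a profile report helps.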

Steps to implement Pandas Profiling

This section walks through all the steps needed to run a detailed analysis of a pandas dataframe.

Step 1: Install the pandas-profiling module

If you have not installed pandas-profiling on your system, then you can install it using the pip command. Run the following command.

pip install pandas-profiling

If you have already installed the pandas-profiling package, move on to the second step.
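If you are not sure whether the package is already installed, a quick check like the one below can tell you. It is only a sketch: find_profiling_module is a hypothetical helper name, and the check also looks for ydata_profiling, since newer releases of the project are published as ydata-profiling and imported under that name.

```python
import importlib.util

# Import names to look for. "pandas_profiling" is the name used in this
# tutorial; "ydata_profiling" is the name used by newer releases.
CANDIDATES = ("ydata_profiling", "pandas_profiling")


def find_profiling_module():
    """Return the first importable profiling module name, or None."""
    for name in CANDIDATES:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


found = find_profiling_module()
if found is None:
    print("No profiling package found; run: pip install pandas-profiling")
else:
    print(f"Profiling package available as: {found}")
```

If the check reports ydata_profiling, the rest of the tutorial should work the same way with that import name instead.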

Step 2: Create a Sample dataset

The second step is to create a dummy dataset to demonstrate the detailed analysis on. You can use your own dataset instead, but for simplicity I will create a small one. Execute the following lines of code to create it.

import pandas as pd

# Build a small sample dataframe from a dict of columns
data = {"name": ["Sahi", "Abhishek", "Rahul", "Mani"], "gender": ["male", "male", "male", "female"], "age": [20, 27, 35, 16]}
df = pd.DataFrame(data=data)
print(df)
Figure: Sample dataframe for pandas profiling

Step 3: Use pandas profiling on the dataframe

Now you can create a profile report for the dataframe. Just pass the dataframe to ProfileReport(), and it will generate a report on it.

Use the following line of code to create it.

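from pandas_profiling import ProfileReport

profile = ProfileReport(df, title="Pandas Profiling Report")
profile  # as the last expression in a Jupyter notebook cell, this renders the report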

It generates a report on your input dataframe, with details such as an overview, missing values, and per-variable information.

Output

Figure: Simple report generated on the dataframe
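To get a feel for the kind of figures the overview and variables sections summarise, you can compute a few of them by hand with plain pandas. This is a rough sketch only, not how the library builds its report, and the names overview and variables are illustrative.

```python
import pandas as pd

# Same sample data as in Step 2.
df = pd.DataFrame({
    "name": ["Sahi", "Abhishek", "Rahul", "Mani"],
    "gender": ["male", "male", "male", "female"],
    "age": [20, 27, 35, 16],
})

# Dataset-level figures similar to what a report overview shows.
overview = {
    "rows": len(df),
    "columns": df.shape[1],
    "missing_cells": int(df.isna().sum().sum()),
    "duplicate_rows": int(df.duplicated().sum()),
}
print(overview)

# Per-variable details: dtype, number of distinct values, missing values.
variables = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "distinct": df.nunique(),
    "missing": df.isna().sum(),
})
print(variables)
```

The profile report gathers this information, and a lot more, in one place without any of this manual work.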

So far, the report exists only in memory. You can save it to an HTML file with profile.to_file("your_html_file_name.html"). Let’s save it as people.html by adding the following line of code.

profile.to_file("people.html")

It exports the detailed analysis of your dataframe to a file named “people.html”.

When you open the exported HTML file, you will see output like the following.

Figure: HTML report of the dataframe generated with pandas profiling
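If you export reports often, the steps above can be wrapped in a small helper. The sketch below makes a few assumptions: save_report is a hypothetical name, and the import falls back from ydata_profiling (the name used by newer releases) to pandas_profiling, and to None when neither is installed, so the script does not crash on a machine without the package.

```python
import pandas as pd

try:
    from ydata_profiling import ProfileReport  # name used by newer releases
except ImportError:
    try:
        from pandas_profiling import ProfileReport  # name used in this tutorial
    except ImportError:
        ProfileReport = None  # neither package is installed


def save_report(df, path, title="Pandas Profiling Report"):
    """Write an HTML profile of df to path; return True if a report was written."""
    if ProfileReport is None:
        return False
    ProfileReport(df, title=title).to_file(path)
    return True


df = pd.DataFrame({
    "name": ["Sahi", "Abhishek", "Rahul", "Mani"],
    "gender": ["male", "male", "male", "female"],
    "age": [20, 27, 35, 16],
})
```

Calling save_report(df, "people.html") then produces the same file as the single to_file() line, and returns False instead of raising if the profiling package is missing.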

Conclusion

Profiling is one of the easiest ways to get detailed information about your dataset. You can also use the dataframe's describe() method for a quick summary, but ProfileReport() gives a far more detailed report. These are the steps to use pandas profiling in Python. I hope you liked this tutorial. If you have any queries, you can contact us for more help.