Pandas is a Python package that lets you create a dataframe from a dataset. Once you have created the dataframe, you can manipulate the data easily, including changing its shape or dimensions. Flattening the dataframe is one such operation. In this tutorial, you will learn how to flatten a pandas dataframe using various methods.
What does flattening a pandas dataframe mean
Flattening a pandas dataframe means changing its shape to one dimension. In other words, all the values in the dataset are placed into a single list or array. Suppose you have a pandas dataframe with two columns and four rows (records). After flattening, the values from all the columns and rows end up in a single one-dimensional sequence of eight values.
For example, the code below creates and prints the input dataframe used throughout this tutorial.
import pandas as pd
data = {"col1":["a","b","c","d"],"col2":[10,20,30,40]}
df = pd.DataFrame(data)
print(df)
Output
  col1  col2
0    a    10
1    b    20
2    c    30
3    d    40
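As a rough illustration of what flattening produces for this dataframe, the sketch below builds the flat list in plain Python by walking the rows one at a time. The variable name flat is only for illustration, and the row-by-row order is an assumption chosen to match the default behaviour of the methods covered in this tutorial.

```python
import pandas as pd

data = {"col1": ["a", "b", "c", "d"], "col2": [10, 20, 30, 40]}
df = pd.DataFrame(data)

# Walk the dataframe row by row and collect every cell into one list.
flat = [value for row in df.itertuples(index=False) for value in row]
print(flat)
```

The resulting list holds eight values: each row contributes its col1 value followed by its col2 value.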
Methods to flatten a pandas dataframe
In this section, you will learn various ways to flatten a dataframe. Before running the code, make sure NumPy and pandas are installed on your system.
I will use the same input dataframe described in the previous section.
Solution 1: Flatten pandas dataframe using NumPy
The first method to flatten the pandas dataframe uses the NumPy Python package. NumPy arrays have a flatten() method (numpy.ndarray.flatten()) that performs this task.
First, convert the dataframe to a NumPy array using the to_numpy() method, and then call flatten() on the result.
Run the lines of code below to flatten the dataframe.
import pandas as pd
data = {"col1":["a","b","c","d"],"col2":[10,20,30,40]}
df = pd.DataFrame(data)
print("Original Dataframe\n")
print(df,"\n")
print("Flatten Dataframe\n")
flatten_df = df.to_numpy().flatten()
print(flatten_df)
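Output
Original Dataframe

  col1  col2
0    a    10
1    b    20
2    c    30
3    d    40

Flatten Dataframe

['a' 10 'b' 20 'c' 30 'd' 40]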
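Two details of this approach are worth seeing in code. The sketch below uses the same dataframe and shows that the intermediate array has an object dtype, because the columns mix strings and integers. It also shows that flatten() accepts an order argument that controls whether values are read row by row or column by column. The variable names arr, row_wise and column_wise are only for illustration.

```python
import pandas as pd

data = {"col1": ["a", "b", "c", "d"], "col2": [10, 20, 30, 40]}
df = pd.DataFrame(data)

arr = df.to_numpy()
print(arr.shape)  # 4 rows, 2 columns
print(arr.dtype)  # object, because strings and integers are mixed

# Default order="C" reads the array row by row.
row_wise = arr.flatten()
# order="F" reads the array column by column instead.
column_wise = arr.flatten(order="F")

print(row_wise)
print(column_wise)
```

Use the column-wise order if you want all the col1 values first, followed by all the col2 values.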
Solution 2: Using the Pandas Dataframe stack method
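You can also use the pandas stack() method to flatten the dataframe. For a dataframe with a single level of columns, such as this one, it returns a stacked Series, and the values attribute of that Series gives the flattened array.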
Run the following lines of code to flatten the dataframe.
import pandas as pd
data = {"col1":["a","b","c","d"],"col2":[10,20,30,40]}
df = pd.DataFrame(data)
print("Original Dataframe\n")
print(df,"\n")
print("Flatten Dataframe\n")
flatten_df = df.stack().values
print(flatten_df)
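Output
Original Dataframe

  col1  col2
0    a    10
1    b    20
2    c    30
3    d    40

Flatten Dataframe

['a' 10 'b' 20 'c' 30 'd' 40]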
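To see what stack() keeps that the NumPy approach discards, the sketch below prints each stacked value next to its index label. For this dataframe, each label is a (row label, column name) pair. The names stacked, labels and values are only for illustration.

```python
import pandas as pd

data = {"col1": ["a", "b", "c", "d"], "col2": [10, 20, 30, 40]}
df = pd.DataFrame(data)

stacked = df.stack()

# Each entry of the stacked Series is labelled with (row label, column name).
labels = stacked.index.tolist()
values = stacked.tolist()

for label, value in zip(labels, values):
    print(label, value)
```

This can be handy when you need to trace a flattened value back to its original cell. Depending on the pandas version, stack() may drop missing values by default, so the result can be shorter than rows times columns when the dataframe contains NaN.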
Solution 3: Using the NumPy reshape method
The third method to flatten the pandas dataframe is the NumPy reshape() method. You pass -1 as the argument to reshape(), which tells NumPy to infer the length and return a one-dimensional array containing all the values. This flattens the pandas dataframe.
import pandas as pd
data = {"col1":["a","b","c","d"],"col2":[10,20,30,40]}
df = pd.DataFrame(data)
print("Original Dataframe\n")
print(df,"\n")
print("Flatten Dataframe\n")
flatten_df = df.values.reshape(-1)
print(flatten_df)
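Output
Original Dataframe

  col1  col2
0    a    10
1    b    20
2    c    30
3    d    40

Flatten Dataframe

['a' 10 'b' 20 'c' 30 'd' 40]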
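As a quick sanity check, the sketch below runs all three approaches on the same dataframe and compares the results as plain Python lists. The variable names are only for illustration, and the comparison assumes a dataframe without missing values.

```python
import pandas as pd

data = {"col1": ["a", "b", "c", "d"], "col2": [10, 20, 30, 40]}
df = pd.DataFrame(data)

via_flatten = df.to_numpy().flatten().tolist()
via_stack = df.stack().tolist()
via_reshape = df.to_numpy().reshape(-1).tolist()

# All three lists hold the same values in the same row-by-row order.
print(via_flatten == via_stack == via_reshape)
```

One practical difference: flatten() always returns a copy of the data, while reshape(-1) may return a view of the original array when NumPy can avoid copying.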
Conclusion
Flattening a pandas dataframe is useful when you want to do some text processing on a dataset. Using any of the above methods, you can easily flatten the dataframe and, for example, build n-grams from the strings.
I hope you liked this tutorial on how to flatten a dataframe. If you have any questions, you can contact us for more help.
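A minimal sketch of the n-gram idea follows. It assumes a small, hypothetical dataframe of words (not part of the examples above) and builds bigrams, that is, pairs of neighbouring words, from the flattened values.

```python
import pandas as pd

# Hypothetical text data, used only for this illustration.
df = pd.DataFrame({"first": ["the", "brown"], "second": ["quick", "fox"]})

# Flatten row by row to recover the words in reading order.
tokens = df.to_numpy().flatten().tolist()

# Pair each word with the word that follows it to form bigrams.
bigrams = list(zip(tokens, tokens[1:]))
print(bigrams)
```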