When you receive a dataset, it may contain NaN values. Pandas dropna() is a useful method that lets you drop the NaN values from a dataframe. In this article, I will show you various examples of dealing with NaN values using the pandas dropna() method.
What are the Causes of Missing Data?
If your dataset contains missing data, the following are common causes.
Technical Glitches
Sometimes systems are unable to properly collect data points, which leads to missing entries.
Human Error
If the data points are recorded manually, values might be skipped accidentally or intentionally. This is human error.
Integration of Data
Suppose you are integrating or merging datasets. Not all datasets might have the same fields or entries, so the combined data can have gaps.
If you want to remove the rows with NaN values, below is the syntax for it.
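As a small illustration of how integration creates missing values, here is a minimal sketch. The two dataframes, prices and volumes, are hypothetical and exist only for this example. They share a Date column but do not cover the same dates.

```python
import pandas as pd

# Two hypothetical sources that only partly overlap on Date.
prices = pd.DataFrame({"Date": ["12/11/2020", "13/11/2020"], "Close": [5, 6]})
volumes = pd.DataFrame({"Date": ["13/11/2020", "14/11/2020"], "Volume": [200, 300]})

# An outer merge keeps every date from both sources. Where a source has
# no entry for a date, pandas fills the field with NaN.
merged = pd.merge(prices, volumes, on="Date", how="outer")

# Count the missing values that the merge introduced in each column.
missing_per_column = merged.isna().sum()

print(merged)
print(missing_per_column)
```

Only the date present in both sources ends up with a complete row. The other two rows each have one NaN.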
The syntax for the Pandas Dropna() method
your_dataframe.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Parameters explanation
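axis : Accepts 0 (or "index") to drop rows, or 1 (or "columns") to drop columns. Default is 0.
how : "any" or "all". With "any", a row or column is removed if it contains at least one NaN value. With "all", it is removed only if all of its values are NaN.
thresh : Require that many non-NA values for a row or column to be kept. It cannot be combined with how.
subset : Labels along the other axis to consider. For example, if you are dropping rows, these are the columns to check.
inplace : Default is False. If it is set to True, the operation is done in place on the dataframe.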
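To make the difference between how and thresh concrete, here is a minimal sketch. The dataframe df_demo and its columns A, B, and C are made up for this illustration and are not part of the examples that follow.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe: row 0 is complete, row 1 has two values,
# row 2 has one value, and row 3 is entirely NaN.
df_demo = pd.DataFrame({
    "A": [1, np.nan, np.nan, np.nan],
    "B": [2, 3, np.nan, np.nan],
    "C": [4, 5, 6, np.nan],
})

# how="any": drop a row if it has at least one NaN.
any_dropped = df_demo.dropna(how="any")

# how="all": drop a row only if every value in it is NaN.
all_dropped = df_demo.dropna(how="all")

# thresh=2: keep only rows with at least two non-NaN values.
at_least_two = df_demo.dropna(thresh=2)

print(list(any_dropped.index))
print(list(all_dropped.index))
print(list(at_least_two.index))
```

Each call passes only one of how or thresh, since the two are not meant to be used together.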
Steps to Remove NaN from Dataframe using pandas dropna
Step 1: Import all the necessary libraries
In our examples, we are using NumPy for placing NaN values and pandas for creating the dataframe. Let’s import them.
import numpy as np
import pandas as pd
Step 2: Create a Pandas Dataframe
In this step, I will first create a pandas dataframe with NaN values. NumPy provides a constant for NaN values: numpy.nan. Execute the lines of code given below to create a pandas dataframe.
data = {"Date":["12/11/2020","13/11/2020","14/11/2020","15/11/2020","16/11/2020","17/11/2020"],
"Open":[1,2,np.nan,4,5,7],"Close":[5,6,7,8,9,np.nan],"Volume":[np.nan,200,300,400,500,600]}
df = pd.DataFrame(data=data)
Output
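To inspect the dataframe yourself, the following sketch rebuilds it and counts the NaN values in each column. Using isna().sum() for the count is my addition here and not part of the original steps.

```python
import numpy as np
import pandas as pd

data = {"Date": ["12/11/2020", "13/11/2020", "14/11/2020", "15/11/2020", "16/11/2020", "17/11/2020"],
        "Open": [1, 2, np.nan, 4, 5, 7],
        "Close": [5, 6, 7, 8, 9, np.nan],
        "Volume": [np.nan, 200, 300, 400, 500, 600]}
df = pd.DataFrame(data=data)

# Show the full dataframe, including the NaN entries.
print(df)

# isna() marks each missing cell as True; sum() counts them per column.
nan_counts = df.isna().sum()
print(nan_counts)
```

Open, Close, and Volume each contain exactly one NaN, and Date contains none.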
Step 3: Remove the NaN values using dropna() method
Now the last step is to remove NaN values from the dataframe. It can be done in many ways. I will show you examples that explain more about dropna().
Example 1: Using Simple dropna() method.
If you want to remove all the rows that have at least one NaN value, simply call the dropna() method on your dataframe.
Run the code given below.
df.dropna()
Output
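As a self-contained check, this sketch rebuilds the Step 2 dataframe and shows which rows survive. The variable name rows_kept is my own.

```python
import numpy as np
import pandas as pd

data = {"Date": ["12/11/2020", "13/11/2020", "14/11/2020", "15/11/2020", "16/11/2020", "17/11/2020"],
        "Open": [1, 2, np.nan, 4, 5, 7],
        "Close": [5, 6, 7, 8, 9, np.nan],
        "Volume": [np.nan, 200, 300, 400, 500, 600]}
df = pd.DataFrame(data=data)

# Rows 0, 2, and 5 each contain one NaN, so they are dropped.
rows_kept = df.dropna()

print(rows_kept)
```

The remaining rows keep their original index labels (1, 3, and 4). Chain reset_index(drop=True) if you want a fresh 0-based index.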
Example 2: Removing columns with at least one NaN value.
You can remove the columns that have at least one NaN value. To do so, pass axis=1 or axis="columns". In our dataframe, all the columns except Date (that is, Open, Close, and Volume) will be removed, because each of them has at least one NaN value.
df.dropna(axis=1)
Output
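The sketch below rebuilds the Step 2 dataframe and confirms which columns remain. The variable name columns_kept is only for illustration.

```python
import numpy as np
import pandas as pd

data = {"Date": ["12/11/2020", "13/11/2020", "14/11/2020", "15/11/2020", "16/11/2020", "17/11/2020"],
        "Open": [1, 2, np.nan, 4, 5, 7],
        "Close": [5, 6, 7, 8, 9, np.nan],
        "Volume": [np.nan, 200, 300, 400, 500, 600]}
df = pd.DataFrame(data=data)

# axis=1 checks each column and drops those that contain any NaN.
columns_kept = df.dropna(axis=1)

# axis="columns" is the named equivalent of axis=1.
same_result = df.dropna(axis="columns")

print(list(columns_kept.columns))
```

Dropping columns keeps all six rows but can discard a lot of data for the sake of a single missing value, so it is worth checking what remains before continuing.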
Example 3: Remove rows with all values NaN.
Sometimes all the values of a row are NaN, and you want to remove only those rows. You can do this with the how parameter. To explain this example I am modifying the original dataframe above. Copy the code given below to create the modified dataframe.
data = {"Date":["12/11/2020","13/11/2020","14/11/2020","15/11/2020","16/11/2020","17/11/2020"],
"Open":[1,2,np.nan,4,5,7],"Close":[5,6,np.nan,8,9,10],"Volume":[np.nan,200,np.nan,400,500,600]}
df = pd.DataFrame(data=data)
Output
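Now apply dropna() with how="all". Note that how="all" removes a row only when every column in it is NaN. In this dataframe the Date column is always filled, so no row qualifies and df.dropna(how="all") returns the dataframe unchanged. To drop the row whose Open, Close, and Volume values are all NaN, restrict the check to those columns with the subset parameter.
df.dropna(how="all", subset=["Open","Close","Volume"])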
Output
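One thing worth verifying when you run this example is how the Date column affects how="all". The sketch below rebuilds the modified dataframe and compares checking all columns against checking only the numeric ones. The variable names are my own.

```python
import numpy as np
import pandas as pd

data = {"Date": ["12/11/2020", "13/11/2020", "14/11/2020", "15/11/2020", "16/11/2020", "17/11/2020"],
        "Open": [1, 2, np.nan, 4, 5, 7],
        "Close": [5, 6, np.nan, 8, 9, 10],
        "Volume": [np.nan, 200, np.nan, 400, 500, 600]}
df = pd.DataFrame(data=data)

# Checking every column: Date is never NaN, so no row is entirely NaN.
all_columns_checked = df.dropna(how="all")

# Checking only the numeric columns: row 2 is NaN in all three.
numeric_columns_checked = df.dropna(how="all", subset=["Open", "Close", "Volume"])

print(len(all_columns_checked))
print(list(numeric_columns_checked.index))
```

Row 0 is kept in both cases, because only its Volume value is missing.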
Example 4: Remove NaN values on selected columns
Suppose I want to remove the rows that have NaN values in one or more specific columns. To do this, pass the list of columns to the subset parameter. It removes rows that have NaN values in the corresponding columns. I will use the same dataframe that was created in Step 2, so re-run the Step 2 code first if you overwrote df in Example 3.
Run the code below
df.dropna(subset=["Open","Volume"])
Output
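Here is a self-contained sketch of the same call on the Step 2 dataframe. The variable name subset_cleaned is hypothetical.

```python
import numpy as np
import pandas as pd

data = {"Date": ["12/11/2020", "13/11/2020", "14/11/2020", "15/11/2020", "16/11/2020", "17/11/2020"],
        "Open": [1, 2, np.nan, 4, 5, 7],
        "Close": [5, 6, 7, 8, 9, np.nan],
        "Volume": [np.nan, 200, 300, 400, 500, 600]}
df = pd.DataFrame(data=data)

# Only Open and Volume are checked. Row 0 (NaN in Volume) and
# row 2 (NaN in Open) are dropped.
subset_cleaned = df.dropna(subset=["Open", "Volume"])

# Close is not in the subset, so its NaN in row 5 is left alone.
remaining_close_nans = subset_cleaned["Close"].isna().sum()

print(subset_cleaned)
print(remaining_close_nans)
```

The result can still contain NaN values in columns that were not listed in subset, as the Close column shows here.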
By default, dropna() returns a new dataframe and leaves the original unchanged. To modify the original dataframe itself, pass inplace=True inside the dropna() method, or assign the result back to the variable.
df.dropna(inplace=True)
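The sketch below contrasts the default behaviour with inplace=True on the Step 2 dataframe. The names length_before, new_df, and length_after_default are my own. Assigning the result back (df = df.dropna()) is a common alternative that achieves the same end state.

```python
import numpy as np
import pandas as pd

data = {"Date": ["12/11/2020", "13/11/2020", "14/11/2020", "15/11/2020", "16/11/2020", "17/11/2020"],
        "Open": [1, 2, np.nan, 4, 5, 7],
        "Close": [5, 6, 7, 8, 9, np.nan],
        "Volume": [np.nan, 200, 300, 400, 500, 600]}
df = pd.DataFrame(data=data)
length_before = len(df)

# Default call: a new dataframe is returned and df is left untouched.
new_df = df.dropna()
length_after_default = len(df)

# inplace=True: df itself is modified.
df.dropna(inplace=True)

print(length_before, length_after_default, len(df))
```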
Conclusion
That’s all for now. These are the best examples I have coded for you. I hope you have understood how to remove NaN values from your dataset. If you have any queries, you can contact us.
Source:
Pandas Dropna Official Documentation