Replace NaN with Empty String in Pandas: Various Methods


Pandas lets you create and manipulate dataframes. You can convert almost any tabular dataset to a pandas dataframe using the pd.DataFrame() constructor. Datasets often contain NaN ("Not a Number") values, which pandas uses to mark missing data. So how can you remove or replace them? In this tutorial, you will learn several methods to replace NaN with an empty (blank) string.

Steps to Replace NaN with Empty String

Let’s go through all the steps required to replace NaN with an empty string in pandas.

Step 1: Import the necessary libraries

The first step is to import the libraries used in the demonstration. This tutorial uses the pandas and numpy packages. Let’s import them using the import statement.

import pandas as pd
import numpy as np

Step 2: Create a sample dataframe

The second step is to create a sample dataframe that contains NaN values in some rows and columns. You can create a NaN value using np.nan from NumPy.

Execute the lines of code below to create the sample dataframe.

import pandas as pd
import numpy as np
data = {"name":["Rob",np.nan,"Rob",np.nan],"age":[23,np.nan,23,25],
"country":["USA","UK","USA",np.nan]}
df = pd.DataFrame(data)
df

Output

Sample dataframe to replace NaN with empty string
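Before replacing anything, it can help to see where the NaN values are. The following is a minimal sketch that rebuilds the sample dataframe and counts the missing values per column with isna().sum(); the variable name nan_counts is only illustrative.

```python
import numpy as np
import pandas as pd

data = {"name": ["Rob", np.nan, "Rob", np.nan],
        "age": [23, np.nan, 23, 25],
        "country": ["USA", "UK", "USA", np.nan]}
df = pd.DataFrame(data)

# isna() marks every missing cell as True; sum() counts them per column
nan_counts = df.isna().sum()
print(nan_counts)
```

With this sample data, name has two missing values while age and country have one each.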

Step 3: Use the methods to Replace NaN with Empty String in Pandas

In this section, you will learn the methods you can use to replace NaN with an empty string.

Method 1: Using the fillna() function

The first method is the use of the fillna() function. It will replace all the NaN values present in the entire dataframe with an empty string at once.

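fillna() takes the replacement value as its main argument, here an empty string. By default it returns a new dataframe, so assign the result back to df. It also accepts inplace=True to modify the dataframe directly, but filling a numeric column such as age with a string in place may raise dtype warnings in recent pandas versions.

Run the below lines of code to replace NaN.

import pandas as pd
import numpy as np
data = {"name":["Rob",np.nan,"Rob",np.nan],"age":[23,np.nan,23,25],
"country":["USA","UK","USA",np.nan]}
df = pd.DataFrame(data)
df = df.fillna(value='')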
df

Output

replacing NaN using the fillna function
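Filling a numeric column such as age with an empty string mixes numbers and text, so the column can no longer keep a numeric dtype. If you only want blanks in the text columns, one option is to pass fillna() a dictionary that maps column names to fill values. This is a sketch under that assumption; the name filled is only illustrative.

```python
import numpy as np
import pandas as pd

data = {"name": ["Rob", np.nan, "Rob", np.nan],
        "age": [23, np.nan, 23, 25],
        "country": ["USA", "UK", "USA", np.nan]}
df = pd.DataFrame(data)

# Fill only the text columns; columns not listed in the dict are left untouched
filled = df.fillna({"name": "", "country": ""})
print(filled)
print(filled.dtypes)
```

Here age keeps its NaN and stays numeric, so you can still compute statistics on it or fill it later with a numeric value.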

Method 2: Using the replace() function

You can also replace NaN with an empty string using the replace() function. Here you pass two arguments: np.nan as the value to find and the empty string as its replacement. The regex=True argument is not needed, because it only affects how string patterns are matched.

Execute the below lines of code to replace NaN.

import pandas as pd
import numpy as np
data = {"name":["Rob",np.nan,"Rob",np.nan],"age":[23,np.nan,23,25],
"country":["USA","UK","USA",np.nan]}
df = pd.DataFrame(data)
df = df.replace(np.nan, '')
df

Output

replacing NaN using the replace function
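As a quick check, the sketch below calls replace() with only the value to find and its replacement, then confirms that no NaN is left in the result. The name replaced is only illustrative.

```python
import numpy as np
import pandas as pd

data = {"name": ["Rob", np.nan, "Rob", np.nan],
        "age": [23, np.nan, 23, 25],
        "country": ["USA", "UK", "USA", np.nan]}
df = pd.DataFrame(data)

# np.nan is matched directly as a value; no regex flag is involved
replaced = df.replace(np.nan, "")

# Total number of missing cells remaining in the whole dataframe
remaining = replaced.isna().sum().sum()
print(replaced)
print(remaining)
```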

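Method 3: Using applymap() with lambda function

In this method, you pass a lambda function to applymap(), which applies it to every cell of the dataframe. The lambda checks each value with pd.isna() and returns an empty string if the value is NaN, or the original value otherwise. Note that applymap() is deprecated since pandas 2.1 in favor of DataFrame.map(), which accepts the same function.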

Use the below lines of code to replace NaN.

import pandas as pd
import numpy as np
data = {"name":["Rob",np.nan,"Rob",np.nan],"age":[23,np.nan,23,25],
"country":["USA","UK","USA",np.nan]}
df = pd.DataFrame(data)
df = df.applymap(lambda x: '' if pd.isna(x) else x)
df

Output

replacing NaN using the applymap function
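Because DataFrame.applymap() is deprecated since pandas 2.1 in favor of DataFrame.map(), the following sketch picks whichever element-wise method the installed version provides. The names elementwise and cleaned are only illustrative, and the hasattr() check is just one possible way to support both old and new versions.

```python
import numpy as np
import pandas as pd

data = {"name": ["Rob", np.nan, "Rob", np.nan],
        "age": [23, np.nan, 23, 25],
        "country": ["USA", "UK", "USA", np.nan]}
df = pd.DataFrame(data)

# Use DataFrame.map when it exists (pandas 2.1+), otherwise fall back to applymap
elementwise = df.map if hasattr(df, "map") else df.applymap

# Return an empty string for missing cells and the original value otherwise
cleaned = elementwise(lambda x: "" if pd.isna(x) else x)
print(cleaned)
```

Element-wise functions run once per cell, so on large dataframes fillna() or replace() is usually the faster choice.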

Method 4: Replace single column

Sometimes NaN is present in only one particular column. In that case it is unnecessary to run the replacement over the entire dataframe. Using the same replace() function, you can replace NaN with an empty string in a single column.

Let’s replace NaN in the name column with an empty string. Run the below lines of code to achieve that.

import pandas as pd
import numpy as np
data = {"name":["Rob","Bam",np.nan,"Rahul"],"age":[23,25,23,32],
"country":["USA","UK","USA","Germany"]}
df = pd.DataFrame(data)
df['name'] = df['name'].replace(np.nan, '')
df

Output

replacing NaN on single column using the replace function

In the same way, you can use the fillna() function on the name column.
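A minimal sketch of the fillna() variant is shown here. It assigns the result back to the column instead of using inplace=True on a selected column, since in-place changes on a selected column may not reach the original dataframe in recent pandas versions.

```python
import numpy as np
import pandas as pd

data = {"name": ["Rob", "Bam", np.nan, "Rahul"],
        "age": [23, 25, 23, 32],
        "country": ["USA", "UK", "USA", "Germany"]}
df = pd.DataFrame(data)

# Fill NaN only in the name column and write the result back to the dataframe
df["name"] = df["name"].fillna("")
print(df)
```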

Conclusion

It’s better to remove or replace NaN values in a dataframe, especially right after you convert a dataset to a dataframe. If you don’t, machine learning or deep learning models trained on the data can be less accurate, and normalization of the dataset can go wrong. The above methods replace NaN with an empty string. You can then put any value you need in place of these empty strings.

I hope you have liked this tutorial. If you have any questions you can contact us for more help.

Meet Sukesh ( Chief Editor ), a passionate and skilled Python programmer with a deep fascination for data science, NumPy, and Pandas. His journey in the world of coding began as a curious explorer and has evolved into a seasoned data enthusiast.