Remove Characters from Dataframe in Python : Only 3 Steps

A pandas dataframe lets you load a dataset and manipulate it. Suppose some columns of the dataframe contain characters that you want to remove. How can you do that? In this tutorial, you will learn how to remove characters from a dataframe in Python, step by step.

Steps to remove characters from dataframe in python

In this section, you will learn the steps required to remove the characters. Follow them in order for a clear understanding. All the code in this tutorial was run on Google Colab, so you can use it to follow along. Let’s get started.

Step 1: Import the required library

The first step is to import the necessary libraries using the import statement. This tutorial uses only the pandas package.

import pandas as pd

Step 2: Create a sample dataframe

For better understanding, we are creating a simple dataframe that will be used to implement this example. This dataframe will contain the characters that you want to remove.

Execute the below lines of code to create the dataframe.

import pandas as pd
import numpy as np
data = {"name":["Rob$","Bam","Maya$","Rahul"],"age":[23,25,26,32],
        "country":["USA","UK","$France","Germany"]}
df = pd.DataFrame(data)
print(df)

Output

[Figure: Sample dataframe containing the characters to remove]
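Before removing anything, it can help to see which columns actually contain the character. The snippet below is a minimal sketch of one way to do that with str.contains(); the text_columns list is an assumption made for this illustration, and regex=False makes "$" match as a plain character.

```python
import pandas as pd

data = {"name": ["Rob$", "Bam", "Maya$", "Rahul"], "age": [23, 25, 26, 32],
        "country": ["USA", "UK", "$France", "Germany"]}
df = pd.DataFrame(data)

# Assumed list of text columns to inspect (not part of the original steps)
text_columns = ["name", "country"]

# Count the rows in each column that contain a literal "$"
counts = {col: int(df[col].str.contains("$", regex=False).sum())
          for col in text_columns}
print(counts)
```

Columns with a count of zero do not need any cleaning.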

Step 3: Remove the characters from the dataframe

After creating the dataframe, let’s apply the methods that remove characters from it.

Method 1: Remove characters from some columns

Suppose several columns contain a character ($) that you want to remove. List the columns you want to clean inside square brackets, then replace the character using the replace() function.

Run the below lines of code to remove characters from the dataframe.

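import pandas as pd
data = {"name":["Rob$","Bam","Maya$","Rahul"],"age":[23,25,26,32],
        "country":["USA","UK","$France","Germany"]}
df = pd.DataFrame(data)
check_columns = ["name","country"]
# "$" is the end-of-string anchor in a regex, so escape it to match a literal dollar sign.
# replace() returns a new object, so assign the result back to the selected columns.
df[check_columns] = df[check_columns].replace({r"\$":""},regex=True)
print(df)

Here I have passed the pattern to replace, together with regex=True, as arguments to the replace() function. Because “$” has a special meaning in regular expressions, it is escaped as \$. The function checks the strings in the selected columns and replaces every “$” with an empty string. The result is assigned back because replace() does not modify the dataframe in place by default.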

Output

 name age country
0 Rob 23 USA
1 Bam 25 UK
2 Maya 26 France
3 Rahul 32 Germany
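As an alternative for a few columns, you can avoid regular expressions altogether. The following is a minimal sketch, not the only way to do it: it loops over the chosen columns and calls str.replace() with regex=False, so "$" is treated as a plain character. It assumes the listed columns hold only strings.

```python
import pandas as pd

data = {"name": ["Rob$", "Bam", "Maya$", "Rahul"], "age": [23, 25, 26, 32],
        "country": ["USA", "UK", "$France", "Germany"]}
df = pd.DataFrame(data)

check_columns = ["name", "country"]
for col in check_columns:
    # regex=False matches "$" literally instead of as the end-of-string anchor
    df[col] = df[col].str.replace("$", "", regex=False)

print(df)
```

The age column is never touched, because it is not in the list.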

Method 2: Remove characters from the entire dataframe

Suppose you want to remove characters from the entire dataframe. In this case, you do not call replace() on specific columns. Instead, you call replace() on the whole dataframe.

Run the below lines of code to remove characters from the entire dataframe.

import pandas as pd
data = {"name":["Rob$","Bam","Maya$","Rahul"],"age":[23,25,26,32],
        "country":["USA","UK","$France","Germany"]}
df = pd.DataFrame(data)
# Escape "$" so the regex matches a literal dollar sign in every string column.
# replace() returns a new dataframe, so assign the result back to df.
df = df.replace({r"\$":""},regex=True)
print(df)

Output

 name age country
0 Rob 23 USA
1 Bam 25 UK
2 Maya 26 France
3 Rahul 32 Germany
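The same call can remove several different characters at once. Here is a minimal sketch that uses a regular-expression character class; the extra symbols (# and @) and the sample values are made up for this illustration. Inside square brackets, "$" is matched literally and needs no escaping.

```python
import pandas as pd

# Hypothetical sample data containing several unwanted symbols
data = {"name": ["Rob$", "#Bam", "Maya$", "Rahul"], "age": [23, 25, 26, 32],
        "country": ["USA", "UK@", "$France", "Germany"]}
df = pd.DataFrame(data)

# The character class [$#@] matches any one of the three symbols
cleaned = df.replace({r"[$#@]": ""}, regex=True)
print(cleaned)
```

The result is stored in a new variable, so the original df is left unchanged for comparison.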

Conclusion

Datasets read with the pandas package sometimes contain unnecessary characters, so you need to remove them. The steps above will help you remove characters from a dataframe in Python.

I hope you liked this tutorial. If you have any queries, you can contact us for more help.
