How to Apply the pd.to_numeric Method in a Pandas Dataframe


The pandas Python module lets you manipulate data, and it provides many functions for doing so. pd.to_numeric() (pandas to_numeric) is one of them. In this tutorial, you will learn how to use it to convert strings to int or float in a pandas dataframe. Everything is explained step by step.

Steps to Implement pd.to_numeric in a Dataframe

Step 1: Import the required Python modules.

The first step is to import pandas using the import statement. I am also importing the numpy and datetime modules, which help to create the sample dataframe.

import pandas as pd
import numpy as np
import datetime

Step 2: Create a Sample Dataframe

For demonstration purposes, I am creating time-series data: three integer columns indexed by the last ten days. Execute the code below to create the dataframe.

todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')

columns = ['A','B', 'C']
data = np.array([np.arange(10)]*3).T
df = pd.DataFrame(data,index=index, columns=columns)

Output

Sample Dataframe for Implementing pd to_numeric

Step 3: Add some string to the dataframe.

If a column in your data already mixes string and numeric values, you can go to the next step. If not, follow this step. Here I will add some string values to column "C" of the dataframe created above. It can be done using df.iloc[].

Execute the following lines of code.

df.iloc[2,2] ="Sahil"
df.iloc[4,2] ="Robin"
df.iloc[8,2] ="DSL"

If you now print the dataframe, you will get the output below.

Output

Sample Dataframe for after adding some strings
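The assignments in this step write strings into a column that was created with an integer dtype. Recent pandas versions may warn about, or reject, setting a value of an incompatible dtype. The following is a minimal sketch, assuming a frame shaped like the one from Step 2 but with a default index for brevity, that casts column "C" to object before assigning the strings.

```python
import numpy as np
import pandas as pd

# Rebuild a small integer frame like the one from Step 2 (default index for brevity).
df = pd.DataFrame(np.array([np.arange(10)] * 3).T, columns=["A", "B", "C"])

# Cast column C to object first so it can hold both integers and strings.
df["C"] = df["C"].astype(object)

df.iloc[2, 2] = "Sahil"
df.iloc[4, 2] = "Robin"
df.iloc[8, 2] = "DSL"

print(df["C"].dtype)  # object
```

Columns "A" and "B" keep their integer dtype; only "C" becomes a mixed object column.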

Step 4: Implement the pd.to_numeric method

The last step is to apply the pd.to_numeric() function to the dataframe. There are several cases to consider, and each one is covered below.

Case 1:  Use of to_numeric() method without any argument

If I apply to_numeric() to column "A", which contains only integers, it returns all the values as a numeric series.

pd.to_numeric(df["A"])

Output

Applying to_numeric method on Column A

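Case 2:  Use of to_numeric() method with errors="ignore" argument

If you pass df["C"] to the method with the argument errors="ignore", the column is returned unchanged because some of its values cannot be parsed. Without this argument, the default errors="raise" applies and you get the error: ValueError: Unable to parse string "Sahil" at position 2. Note that errors="ignore" is deprecated as of pandas 2.2, so newer versions emit a warning when you use it.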

pd.to_numeric(df["C"],errors="ignore")

Output

Applying to_numeric method on Column C with errors = ignore argument
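Because errors="ignore" is deprecated in newer pandas releases, the same "return the input unchanged on failure" behaviour can be approximated by catching the exception yourself. This is only a sketch: the helper name to_numeric_or_original is hypothetical and not part of pandas, and the small series stands in for column "C".

```python
import pandas as pd

# Stand-in for column C: integers mixed with a string that cannot be parsed.
s = pd.Series([0, 1, "Sahil", 3], name="C")

def to_numeric_or_original(series):
    """Hypothetical helper: return the converted series, or the input unchanged if parsing fails."""
    try:
        return pd.to_numeric(series)  # errors="raise" is the default
    except (ValueError, TypeError):
        return series

result = to_numeric_or_original(s)
print(result.dtype)  # object, because "Sahil" cannot be parsed
```

When every value can be parsed, the helper returns the converted numeric series instead.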

Case 3:  Use of to_numeric() method with errors="coerce" argument

Suppose I want to get rid of all the strings present in column C. Then I will use the errors="coerce" argument. Every value that cannot be parsed as a number is replaced with NaN.

pd.to_numeric(df["C"],errors="coerce")

Output

Applying to_numeric method on Column C with errors = coerce argument
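Since errors="coerce" silently turns unparseable values into NaN, it can be useful to check which entries were affected. The sketch below uses a small stand-in series rather than the tutorial's dataframe and compares the converted result with the original.

```python
import pandas as pd

# Stand-in for column C: integers mixed with unparseable strings.
s = pd.Series([0, 1, "Sahil", 3, "Robin"], name="C")
converted = pd.to_numeric(s, errors="coerce")

# Entries that are NaN after conversion but were not missing before failed to parse.
failed = s[converted.isna() & s.notna()]

print(converted.dtype)  # float64
print(failed.tolist())  # ['Sahil', 'Robin']
```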

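You can see in the above figure that the dtype of the column is float64, which is numeric. The dtype is float rather than int because NaN is a floating-point value. You can remove the NaN values using the dropna() method, which returns a new series and leaves the original untouched. Execute the lines of code below.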

series = pd.to_numeric(df["C"],errors="coerce")
series.dropna()

Output

Remove all the NaN values from the series
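The dropna() call above only affects the standalone series. If you want the cleaned values back in the dataframe, one option is to assign the converted column and then drop the affected rows. The following is a minimal sketch on a small stand-in frame; the name clean is chosen here for illustration.

```python
import pandas as pd

# Small stand-in frame with one unparseable value in column C.
df = pd.DataFrame({"A": [0, 1, 2, 3], "C": [0, "Sahil", 2, 3]})

# Convert in place; the unparseable entry becomes NaN.
df["C"] = pd.to_numeric(df["C"], errors="coerce")

# Drop rows where C is NaN, then restore an integer dtype.
clean = df.dropna(subset=["C"]).copy()
clean["C"] = clean["C"].astype("int64")

print(clean)
```

Dropping rows on the dataframe keeps the other columns aligned with the remaining values of "C".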

Other Examples

Suppose you have numeric values stored as strings. If you pass them to a method that only accepts numeric values, you may get a ValueError. To avoid it, first convert the string values to numeric using the pd.to_numeric() method. Run the code below to create a sample dataframe.

import pandas as pd
data = {"Date":["12/11/2020","13/11/2020","14/11/2020","15/11/2020"],
"Open":[1,2,3,4],"Close":["5",6,"7",8],"Volume":[100,200,300,400]}
df = pd.DataFrame(data=data)
df

Output

Sample Dataframe with the Numerical Value as String

In the above code, 5 and 7 are strings in the Close column. If I apply the to_numeric() method to df["Close"], I get the following output.

pd.to_numeric(df["Close"])

Output

Applying to_numeric method on Column with Numeric Value as String

You can see that the dtype of the resulting series is int64, so every value of the Close column is now an integer.
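Note that pd.to_numeric() returns a new series and does not modify the dataframe. A short sketch, using a reduced version of the sample data above, of assigning the result back so that later numeric operations work on the column:

```python
import pandas as pd

# Reduced version of the sample data: Close mixes strings and integers.
df = pd.DataFrame({"Open": [1, 2, 3, 4], "Close": ["5", 6, "7", 8]})
print(df["Close"].dtype)  # object

# Assign the converted series back to the dataframe.
df["Close"] = pd.to_numeric(df["Close"])
print(df["Close"].dtype)  # int64
print(df["Close"].sum())  # 26
```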

pd to_numeric implementation

Conclusion

That’s all for now. These are the cases and examples for applying the pandas to_numeric() function to a pandas dataframe. I hope you have understood this tutorial. If you have any queries, you can contact us for more information.

Source:

Pandas Official Documentation


Meet Sukesh ( Chief Editor ), a passionate and skilled Python programmer with a deep fascination for data science, NumPy, and Pandas. His journey in the world of coding began as a curious explorer and has evolved into a seasoned data enthusiast.
 