Many data analysts remove the rows or columns that have missing values. Did you know that, rather than removing them, you can fill the gaps using a single function in pandas? That function is pandas interpolate. In this tutorial, I will show you how to use pandas interpolate step by step.

## Steps to implement Pandas Interpolate

### Step 1: Import all the necessary libraries

Let’s import the required libraries. In my code, I am using only the NumPy, datetime, and pandas modules, so I will import them using the import statement. The numpy and datetime modules are used to build the dataset.

```
import numpy as np
import pandas as pd
import datetime
```

### Step 2: Create a Sample Pandas Dataframe

The next step is to create a sample dataframe to interpolate. Here I am creating a time-series dataframe that has some NaN values. These values are created using *np.nan*. You will fill these missing values using the interpolate function.

Execute the code below to create a dataframe.

```
# Daily dates covering the 10 days up to yesterday
todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date - datetime.timedelta(10), periods=10, freq='D')
columns = ['Price']
# Prices with three missing values (np.nan)
data = np.array([10, 20, np.nan, 30, 50, 20, np.nan, 100, 30, np.nan])
df = pd.DataFrame(data, index=index, columns=columns)
print(df)
```

In the above code, I am creating 10 consecutive dates and assigning a price to each one. Three of the prices (the third, seventh, and tenth) are NaN. Printing the dataframe shows these gaps in the Price column.
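Before filling anything, it can help to confirm where the gaps are. The sketch below rebuilds the same prices on a fixed start date and counts the missing entries with `isna()`. The fixed date `2024-01-01` and the name `sample` are assumptions made only so the result is reproducible; the tutorial's own dataframe uses dates relative to today.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed start date, so the listed dates do not change between runs.
index = pd.date_range("2024-01-01", periods=10, freq="D")
prices = [10, 20, np.nan, 30, 50, 20, np.nan, 100, 30, np.nan]
sample = pd.DataFrame({"Price": prices}, index=index)

# Count the missing prices and list the dates on which they occur.
missing_count = int(sample["Price"].isna().sum())
missing_dates = list(sample.index[sample["Price"].isna()].strftime("%Y-%m-%d"))

print(missing_count)
print(missing_dates)
```

With these inputs the count is 3, matching the three `np.nan` entries in the data.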

### Step 3: Apply the pandas interpolate on the dataframe

The last step is to apply the *interpolate()* method to the dataframe created above. It returns a new dataframe in which the NaN values are replaced by estimated values.

Execute the code below.

`df.interpolate()`

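**What does interpolate do?** With the sample data, the Price column becomes 10, 20, 25, 30, 50, 20, 60, 100, 30, 30. Each single NaN that lies between two known values is replaced by the mean of the previous and next values: 25 is midway between 20 and 30, and 60 is midway between 20 and 100. The last NaN is the exception. It has no next value, so it is filled with the last known value, 30. This behaviour is controlled by the method argument, and its default value is *method="linear"*. Another possible value is *method="polynomial"*, which also needs an *order* argument and requires SciPy to be installed.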
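The arithmetic behind the default linear method can be checked directly. This is a minimal sketch that uses a plain Series with the same prices (dropping the date index is a simplification, since the linear method treats rows as equally spaced anyway); the names `prices`, `filled`, and `same_as_linear` are illustrative.

```python
import numpy as np
import pandas as pd

prices = pd.Series([10, 20, np.nan, 30, 50, 20, np.nan, 100, 30, np.nan], name="Price")

# interpolate() returns a new Series; `prices` itself keeps its NaN values.
filled = prices.interpolate()

# Interior gaps: each filled value is the midpoint of its two neighbours.
assert filled.iloc[2] == (20 + 30) / 2    # 25.0
assert filled.iloc[6] == (20 + 100) / 2   # 60.0

# Trailing gap: there is no next value, so the last known value is carried forward.
assert filled.iloc[9] == 30.0

# Passing method="linear" explicitly gives the same result as the default.
same_as_linear = filled.equals(prices.interpolate(method="linear"))

print(filled.tolist())
print(same_as_linear)
```

To keep the filled values, assign the result back, for example `df = df.interpolate()`.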

You will get the same result as above if you pass *method="linear"* explicitly.

`df.interpolate(method="linear")`


And if you use *method="polynomial"*, you will get a different output. This method also needs the *order* argument.

`df.interpolate(method="polynomial",order=2)`
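To see how the two methods differ, the filled columns can be placed side by side. The sketch below is hedged in two ways: pandas hands the polynomial method over to SciPy, so the call is wrapped in a `try` block in case SciPy is not installed, and no specific polynomial values are claimed here because they depend on that backend. The names `prices`, `poly`, and `comparison` are illustrative, and a plain Series stands in for the tutorial's dataframe.

```python
import numpy as np
import pandas as pd

prices = pd.Series([10, 20, np.nan, 30, 50, 20, np.nan, 100, 30, np.nan], name="Price")

try:
    # order=2 asks for a second-order fit; the order argument is mandatory here.
    poly = prices.interpolate(method="polynomial", order=2)
except ImportError:
    # Raised when SciPy is missing, since pandas relies on it for this method.
    poly = None

if poly is not None:
    linear = prices.interpolate(method="linear")
    comparison = pd.DataFrame({"linear": linear, "polynomial": poly})
    print(comparison)
else:
    print("SciPy is not installed, so method='polynomial' is unavailable.")
```

In both columns the known prices stay exactly as they were; only the rows that were NaN can differ between the methods.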


## Conclusion

Pandas interpolate is a very useful method for filling NaN or missing values. In machine learning, removing rows that have missing values throws away data and can lead to a poorer predictive model, so interpolation can help you improve your model. I hope you now understand how to use the interpolate method. If you have any queries, you can contact us for more information.
