Pandas Merge on Index: How to Merge Two Dataframes in Python

Suppose you have two datasets, each with an index column, and you want to merge them on that index. How do you achieve this? In this tutorial, you will learn three methods to merge pandas dataframes on the index.

Steps to implement Pandas Merge on Index

Step 1: Import the required libraries

Here I am using only the NumPy, datetime, and pandas libraries for dataframe creation and merging. Let’s import all of them.

import numpy as np
import pandas as pd
import datetime

Step 2: Create Dataframes

For the implementation part, we require two dataframes. Execute the following lines of code to create them.

Dataframe 1

# Index: the 10 days before today, one row per day
todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date - datetime.timedelta(10), periods=10, freq='D')

# Two columns, A and B, each holding the values 0 to 9
columns = ['A', 'B']
data = np.array([np.arange(10)] * 2).T
df1 = pd.DataFrame(data, index=index, columns=columns)

Output

Dataframe 1 Creation for merging on index

Dataframe 2

# Index: same start date as df1, but only 5 days long
todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date - datetime.timedelta(10), periods=5, freq='D')

# One column, C, holding the values 0 to 4
columns = ['C']
data = np.array([np.arange(5)]).T
df2 = pd.DataFrame(data, index=index, columns=columns)

Output

Dataframe 2 Creation for merging on index

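Both dataframes are time-series data with the date as the index. I am not going to explain the creation code line by line, otherwise this post will become long. The one thing to note is that df1 covers the ten days before today, while df2 covers only the first five of those days, so the two indexes share five dates.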
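Before merging, it can help to check how the two indexes overlap, because that overlap decides which rows end up with NaN values later. The snippet below is a minimal sketch of such a check. It rebuilds both dataframes with a fixed start date (2024-01-01 is my assumption, chosen only so the result is reproducible; the tutorial code uses today's date).

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption for reproducibility, not part of the tutorial code.
idx1 = pd.date_range("2024-01-01", periods=10, freq="D")
idx2 = pd.date_range("2024-01-01", periods=5, freq="D")

df1 = pd.DataFrame(np.array([np.arange(10)] * 2).T, index=idx1, columns=["A", "B"])
df2 = pd.DataFrame(np.array([np.arange(5)]).T, index=idx2, columns=["C"])

# Dates present in both indexes
common = df1.index.intersection(df2.index)

# Dates present only in df1
only_in_df1 = df1.index.difference(df2.index)

print(len(common))       # 5
print(len(only_in_df1))  # 5
```

Here five dates are shared, and five dates exist only in df1. Those last five are the rows where column C has no value to take.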

In the next step, you will look at various examples to implement pandas merge on the index.

Step 3: Follow the various examples to do Pandas Merge on Index

EXAMPLE 1: Using the Pandas Merge Method

In pandas, there is a function pandas.merge() that allows you to merge two dataframes on the index. Execute the following code to merge both dataframes df1 and df2.

pd.merge(df1, df2, left_index=True, right_index=True)

Here I am passing four parameters. The first and second parameters are the dataframes to merge. The third and fourth are left_index and right_index. Setting left_index=True uses the index of the left dataframe (df1) as the join key, and setting right_index=True uses the index of the right dataframe (df2). Because pandas.merge() performs an inner join by default, only the dates present in both indexes appear in the result. You can read more about the parameters in the pandas.merge() documentation.

Output

Merging two dataframes using the Pandas.merge() method
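The how parameter of pd.merge() controls which index labels are kept. As a rough sketch, the code below compares the default inner join with how="left". The fixed start date is my assumption for reproducibility; the dataframes otherwise have the same shape as df1 and df2 in this tutorial.

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption for reproducibility, not part of the tutorial code.
idx1 = pd.date_range("2024-01-01", periods=10, freq="D")
idx2 = pd.date_range("2024-01-01", periods=5, freq="D")

df1 = pd.DataFrame(np.array([np.arange(10)] * 2).T, index=idx1, columns=["A", "B"])
df2 = pd.DataFrame(np.array([np.arange(5)]).T, index=idx2, columns=["C"])

# Default how="inner": keep only the dates found in both indexes.
inner = pd.merge(df1, df2, left_index=True, right_index=True)

# how="left": keep every date of df1; C is NaN where df2 has no row.
left = pd.merge(df1, df2, left_index=True, right_index=True, how="left")

print(inner.shape)                  # (5, 3)
print(left.shape)                   # (10, 3)
print(int(left["C"].isna().sum()))  # 5
```

If you need every row of the left dataframe in the result, how="left" is the usual choice. how="outer" would keep the labels from both sides.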

EXAMPLE 2: Using the Pandas Join Method

The second method to merge two dataframes is the pandas.DataFrame.join() method. Call it with the dot operator on the dataframe you want to keep on the left, and pass the other dataframe as the argument, like below.

join_df = df1.join(df2)
join_df

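By default, join() performs a left join, so the result keeps every row of the dataframe it is called on. Whether the merged dataframe contains NaN values therefore depends on which dataframe is on the left. With the code above, df1 has ten dates and df2 has only five, so column C is NaN for the five dates missing from df2. If I use df2.join(df1) instead, the result has the same five rows as Example 1, only with column C placed first.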

Output

Merging two dataframes using the Pandas.join() method
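To make the effect of the calling order concrete, here is a small sketch that runs join() in both directions and once with how="inner". The fixed start date is an assumption of mine so that the result is reproducible; everything else mirrors df1 and df2 from Step 2.

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption for reproducibility, not part of the tutorial code.
idx1 = pd.date_range("2024-01-01", periods=10, freq="D")
idx2 = pd.date_range("2024-01-01", periods=5, freq="D")

df1 = pd.DataFrame(np.array([np.arange(10)] * 2).T, index=idx1, columns=["A", "B"])
df2 = pd.DataFrame(np.array([np.arange(5)]).T, index=idx2, columns=["C"])

# Left join on df1: all 10 dates kept, C is NaN for the last 5.
left_join = df1.join(df2)

# Left join on df2: only the 5 dates of df2 kept, columns ordered C, A, B.
swapped = df2.join(df1)

# Inner join: 5 shared dates, columns ordered A, B, C.
inner_join = df1.join(df2, how="inner")

print(left_join.shape)        # (10, 3)
print(list(swapped.columns))  # ['C', 'A', 'B']
print(inner_join.shape)       # (5, 3)
```

Passing how="inner" gives the same rows as the pd.merge() call in Example 1 without having to swap the dataframes.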

EXAMPLE 3: Pandas Merge on Index using concat() method

Another method to implement pandas merge on index is using the pandas.concat() method. Just pass both the dataframes with the axis value.

pd.concat([df1, df2], axis=1)

Here the axis value tells pandas how to concatenate the dataframes. To place the columns side by side, aligned on the index, I am setting axis to 1. To stack the rows on top of each other instead, you would use axis=0.

Output

Merging two dataframes using the Pandas.concat() method
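The difference between the two axis values is easiest to see from the shape of the result. The following sketch concatenates the same pair of dataframes along both axes. As before, the fixed start date is my assumption and is used only to keep the example reproducible.

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption for reproducibility, not part of the tutorial code.
idx1 = pd.date_range("2024-01-01", periods=10, freq="D")
idx2 = pd.date_range("2024-01-01", periods=5, freq="D")

df1 = pd.DataFrame(np.array([np.arange(10)] * 2).T, index=idx1, columns=["A", "B"])
df2 = pd.DataFrame(np.array([np.arange(5)]).T, index=idx2, columns=["C"])

# axis=1: align on the index and put the columns side by side.
side_by_side = pd.concat([df1, df2], axis=1)

# axis=0: stack the rows; the index labels of df2 are simply appended.
stacked = pd.concat([df1, df2], axis=0)

print(side_by_side.shape)  # (10, 3)
print(stacked.shape)       # (15, 3)
```

Only axis=1 behaves like a merge on the index. With axis=0 the five dates of df2 appear a second time as extra rows, so the index is no longer unique.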

In the output of pd.concat([df1, df2], axis=1), you can see that NaN values also appear, because concat() keeps every date from both indexes by default. To remove those rows, use the dropna() method. Run the code below to remove them.

concat_df= pd.concat([df1, df2], axis=1)
concat_df.dropna()

Output

Removing NaN after using the Pandas.concat() method
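One side effect worth knowing: once NaN values have been introduced, column C is stored as floats, and dropna() does not convert it back. A possible alternative, sketched below, is to pass join="inner" to pd.concat() so that the unmatched dates are never added in the first place. The fixed start date is again my assumption for reproducibility.

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption for reproducibility, not part of the tutorial code.
idx1 = pd.date_range("2024-01-01", periods=10, freq="D")
idx2 = pd.date_range("2024-01-01", periods=5, freq="D")

df1 = pd.DataFrame(np.array([np.arange(10)] * 2).T, index=idx1, columns=["A", "B"])
df2 = pd.DataFrame(np.array([np.arange(5)]).T, index=idx2, columns=["C"])

# Outer concat, then drop the rows that contain NaN.
outer_then_drop = pd.concat([df1, df2], axis=1).dropna()

# Inner concat: only the shared dates are kept, so no NaN is created.
inner_concat = pd.concat([df1, df2], axis=1, join="inner")

print(outer_then_drop.shape)  # (5, 3)
print(inner_concat.shape)     # (5, 3)

# C became float in the first result but stays integer in the second.
print(pd.api.types.is_float_dtype(outer_then_drop["C"]))  # True
print(pd.api.types.is_integer_dtype(inner_concat["C"]))   # True
```

Keep in mind that dropna() removes every row containing a NaN, including rows where the value was already missing in the original data, so join="inner" is the safer option when your data has gaps of its own.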

Conclusion

These are the examples of pandas merge on index. The first example is the simplest: you just call the function, and its default inner join returns only the matching rows. In the second and third examples, there may be NaN values in the merged dataframe, because join() keeps every row of the left dataframe and concat() keeps every row of both. You can remove them using the dropna() method.

I hope you have understood all the above examples. If you have any queries, you can contact us on our Official Facebook Page.

Source:

Pandas Join

Pandas Concat