The "ValueError: Can only compare identically-labeled DataFrame objects" error occurs because of an index mismatch between the dataframes being compared. In this article, we will explore how to fix this indexing problem and compare two different dataframes. For this, we will create two sample dataframes and then compare them.
Here is the code.
import pandas as pd
sample_df1 = pd.DataFrame([[11, 12], [11, 14]])
sample_df2 = pd.DataFrame([[11, 14], [11, 22]], index=[1, 0])
sample_df1 == sample_df2
In the above code, we have created two dataframes with some dummy values. The twist is in the second dataframe: its index is not generated automatically. We have assigned the index labels manually, in the order [1, 0]. Hence, when we run this code, the two indexes will not match and the comparison will raise the error. Let’s run it and check the hypothesis.
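To check the hypothesis without stopping the script, the comparison can be wrapped in a try/except block. This is only a minimal sketch: the variable name error_message is introduced here for illustration, and the exact wording of the message may differ slightly between pandas versions.

```python
import pandas as pd

sample_df1 = pd.DataFrame([[11, 12], [11, 14]])
sample_df2 = pd.DataFrame([[11, 14], [11, 22]], index=[1, 0])

# error_message is a hypothetical helper variable, not part of the article's code.
try:
    sample_df1 == sample_df2
    error_message = None
except ValueError as exc:
    # pandas refuses to compare because the index labels are in a different order.
    error_message = str(exc)

print(error_message)
```

If the message is printed rather than None, the index mismatch is confirmed as the cause.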
There are two types of solutions. In the first solution, we will drop or sort the indexes and then compare. The second one is more straightforward but less informative. The second method only tells us whether both dataframes match entirely or not. The first method, on the other hand, gives an in-depth comparison of the data values.
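In this implementation, we will use the reset_index() function. With drop=True, it discards the existing index of each dataframe and replaces it with the default 0, 1, ... index, so the rows are compared by position.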
print(sample_df1.reset_index(drop=True) == sample_df2.reset_index(drop=True))
Let’s run this reset_index() comparison.
As we can see, this approach compares the data value by value.
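The first solution also mentions sorting the indexes instead of dropping them. A rough sketch of both variants side by side is given below. The names by_position and by_label are made up for this illustration. Note that the two variants answer slightly different questions: reset_index(drop=True) compares rows by position, while sort_index() compares rows that share the same index label.

```python
import pandas as pd

sample_df1 = pd.DataFrame([[11, 12], [11, 14]])
sample_df2 = pd.DataFrame([[11, 14], [11, 22]], index=[1, 0])

# Variant 1: ignore the labels and compare first row with first row, and so on.
by_position = sample_df1.reset_index(drop=True) == sample_df2.reset_index(drop=True)

# Variant 2: keep the labels, put them in the same order, then compare label by label.
by_label = sample_df1.sort_index() == sample_df2.sort_index()

print(by_position)
print(by_label)
```

Here the row labelled 1 holds [11, 14] in both dataframes, so it matches fully in the sorted comparison but not in the positional one. Which variant is right depends on whether the index labels carry meaning in your data.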
The second method also lets us compare two dataframes with different indexes, but it only gives high-level information: whether they match completely or not.
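The code for this second method is not shown here. One pandas call that returns a single True or False is DataFrame.equals(). Treating it as the second method is an assumption made for this sketch, and the variable names are illustrative only.

```python
import pandas as pd

sample_df1 = pd.DataFrame([[11, 12], [11, 14]])
sample_df2 = pd.DataFrame([[11, 14], [11, 22]], index=[1, 0])

# equals() does not raise on mismatched indexes; it simply reports False.
same_as_is = sample_df1.equals(sample_df2)

# Dropping the indexes first makes the check depend on the values only.
same_after_reset = sample_df1.reset_index(drop=True).equals(
    sample_df2.reset_index(drop=True)
)

# Sanity check: a dataframe equals an exact copy of itself.
identical_copy = sample_df1.equals(sample_df1.copy())

print(same_as_is, same_after_reset, identical_copy)
```

Because equals() also takes the index into account, two dataframes with the same values but different index labels are still reported as not equal unless the indexes are reset or aligned first.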
I hope you now have a clear idea of how to compare two dataframes with different indexes. Please subscribe for similar articles.
Thanks
Data Science Learner Team