Compare two dataframes by index with unique indexes

Time:02-17

I want to compare two dataframes by index using pd.DataFrame.eq() to get a dataframe of True/False values.

However, the two dataframes do not share all of their index labels: df1 contains labels that are not present in df2, and vice versa.

For each row, I want to know whether columns 'a', 'b' and 'c' of df1 hold the same value (0 or 1) as the corresponding columns of df2.

import pandas as pd

df1 = pd.DataFrame({'a':[1,1,1,1,1], 'b':[0,1,0,1,0], 'c':[1,0,0,1,1]}, index=['1_1', '1_2', '2_1', '2_2', '2_3'])
df2 = pd.DataFrame({'a':[0,1,1,1,1,0], 'b':[1,0,1,0,1,0], 'c':[1,1,1,1,1,0]}, index=['1_1', '1_2', '1_3', '2_2', '2_3', '2_4'])

df1.eq(df2)

yields

1_1,False,False,True
1_2,True,False,False
1_3,False,False,False
2_1,False,False,False
2_2,True,False,True
2_3,True,False,True
2_4,False,False,False


and it should look like

1_1,False,False,True
1_2,True,False,False
2_2,True,False,True
2_3,True,False,True

I am not quite sure how to handle the index labels that appear in only one of the dataframes. I was thinking about merging the dataframes, but then I stumble at comparing the columns.

Thanks for the help

CodePudding user response:

It seems you only want to keep comparison results for index labels that exist in both dataframes. In that case, you can get the common set of labels with

idx = df1.index.intersection(df2.index)

then

df1.loc[idx].eq(df2.loc[idx])

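or, since eq() aligns the two dataframes on the union of their indexes, compare first and then select the common rows: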

df1.eq(df2).loc[idx]
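
Putting it together, here is a minimal runnable sketch using the two dataframes from the question. The variable names result and result_alt are only illustrative and do not appear in the original post. Both variants should give the same four rows.

```python
import pandas as pd

df1 = pd.DataFrame(
    {'a': [1, 1, 1, 1, 1], 'b': [0, 1, 0, 1, 0], 'c': [1, 0, 0, 1, 1]},
    index=['1_1', '1_2', '2_1', '2_2', '2_3'],
)
df2 = pd.DataFrame(
    {'a': [0, 1, 1, 1, 1, 0], 'b': [1, 0, 1, 0, 1, 0], 'c': [1, 1, 1, 1, 1, 0]},
    index=['1_1', '1_2', '1_3', '2_2', '2_3', '2_4'],
)

# Index labels that are present in both dataframes
idx = df1.index.intersection(df2.index)

# Variant 1: restrict both dataframes to the common labels, then compare
result = df1.loc[idx].eq(df2.loc[idx])

# Variant 2: compare first (eq() aligns on the union of labels),
# then keep only the rows for the common labels
result_alt = df1.eq(df2).loc[idx]

print(result)
```

Variant 1 only ever compares rows that exist on both sides, so no row is reported as all False merely because it is missing from one dataframe.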