Home > Blockchain >  Pandas: find matching rows in two dataframes (without using `merge`)
Pandas: find matching rows in two dataframes (without using `merge`)

Time:12-24

Let's suppose I have these two dataframes:

tmp = np.arange(0,12).reshape((4,3))
df = pd.DataFrame(data=tmp) 

tmp2 = {'a':[3,100,101], 'b':[4,4,100], 'c':[5,100,3]}
df2 = pd.DataFrame(data=tmp2)

print(df)
   0   1   2
0  0   1   2
1  3   4   5
2  6   7   8
3  9  10  11

print(df2)
     a    b    c
0    3    4    5
1  100    4  100
2  101  100    3

I want to verify if the rows of df2 are matching any rows of df, that is I want to obtain a series (or an array) of boolean values that gives this result:

0     True
1    False
2    False
dtype: bool

I think something like the isin method should work, but I got this result, which results in a dataframe and is wrong:

print(df2.isin(df))
       a      b      c
0  False  False  False
1  False  False  False
2  False  False  False

As a constraint, I wish to not use the merge method, since what I am doing is in fact a check on the data before applying merge itself. Thank you for your help!

CodePudding user response:

You can use Result

  • Related