Home > other >  Checking If Any Of Several Columns In Pandas Are Equal To Any Of Several Others In Pandas
Checking If Any Of Several Columns In Pandas Are Equal To Any Of Several Others In Pandas

Time:10-22

I have a DataFrame in Pandas where I want to check if any of several columns are equal to several others. For example:

mydf = pd.DataFrame({'a1': [1, 2, 3, 4, 5], 'a2': [6, 7, 8, 9, 10], 'a3': [2, 3, 4, 5, 6], 'b1': [1, 7, 2, 3, 4], 'b2': [8, 9, 4, 2, 2], 'b3': [1, 3, 3, 3, 5]})

   a1  a2  a3  b1  b2  b3
0   1   6   2   1   8   1
1   2   7   3   7   9   3
2   3   8   4   2   4   3
3   4   9   5   3   2   3
4   5  10   6   4   2   5

I want to check if any of the 'a' columns are equal to any of the 'b' columns producing this DataFrame:

   a1  a2  a3  b1  b2  b3  x
0   1   6   2   1   8   1  1
1   2   7   3   7   9   3  1
2   3   8   4   2   4   3  1
3   4   9   5   3   2   3  0
4   5  10   6   4   2   5  1

Is there a nice way to do this? I only currently know quite a nasty way with loops. Thanks a lot.

CodePudding user response:

One way is to use list comprehension:

print ([any(set(x)&set(y)) for x, y in zip(df.filter(like="a").values, df.filter(like="b").values)])

[True, True, True, False, True]

CodePudding user response:

We can use numpy broadcasting to compare the values:

a = mydf.filter(like='a').to_numpy()
b = mydf.filter(like='b').to_numpy()

mydf['x'] = (a[None, :].T == b).any(2).any(0)

   a1  a2  a3  b1  b2  b3      x
0   1   6   2   1   8   1   True
1   2   7   3   7   9   3   True
2   3   8   4   2   4   3   True
3   4   9   5   3   2   3  False
4   5  10   6   4   2   5   True
  • Related