Checking If Any Of Several Columns In Pandas Are Equal To Any Of Several Others In Pandas-CodePudding

I have a DataFrame in Pandas where I want to check if any of several columns are equal to several others. For example:

mydf = pd.DataFrame({'a1': [1, 2, 3, 4, 5], 'a2': [6, 7, 8, 9, 10], 'a3': [2, 3, 4, 5, 6], 'b1': [1, 7, 2, 3, 4], 'b2': [8, 9, 4, 2, 2], 'b3': [1, 3, 3, 3, 5]})

   a1  a2  a3  b1  b2  b3
0   1   6   2   1   8   1
1   2   7   3   7   9   3
2   3   8   4   2   4   3
3   4   9   5   3   2   3
4   5  10   6   4   2   5

I want to check if any of the 'a' columns are equal to any of the 'b' columns producing this DataFrame:

   a1  a2  a3  b1  b2  b3  x
0   1   6   2   1   8   1  1
1   2   7   3   7   9   3  1
2   3   8   4   2   4   3  1
3   4   9   5   3   2   3  0
4   5  10   6   4   2   5  1

Is there a nice way to do this? I only currently know quite a nasty way with loops. Thanks a lot.

CodePudding user response：

One way is to use list comprehension:

print ([any(set(x)&set(y)) for x, y in zip(df.filter(like="a").values, df.filter(like="b").values)])

[True, True, True, False, True]

CodePudding user response：

We can use numpy broadcasting to compare the values:

a = mydf.filter(like='a').to_numpy()
b = mydf.filter(like='b').to_numpy()

mydf['x'] = (a[None, :].T == b).any(2).any(0)

   a1  a2  a3  b1  b2  b3      x
0   1   6   2   1   8   1   True
1   2   7   3   7   9   3   True
2   3   8   4   2   4   3   True
3   4   9   5   3   2   3  False
4   5  10   6   4   2   5   True