Suppose I have a dataframe as below:
import pandas as pd
df = pd.DataFrame({'a':[1,2,3,4],'b':[2,3,4,5],'c':[3,4,5,6],'d':[5,3,2,4]})
I want to check whether each element in column d also appears elsewhere in its corresponding row. So the outcome I want is
[False, True, False, True]
Towards that end, I used
df.apply(lambda x: x['d'] in x[['a','b','c']], axis=1)
but this is somehow giving me [False, False, False, False].
CodePudding user response:
Try:
out = (df[['a','b','c']].T==df['d']).any()
Output:
0    False
1     True
2    False
3     True
dtype: bool
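As for why the original attempt returns all False: `in` on a pandas Series tests the index labels, not the values. Inside apply with axis=1, x[['a','b','c']] is a Series labelled 'a', 'b', 'c', so x['d'] in ... asks whether the number is one of those labels. A minimal sketch comparing against the values instead (the names wrong and fixed are just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'a':[1,2,3,4],'b':[2,3,4,5],'c':[3,4,5,6],'d':[5,3,2,4]})

# `in` on a Series looks at the index labels ('a', 'b', 'c'), not the values
wrong = df.apply(lambda x: x['d'] in x[['a','b','c']], axis=1)

# checking membership in the underlying values gives the intended row-wise test
fixed = df.apply(lambda x: x['d'] in x[['a','b','c']].to_numpy(), axis=1)

print(wrong.tolist())
print(fixed.tolist())
```

This keeps the apply approach for readability; the vectorised answers will scale better on large frames.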
CodePudding user response:
Taking advantage of numpy broadcasting, you can do it easily:
(df[['d']].to_numpy() == df.to_numpy())[:, :-1].any(axis=1)
Output:
array([False, True, False, True])
Operating directly on the NumPy arrays skips pandas index alignment, so this is usually about as fast as you can get.
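To make the broadcasting explicit: df[['d']].to_numpy() has shape (4, 1), so comparing it with a (4, 3) array stretches d across every column of its row. A small sketch along the same lines, assuming you would rather drop d by name than rely on it being the last column (others, target and mask are names made up here):

```python
import pandas as pd

df = pd.DataFrame({'a':[1,2,3,4],'b':[2,3,4,5],'c':[3,4,5,6],'d':[5,3,2,4]})

others = df.drop(columns='d').to_numpy()   # shape (4, 3): columns a, b, c
target = df['d'].to_numpy()[:, None]       # shape (4, 1): d as a column vector

# (4, 3) == (4, 1) broadcasts d across each row; any(axis=1) reduces per row
mask = (others == target).any(axis=1)
print(mask)
```

Dropping by name costs one extra pandas call but does not depend on the column order.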
CodePudding user response:
Use eq to broadcast equality comparison across the columns. Then check if there are any matches in the row:
df.drop(columns='d').eq(df['d'], axis=0).any(axis=1).array
<PandasArray>
[False, True, False, True]
Length: 4, dtype: bool
Speed-wise, a NumPy-based option will generally be faster.
CodePudding user response:
Try with pop (note that pop removes column d from df in place):
df.eq(df.pop('d'), axis=0).any(axis=1)  # .values
0    False
1     True
2    False
3     True
dtype: bool