Check if values in a column exist elsewhere in a dataframe row

Suppose I have a dataframe as below:

import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5], 'c': [3, 4, 5, 6], 'd': [5, 3, 2, 4]})

I want to check if elements in column d exist elsewhere in its corresponding row. So the outcome I want is

[False, True, False, True]

Towards that end, I used

df.apply(lambda x: x['d'] in x[['a','b','c']], axis=1)

but this is somehow giving me [False, False, False, False].
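
A likely reason for the all-False result: the in operator on a pandas Series tests membership against the index labels, not the values. With axis=1 each row arrives as a Series indexed by the column names, so an integer taken from d is never found among 'a', 'b', 'c'. A minimal sketch of that behaviour and one possible fix, testing against .values instead:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5],
                   'c': [3, 4, 5, 6], 'd': [5, 3, 2, 4]})

row = df.loc[1, ['a', 'b', 'c']]   # Series with index 'a', 'b', 'c' and values 2, 3, 4
label_hit = 'a' in row             # `in` looks at the index labels
value_hit = 3 in row               # 3 is a value here, not a label
value_hit_fixed = 3 in row.values  # membership test on the underlying array

# Same apply as in the question, but testing against the row's values
fixed = df.apply(lambda x: x['d'] in x[['a', 'b', 'c']].values, axis=1)
print(label_hit, value_hit, value_hit_fixed)
print(fixed.tolist())
```

This keeps the row-wise apply from the question; the answers below avoid apply altogether.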

CodePudding user response：

Try:

    out = (df[['a','b','c']].T==df['d']).any()

Output:

    0    False
    1     True
    2    False
    3     True
    dtype: bool
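
One way to read this one-liner, as a sketch rather than a definitive explanation: after the transpose, the original row labels become the column labels, so comparing with df['d'] lines up each original row with its own d value. The intermediate names below (transposed, matches) are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5],
                   'c': [3, 4, 5, 6], 'd': [5, 3, 2, 4]})

transposed = df[['a', 'b', 'c']].T   # shape (3, 4): columns are the original row labels 0..3
matches = transposed == df['d']      # df['d'] is matched to those column labels
out = matches.any()                  # reduce over a/b/c for each original row

print(matches)
print(out.tolist())
```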

CodePudding user response：

Taking advantage of numpy broadcasting, you can do it easily:

(df[['d']].to_numpy() == df.to_numpy())[:, :-1].any(axis=1)

Output:

array([False,  True, False,  True])

Working directly on the underlying NumPy arrays skips pandas label alignment, so this is usually among the fastest options.
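
The one-liner above slices off the last column with [:, :-1], so it relies on d being the last column of the frame. A sketch of the same broadcasting idea that selects the columns by name instead; the variable names others and target are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5],
                   'c': [3, 4, 5, 6], 'd': [5, 3, 2, 4]})

others = df[['a', 'b', 'c']].to_numpy()   # shape (4, 3)
target = df['d'].to_numpy()[:, None]      # shape (4, 1): one value per row

# (4, 3) == (4, 1) broadcasts the single target column across all three columns
result = (others == target).any(axis=1)
print(result)
```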

CodePudding user response：

Use eq to broadcast equality comparison across the columns. Then check if there are any matches in the row:

df.drop(columns='d').eq(df['d'], axis=0).any(axis=1).array
<PandasArray>
[False, True, False, True]
Length: 4, dtype: bool

In terms of speed, a NumPy-based option is likely to be faster, since it avoids the overhead of pandas alignment.
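
To check the speed claim on your own machine, a rough timing sketch could look like the following. The frame size, the random data and the function names with_eq and with_numpy are assumptions made for illustration; actual timings depend on the data and the pandas/NumPy versions.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical test frame: 10,000 rows of small random integers
big = pd.DataFrame(rng.integers(0, 10, size=(10_000, 4)), columns=list('abcd'))

def with_eq():
    # pandas route: compare every other column against d, row by row
    return big.drop(columns='d').eq(big['d'], axis=0).any(axis=1).to_numpy()

def with_numpy():
    # NumPy route: broadcast d as a (n, 1) column against the (n, 3) block
    return (big[['a', 'b', 'c']].to_numpy() == big['d'].to_numpy()[:, None]).any(axis=1)

# Both approaches should agree before their timings are compared
assert (with_eq() == with_numpy()).all()

print('eq   :', timeit.timeit(with_eq, number=20))
print('numpy:', timeit.timeit(with_numpy, number=20))
```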

CodePudding user response：

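Try with eq and pop (recent pandas versions accept axis only as a keyword in any):

df.eq(df.pop('d'), axis=0).any(axis=1)  # append .values for a NumPy array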
0    False
1     True
2    False
3     True
dtype: bool
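
One thing to keep in mind with this approach: DataFrame.pop removes the column from the frame in place, so d is gone from df afterwards. A sketch, assuming the original frame should be preserved, that runs the same expression on a copy (the name work is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [2, 3, 4, 5],
                   'c': [3, 4, 5, 6], 'd': [5, 3, 2, 4]})

work = df.copy()   # keep the original frame intact
result = work.eq(work.pop('d'), axis=0).any(axis=1)

print(list(work.columns))  # 'd' has been removed from the copy
print(list(df.columns))    # the original still has all four columns
print(result.tolist())
```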