Pandas check if duplicate value from one column has value in another column

Time:08-30

The logic I'm looking for is this: when a value appears more than once in column X, indicate whether any row with that value contains a specific string in another column. The result could be a boolean flag.

For instance:

X  Y          Z
1  Incorrect  A
1  Correct    G
2  Incorrect  A
2  Incorrect  G

For the table above, I want to create another column, 'Has Correct in Y?', holding a boolean value: TRUE if the value in X has the string "Correct" in column Y in any of the rows where it appears, and FALSE otherwise.

The result would be something like this:

X  Y          Z  Has Correct in Y?
1  Incorrect  A  TRUE
1  Correct    G  TRUE
2  Incorrect  A  FALSE
2  Incorrect  G  FALSE

CodePudding user response:

Try with isin, which flags every row whose X value also appears on a row where Y is 'Correct':

df['Corr_Y'] = df['X'].isin(df.loc[df['Y'] == 'Correct','X'])
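
For anyone who wants to run this, here is a minimal sketch on the sample data from the question. The DataFrame construction and the intermediate name correct_x are my own additions for illustration; the isin line is the one from the answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "X": [1, 1, 2, 2],
    "Y": ["Incorrect", "Correct", "Incorrect", "Incorrect"],
    "Z": ["A", "G", "A", "G"],
})

# X values that occur on at least one row where Y is "Correct"
# (correct_x is an illustrative name, not from the original answer)
correct_x = df.loc[df["Y"] == "Correct", "X"]

# Flag every row whose X is among those values
df["Corr_Y"] = df["X"].isin(correct_x)

print(df)
```

On this data the rows with X equal to 1 should come out True and the rows with X equal to 2 should come out False.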

CodePudding user response:

You can compare Y to 'Correct', group the result by X, and broadcast 'any' back to every row:

df['Has Correct in Y?'] = (df['Y'].eq('Correct')
                           .groupby(df['X'])
                           .transform('any')
                           )
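
Here is a minimal runnable sketch of the same idea on the sample data from the question. The names is_correct and is_dup are only illustrative. The last step is an assumption on my part: the question talks about duplicate values in X, so it may be desirable to force the flag to False for X values that occur only once, but the question does not say so explicitly.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "X": [1, 1, 2, 2],
    "Y": ["Incorrect", "Correct", "Incorrect", "Incorrect"],
    "Z": ["A", "G", "A", "G"],
})

# Row-wise boolean: is Y exactly "Correct"?
is_correct = df["Y"].eq("Correct")

# Within each X group, True if any row is "Correct"; transform
# returns a result aligned with the original rows
df["Has Correct in Y?"] = is_correct.groupby(df["X"]).transform("any")

# Assumption, not stated in the question: only count X values that
# really are duplicated (appear on more than one row)
is_dup = df["X"].duplicated(keep=False)
df["Has Correct in Y? (dups only)"] = df["Has Correct in Y?"] & is_dup

print(df)
```

On the sample data every X value is duplicated, so both columns should agree. They would differ only for an X value that appears on a single row with Y equal to "Correct".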