Pandas: check if a duplicate value in one column has a given value in another column

The logic I'm looking for is this: if a value appears more than once in column X, indicate whether any row with that value has a specific string in another column. This might be expressed as a boolean function.

For instance:

X	Y	Z
1	Incorrect	A
1	Correct	G
2	Incorrect	A
2	Incorrect	G

In the table above, I want to create another column, 'Has Correct in Y?', that holds a boolean value: TRUE if the duplicated value in X has the string "Correct" in column Y in any of the rows where it appears, and FALSE otherwise.

The result would be something like this:

X	Y	Z	Has Correct in Y?
1	Incorrect	A	TRUE
1	Correct	G	TRUE
2	Incorrect	A	FALSE
2	Incorrect	G	FALSE

CodePudding user response：

Try isin, which flags every row whose X value also occurs in a row where Y is 'Correct':

df['Corr_Y'] = df['X'].isin(df.loc[df['Y'] == 'Correct','X'])
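
For reference, here is a minimal, self-contained sketch of that approach on the sample data from the question. The DataFrame construction and the `correct_x` variable name are assumptions for illustration, and the column is named as in the question rather than `Corr_Y`.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "X": [1, 1, 2, 2],
    "Y": ["Incorrect", "Correct", "Incorrect", "Incorrect"],
    "Z": ["A", "G", "A", "G"],
})

# X values that appear in at least one row where Y is exactly "Correct"
# (correct_x is a hypothetical helper name, not part of the original answer)
correct_x = df.loc[df["Y"] == "Correct", "X"]

# Flag every row whose X is among those values
df["Has Correct in Y?"] = df["X"].isin(correct_x)

print(df)
```

Note that the comparison is an exact match, so a value such as "Incorrect" is not treated as containing "Correct".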

CodePudding user response：

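You can use groupby with transform('any'), which marks every row of an X group as True if any row in that group has Y equal to 'Correct':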

df['Has Correct in Y?'] = (df['Y'].eq('Correct')
                           .groupby(df['X'])
                           .transform('any')
                           )
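
As a rough, runnable sketch of the groupby approach on the question's sample data (the DataFrame construction and the `is_correct` name are assumptions added for illustration):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "X": [1, 1, 2, 2],
    "Y": ["Incorrect", "Correct", "Incorrect", "Incorrect"],
    "Z": ["A", "G", "A", "G"],
})

# Row-wise boolean mask: True where Y is exactly "Correct"
# (is_correct is a hypothetical helper name)
is_correct = df["Y"].eq("Correct")

# Within each group of equal X values, broadcast "any True?" back to every row
df["Has Correct in Y?"] = is_correct.groupby(df["X"]).transform("any")

print(df)
```

An X value that occurs only once is also flagged if its single row is "Correct". If only repeated X values should count, one option is to combine the result with `df["X"].duplicated(keep=False)` using `&`.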