The logic I'm looking for is this: when a value appears more than once in column X, indicate whether any row with that value has a specific string in another column. A boolean (True/False) result would work.
For instance:
X | Y | Z |
---|---|---|
1 | Incorrect | A |
1 | Correct | G |
2 | Incorrect | A |
2 | Incorrect | G |
In the table above, I want to create another column, 'Has Correct in Y?', that holds a boolean value: TRUE if the value in X has the string "Correct" in column Y in any of the rows where it appears, and FALSE otherwise.
The result would be something like this:
X | Y | Z | Has Correct in Y? |
---|---|---|---|
1 | Incorrect | A | TRUE |
1 | Correct | G | TRUE |
2 | Incorrect | A | FALSE |
2 | Incorrect | G | FALSE |
CodePudding user response:
Try isin against the X values of the rows where Y is 'Correct':
df['Corr_Y'] = df['X'].isin(df.loc[df['Y'] == 'Correct','X'])
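For anyone who wants to run this end to end, here is a minimal sketch. It assumes the table from the question is loaded as a DataFrame named df; the construction below and the intermediate name correct_x are only for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question's table (assumed to be the real layout)
df = pd.DataFrame({
    "X": [1, 1, 2, 2],
    "Y": ["Incorrect", "Correct", "Incorrect", "Incorrect"],
    "Z": ["A", "G", "A", "G"],
})

# X values that have "Correct" in Y in at least one row
correct_x = df.loc[df["Y"] == "Correct", "X"]

# Flag every row whose X is among those values
df["Has Correct in Y?"] = df["X"].isin(correct_x)

print(df)
```

Both rows with X equal to 1 are flagged True because one of them has "Correct" in Y, while both rows with X equal to 2 are flagged False.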
CodePudding user response:
You can use:
df['Has Correct in Y?'] = (df['Y'].eq('Correct')
                           .groupby(df['X'])
                           .transform('any')
                          )
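As a rough, self-contained sketch of how this behaves on the question's data (the DataFrame construction and the intermediate name is_correct are assumptions added for illustration):

```python
import pandas as pd

# Sample data copied from the question's table (assumed to be the real layout)
df = pd.DataFrame({
    "X": [1, 1, 2, 2],
    "Y": ["Incorrect", "Correct", "Incorrect", "Incorrect"],
    "Z": ["A", "G", "A", "G"],
})

# Row-level boolean: is this row's Y exactly "Correct"?
is_correct = df["Y"].eq("Correct")

# Group those booleans by X and broadcast "any True in the group"
# back to every row of the group
df["Has Correct in Y?"] = is_correct.groupby(df["X"]).transform("any")

print(df)
```

Because transform returns a result aligned to the original index, it can be assigned straight to a new column without a merge.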