Aim: check whether the values in my pandas DataFrame column df['Outcome'] are in my set bad_outcomes. If a value is in the set, I want to assign 0 to a new variable landing_outcome. If not, I assign landing_outcome a value of 1.
I am able to search the column df['Outcome'] and check whether its values are in my set called bad_outcomes using isin:
df[df['Outcome'].isin(bad_outcomes)]
This works. Then I try to put this in an if statement:
if df[df['Outcome'].isin(bad_outcomes)]:
    landing_outcome = 0
This gives me a ValueError:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Where am I going wrong? Is using isin the best way to do this?
I checked the Python manual for if statements and couldn't find an obvious syntax issue. I also searched this forum for the error message (there are many posts, but I couldn't see one for my use case). I'm new, so I hope this is OK to ask.
CodePudding user response:
I found an answer on conditional statements on codegrepper, which linked back to this Stack Overflow question: Pandas conditional creation of a series/dataframe column
Using this approach my solution was:
landing_class = [0 if outcome in bad_outcomes else 1 for outcome in df['Outcome']]
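For anyone who wants to try this, here is a minimal self-contained sketch of that list comprehension. The sample Outcome values and the contents of bad_outcomes are made up for illustration; only the names df, bad_outcomes and landing_class come from the post.

```python
import pandas as pd

# Hypothetical sample data; the real df and bad_outcomes come from the question.
df = pd.DataFrame({"Outcome": ["True ASDS", "False ASDS", "None None", "True RTLS"]})
bad_outcomes = {"False ASDS", "None None"}

# One 0/1 label per row: 0 if the outcome is in the set, 1 otherwise.
landing_class = [0 if outcome in bad_outcomes else 1 for outcome in df["Outcome"]]

print(landing_class)  # [1, 0, 0, 1]
```

The result is a plain Python list with one entry per row, so it can be stored as a column with df['landing_class'] = landing_class if needed.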
CodePudding user response:
Try using .loc:
df.loc[df['Outcome'].isin(bad_outcomes), "landing_outcome"] = 0
df.loc[~df['Outcome'].isin(bad_outcomes), "landing_outcome"] = 1
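A self-contained sketch of this approach, using made-up sample data (the Outcome values, the contents of bad_outcomes, and the helper names is_bad, has_bad and landing_outcome_int are assumptions for illustration). It also hints at why the original if failed: isin returns one boolean per row, so there is no single truth value to test. The mask has to be used to select rows, or reduced to one value with something like .any().

```python
import pandas as pd

# Hypothetical sample data; the real df and bad_outcomes come from the question.
df = pd.DataFrame({"Outcome": ["True ASDS", "False ASDS", "None None", "True RTLS"]})
bad_outcomes = {"False ASDS", "None None"}

# Boolean Series with one True/False per row.
is_bad = df["Outcome"].isin(bad_outcomes)

# Row-wise assignment with .loc, as in the answer above.
df.loc[is_bad, "landing_outcome"] = 0
df.loc[~is_bad, "landing_outcome"] = 1

# Alternative single step: invert the mask and cast to int (True -> 1, False -> 0).
df["landing_outcome_int"] = (~is_bad).astype(int)

# If a single yes/no answer is wanted for an if statement, reduce the mask first.
has_bad = is_bad.any()
if has_bad:
    print("At least one row has a bad outcome")
```

One thing to watch: because the first .loc assignment creates the column with missing values in the unmatched rows, landing_outcome may end up with a floating-point dtype (0.0 and 1.0). The astype(int) variant avoids that.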