Clear column based on a value appearing in another column

Time:04-11

I feel like this should be really simple but I have been stuck on this all day.

I have a dataframe that looks like:

Name    Date    Graduated    Really?
Bob     2014    Yes
Bob     2020                 Yes
Sally   1995    Yes
Sally   1999
Sally   1999                 No
Robert  2005                 Yes
Robert  2005    Yes

I am grouping by Name and Date. In each group, if "Yes" appears in Graduated, then clear the Really? column. And if "Yes" doesn't appear in the group, then leave as is. The output should look like:

Name    Date    Graduated    Really?
Bob     2014    Yes
Bob     2020                 Yes
Sally   1995    Yes
Sally   1999
Sally   1999                 No
Robert  2005
Robert  2005    Yes

I keep trying different variations of mask = df.groupby(['Name','Date'])['Graduated'].isin('Yes') before doing df.loc[mask, "Really?"] = None, but I receive an AttributeError (I assume my syntax is incorrect).

Edited expected output.

CodePudding user response:

Try this:

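# s is True for every row whose (Name, Date) group contains a "Yes" in Graduated
s = df['Graduated'].eq('Yes').groupby([df['Name'],df['Date']]).transform('any')
# blank out Really? on those rows; other rows are left as they are
df['Really?'] = df['Really?'].mask(s,'')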
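
To check this against the sample data, here is a self-contained sketch of the same approach. It assumes the blank cells are empty strings and that Date is stored as integers, neither of which is stated in the question; the names is_yes and group_has_yes are only illustrative. The original attempt most likely fails because a grouped Series does not expose an isin method, so the comparison is done on the plain column before grouping.

```python
import pandas as pd

# Assumption: blank cells in the question are empty strings, Date is an int.
df = pd.DataFrame({
    "Name": ["Bob", "Bob", "Sally", "Sally", "Sally", "Robert", "Robert"],
    "Date": [2014, 2020, 1995, 1999, 1999, 2005, 2005],
    "Graduated": ["Yes", "", "Yes", "", "", "", "Yes"],
    "Really?": ["", "Yes", "", "", "No", "Yes", ""],
})

# True where the row's own Graduated value is "Yes".
is_yes = df["Graduated"].eq("Yes")

# Broadcast "any Yes in this (Name, Date) group" back onto every row of the group.
group_has_yes = is_yes.groupby([df["Name"], df["Date"]]).transform("any")

# Blank out Really? only in groups that contain a "Yes"; other rows are untouched.
df["Really?"] = df["Really?"].mask(group_has_yes, "")

print(df)
```

If the blanks are actually NaN rather than empty strings, eq('Yes') still evaluates to False for them, so the grouping logic should carry over unchanged; only the replacement value passed to mask would need to match whatever "empty" means in your frame.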