Home > Software design >  Want to update column Values for each occurrence of column AppName where at least one of the values
Want to update column Values for each occurrence of column AppName where at least one of the values

Time:04-30

I have a table

AppName Values
JBX Yes
JBX No
JBX Yes
ABD No
ABD No
CBR Yes
LBA No
LBA Yes

And I want to update this table so that for each occurrence of AppName where at least one of the values in the "Values" column is 'Yes' then we update all of the values for that specific AppName to "Yes".

Updated Table:

AppName Values
JBX Yes
JBX Yes
JBX Yes
ABD No
ABD No
CBR Yes
LBA Yes
LBA Yes

I tried

        if df2['Values'] == 'Yes':
           df2['Values'] == 'Yes'  ```
But this throws a ```ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().``` for ```if df2['Values'] == 'Yes':```

CodePudding user response:

Approach using groupby.transform, placing results into a new column for easy comparison with original Values column:

df['app_has_any_yes'] = (df.groupby('AppName')['Values']
                           .transform(lambda x: (x == 'Yes').any())
                           .replace({True: 'Yes', False: 'No'})

Result:

AppName Values app_has_any_yes
JBX Yes Yes
JBX No Yes
JBX Yes Yes
ABD No No
ABD No No
CBR Yes Yes
LBA No Yes
LBA Yes Yes

CodePudding user response:

You can try groupby then apply

out = df.groupby('AppName').apply(lambda g: g.assign(Values=['Yes']*len(g) if g['Values'].eq('Yes').any() else ['No']*len(g)))
print(out)

  AppName Values
0     JBX    Yes
1     JBX    Yes
2     JBX    Yes
3     ABD     No
4     ABD     No
5     CBR    Yes
6     LBA    Yes
7     LBA    Yes

CodePudding user response:

Conveniently, "Yes" is lexicographically sorted after "No", so Yes>No and you can do:

df['Values'] = df.groupby('AppName')['Values'].transform('max')

NB. this is working for this particular example, but you can use the same logic with other words, for example in German "Ja" (Yes) is before "Nein" (No), so you would use transform('min') in this case

output:

  AppName Values
0     JBX    Yes
1     JBX    Yes
2     JBX    Yes
3     ABD     No
4     ABD     No
5     CBR    Yes
6     LBA    Yes
7     LBA    Yes
  • Related