How would I groupby and see if all members of the group meet a certain condition?

Time:08-06

I want to groupby and see if all members in the group meet a certain condition. Here's a dummy example:

import pandas as pd

x = ['Mike','Mike','Mike','Bob','Bob','Phil']
y = ['Attended','Attended','Attended','Attended','Not attend','Not attend']

df = pd.DataFrame({'name':x,'attendance':y})

What I want is a 3x2 dataframe that shows, for each name, whether that person was always in attendance. It should look like this:

new_df = pd.DataFrame({'name':['Mike','Bob','Phil'],'all_attended':[True,False,False]})

What's the best way to do this?

Thanks so much.

CodePudding user response:

Let's try

out = (df['attendance'].eq('Attended')
       .groupby(df['name']).all()
       .to_frame('all_attended').reset_index())
print(out)

   name  all_attended
0   Bob         False
1  Mike          True
2  Phil         False
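Note that groupby sorts the group keys by default, so the rows above come out as Bob, Mike, Phil rather than in the Mike, Bob, Phil order from the question. If the order matters, a minimal sketch (assuming the same dummy data as in the question) is to pass sort=False so groups keep their order of first appearance:

```python
import pandas as pd

# Same dummy data as in the question
x = ['Mike', 'Mike', 'Mike', 'Bob', 'Bob', 'Phil']
y = ['Attended', 'Attended', 'Attended', 'Attended', 'Not attend', 'Not attend']
df = pd.DataFrame({'name': x, 'attendance': y})

# Compare first, then group the boolean Series by name.
# sort=False keeps groups in order of first appearance instead of sorting keys.
out = (df['attendance'].eq('Attended')
       .groupby(df['name'], sort=False).all()
       .to_frame('all_attended').reset_index())
print(out)
```

This should give one row per name with the names in the order Mike, Bob, Phil.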

CodePudding user response:

One way could be:

df.groupby('name')['attendance'].apply(lambda x: x.eq('Attended').all())

name
Bob     False
Mike     True
Phil    False
Name: attendance, dtype: bool

CodePudding user response:

I would stay away from strings for data that does not need to be a string:

z = [s == 'Attended' for s in y]
df = pd.DataFrame({'name': x, 'attended': z})

Now you can check if all the elements for a given group are True:

>>> df.groupby('name')['attended'].all()
name
Bob     False
Mike     True
Phil    False
Name: attended, dtype: bool

If something can only be 0 or 1, using a string introduces the possibility of errors, because someone might type 'Atended' instead of 'Attended', for example. A plain comparison against 'Attended' would then silently count that row as not attended.
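If the data does arrive as strings, one possible way to catch such typos is to convert through an explicit mapping and look for values that did not map. This is only a sketch: the to_bool mapping and the sample list with the deliberate typo are made up for illustration.

```python
import pandas as pd

# Hypothetical input; the third entry contains a deliberate typo
status = pd.Series(['Attended', 'Attended', 'Atended', 'Not attend'])

# Hypothetical mapping of the only two labels we expect to see
to_bool = {'Attended': True, 'Not attend': False}

# Labels missing from the mapping become NaN instead of silently turning into False
mapped = status.map(to_bool)

# Collect the distinct labels that could not be mapped
unknown = status[mapped.isna()].drop_duplicates().tolist()
print(unknown)

# For comparison, a plain equality check hides the typo as an ordinary False
print(status.eq('Attended').tolist())
```

Checking that unknown is empty before grouping makes the typo visible instead of letting it pass as an absence.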
