I have a dataframe like so:
df = pd.DataFrame(
{
"day": ["Mon", "Mon", "Mon", "Mon", "Tues", "Tues", "Tues", "Tues"],
"name": ["James", "Rover", "Cleo", "X", "Bran", "Excaliber", "Henry", "Mia"],
"species": ['dog', 'dog', 'cat', 'cat', 'dog', 'dog', 'cat', 'cat'],
"fleas": [0, 1, 2, 4, 6, 7, 8, 3],
}
)
>>> df
day name species fleas
0 Mon James dog 0
1 Mon Rover dog 1
2 Mon Cleo cat 2
3 Mon X cat 4
4 Tues Bran dog 6
5 Tues Excaliber dog 7
6 Tues Henry cat 8
7 Tues Mia cat 3
Each row corresponds to a measurement on a particular day of how many fleas were on a particular animal. What I want to do is groupby the day and then sum all the fleas that were on the dogs, and create a new column dog_fleas
with the result. The result of this operation should look like
day name species fleas dog_fleas
0 Mon James dog 0 1
1 Mon Rover dog 1 1
2 Mon Cleo cat 2 1
3 Mon X cat 4 1
4 Tues Bran dog 6 13
5 Tues Excaliber dog 7 13
6 Tues Henry cat 8 13
7 Tues Mia cat 3 13
How do I do this in Pandas?
CodePudding user response:
Let us try transform
with where
df['new'] = df['fleas'].where(df['species']=='dog').groupby(df['day']).transform('sum')
df
Out[88]:
day name species fleas new
0 Mon James dog 0 1.0
1 Mon Rover dog 1 1.0
2 Mon Cleo cat 2 1.0
3 Mon X cat 4 1.0
4 Tues Bran dog 6 13.0
5 Tues Excaliber dog 7 13.0
6 Tues Henry cat 8 13.0
7 Tues Mia cat 3 13.0
CodePudding user response:
Groupby day
column, and apply a function to count dogs
, convert to a dataframe, and merge back on day
column.
>>> df.merge(
df
.groupby('day')
.apply(lambda x: x.loc[x['species'].eq('dog'), 'fleas'].sum())
.to_frame('dog_fleas'),
on='day')
day name species fleas dog_fleas
0 Mon James dog 0 1
1 Mon Rover dog 1 1
2 Mon Cleo cat 2 1
3 Mon X cat 4 1
4 Tues Bran dog 6 13
5 Tues Excaliber dog 7 13
6 Tues Henry cat 8 13
7 Tues Mia cat 3 13
CodePudding user response:
Grouping by day
, and calculating the sum of fleas on the dogs for that day:
def f(df):
return pd.Series(df.loc[df['species'].eq('dog'), 'fleas'].sum(),
index=df.index)
df['dog_fleas'] = df.groupby('day', group_keys=False).apply(f)
day name species fleas dog_fleas
0 Mon James dog 0 1
1 Mon Rover dog 1 1
2 Mon Cleo cat 2 1
3 Mon X cat 4 1
4 Tues Bran dog 6 13
5 Tues Excaliber dog 7 13
6 Tues Henry cat 8 13
7 Tues Mia cat 3 13