I have the following data frame:
Names Counts Year
0 Jordan 1043 2000
1 Steve 204 2000
2 Brock 3 2000
3 Steve 33 2000
4 Mike 88 2000
... ... ... ...
20001 Bryce 2 2015
20002 Steve 11 2015
20003 Penny 24 2015
20004 Steve 15 2015
20005 Penny 5 2015
I want to add up all the counts for each name if they appear multiple times in a year. An example of the output might look like:
Names Counts Year
0 Jordan 1043 2000
1 Steve 237 2000
2 Brock 3 2000
3 Mike 88 2000
... ... ... ...
20001 Bryce 2 2015
20002 Steve 26 2015
20003 Penny 29 2015
I've tried the following:
(df[df['Names'].eq('Steve')].groupby('Year').agg({'Names': 'first', 'Counts': sum}).reset_index())
Which returns the following for individual names, but it's not what I'm looking for.
Year Names Counts
0 2000 Steve 237
1 2015 Steve 26
CodePudding user response:
Try
df['Counts'] = df.groupby(['Names','Year'])['Counts'].transform('sum')
CodePudding user response:
The code that you shared looks like it is filtering the "Names" column for only the value "Steve". The below code will group the unique pairs of "Name" & "Year" combinations and sum all related "Counts" values.
tempdf = df.groupby(['Names',"Year"])['Counts'].sum().reset_index()