How to sum up individual columns if they have the same value in a different column?-CodePudding

I have the following data frame:

            Names      Counts  Year
0           Jordan        1043  2000
1            Steve         204  2000
2            Brock           3  2000
3            Steve          33  2000
4             Mike          88  2000
...           ...         ...   ...
20001        Bryce           2  2015
20002        Steve          11  2015
20003        Penny          24  2015
20004        Steve          15  2015
20005        Penny           5  2015

I want to add up all the counts for each name if they appear multiple times in a year. An example of the output might look like:

            Names      Counts  Year
0           Jordan        1043  2000
1            Steve         237  2000
2            Brock           3  2000
3             Mike          88  2000
...           ...         ...   ...
20001        Bryce           2  2015
20002        Steve          26  2015
20003        Penny          29  2015

I've tried the following:

(df[df['Names'].eq('Steve')].groupby('Year').agg({'Names': 'first', 'Counts': sum}).reset_index())

Which returns the following for individual names, but it's not what I'm looking for.

   Year  Names  Counts
0  2000  Steve     237
1  2015  Steve      26

CodePudding user response：

Try

df['Counts'] = df.groupby(['Names','Year'])['Counts'].transform('sum')

CodePudding user response：

The code that you shared looks like it is filtering the "Names" column for only the value "Steve". The below code will group the unique pairs of "Name" & "Year" combinations and sum all related "Counts" values.

tempdf = df.groupby(['Names',"Year"])['Counts'].sum().reset_index()