Home > Software design >  How to sum up individual columns if they have the same value in a different column?
How to sum up individual columns if they have the same value in a different column?

Time:09-28

I have the following data frame:

            Names      Counts  Year
0           Jordan        1043  2000
1            Steve         204  2000
2            Brock           3  2000
3            Steve          33  2000
4             Mike          88  2000
...           ...         ...   ...
20001        Bryce           2  2015
20002        Steve          11  2015
20003        Penny          24  2015
20004        Steve          15  2015
20005        Penny           5  2015

I want to add up all the counts for each name if they appear multiple times in a year. An example of the output might look like:

            Names      Counts  Year
0           Jordan        1043  2000
1            Steve         237  2000
2            Brock           3  2000
3             Mike          88  2000
...           ...         ...   ...
20001        Bryce           2  2015
20002        Steve          26  2015
20003        Penny          29  2015

I've tried the following:

(df[df['Names'].eq('Steve')].groupby('Year').agg({'Names': 'first', 'Counts': sum}).reset_index())

Which returns the following for individual names, but it's not what I'm looking for.

   Year  Names  Counts
0  2000  Steve     237
1  2015  Steve      26

CodePudding user response:

Try

df['Counts'] = df.groupby(['Names','Year'])['Counts'].transform('sum')

CodePudding user response:

The code that you shared looks like it is filtering the "Names" column for only the value "Steve". The below code will group the unique pairs of "Name" & "Year" combinations and sum all related "Counts" values.

tempdf = df.groupby(['Names',"Year"])['Counts'].sum().reset_index()
  • Related