Home > Software engineering >  Get last value from pandas groupeddataframe, summed by another column
Get last value from pandas groupeddataframe, summed by another column

Time:05-05

I have the following dataframe

x = pd.DataFrame(
  {
    'FirstGroupCriterium': [1,1,2,2,3],
    'SortingCriteria': [1,1,1,2,1],
    'Value': [10,20,30,40,50]
  }
)
x.sort_values('SortingCriteria').groupby('FirstGroupCriterium').agg(last_value=('Value', 'last'))

The latter outputs:

FirstGroupCriterium last_value
1 20
2 40
3 50

What I would like to have, is to sum up the last value, based on the last SortingCriteria. So in this case:

FirstGroupCriterium last_value
1 10 20 = 30
2 40
3 50

My initial idea was to call a custom aggregator function that groups the data yet again, but that fails.

def last_value(group):
  return group.groupby('SortingCriteria')['Value'].sum().tail(1)

Do you have any idea how to get this to work? Thank you!

CodePudding user response:

Sorting by both columns first, then filter last rows per FirstGroupCriterium in GroupBy.transform and aggregate sum:

df = x.sort_values(['FirstGroupCriterium','SortingCriteria'])

df1 = df[df['SortingCriteria'].eq(df.groupby('FirstGroupCriterium')['SortingCriteria'].transform('last'))]
print (df1)
   FirstGroupCriterium  SortingCriteria  Value
0                    1                1     10
1                    1                1     20
3                    2                2     40
4                    3                1     50
   
   
df2 = df1.groupby(['FirstGroupCriterium'],as_index=False)['Value'].sum()
print (df2)
   FirstGroupCriterium  Value
0                    1     30
1                    2     40
2                    3     50

Anoter idea is aggregate sum by both columns and then remove duplicates with keep last row by DataFrame.drop_duplicates:

df2 = (df.groupby(['FirstGroupCriterium','SortingCriteria'],as_index=False)['Value'].sum()
         .drop_duplicates(['FirstGroupCriterium'], keep='last'))
print (df2)

   FirstGroupCriterium  SortingCriteria  Value
0                    1                1     30
2                    2                2     40
3                    3                1     50
  • Related