pandas aggregate: column doesn't exist?

Currently I have a DataFrame that I am performing a group by on with aggregate functions. These are the functions:

aggregation_functions = {
    '12_months': 'sum',
    '24_months': 'sum',
    '36_months': 'sum',
    'number_36_months': 'sum'
}

When I do the group by, it drops an ID column that is classed as a "nuisance" column.

But when I add an aggregate function for this ID column, I get the error:

[ERROR] 03/25/2022 12:24:44 PM - Column(s) ['id'] do not exist

This is the aggregation I'm trying to add, and this is the group by:

'id': 'nunique'
final_df = df.groupby(['buy_country', 'buy_activity', 'vd_country', 'vd_activity'], as_index=False).aggregate(aggregation_functions)

The column does exist in the DataFrame df.

Does anyone know why it thinks the column doesn't exist, or how to get the aggregate function for this column to work?

Example of the data:

id	buy_country	buy_activity	vd_country	vd_activity	number_of_buyers	number_36_months
000002	GB	Not Matched	GB	Not Matched	1	1
000002	GB	Not Matched	GB	Not Matched	1	4
000002	GB	Not Matched	GB	Not Matched	1	2
000002	GB	Not Matched	GB	Not Matched	1	1

CodePudding user response:

Are you sure that id is a column and not an index?
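
One quick way to check, as a rough sketch: look at both the column labels and the index names. The small frame below is made up for illustration and assumes id ended up as the index (for example after a set_index call or an index_col argument when reading the file); your own df may differ.

```python
import pandas as pd

# Hypothetical frame in which 'id' has been moved into the index
df = pd.DataFrame(
    {
        "id": ["000002", "000002"],
        "buy_country": ["GB", "GB"],
        "number_36_months": [1, 4],
    }
).set_index("id")

# True only if 'id' is a regular column
id_is_column = "id" in df.columns
# True if 'id' is the name of the index (or of one index level)
id_is_index = "id" in df.index.names

print(df.columns.tolist())  # also helps spot stray spaces or different casing
print(id_is_column, id_is_index)
```

If the first check is False and the second is True, the aggregation dictionary cannot see id as a column, which would match the error in the question.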

You could try resetting the index of your DataFrame before you groupby:

df = df.reset_index()
final_df = df.groupby(['buy_country', 'buy_activity', 'vd_country', 'vd_activity'], as_index=False).aggregate(aggregation_functions)
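
Putting it together, here is a minimal sketch of the whole flow, assuming id really is the index in your DataFrame. The rows are invented and only loosely modelled on the sample in the question (a second id value is added so that nunique has something to count), and only one of the sum columns is included to keep it short.

```python
import pandas as pd

# Made-up rows modelled on the question's sample; 'id' starts out as the index
df = pd.DataFrame(
    {
        "id": ["000002", "000002", "000003", "000003"],
        "buy_country": ["GB"] * 4,
        "buy_activity": ["Not Matched"] * 4,
        "vd_country": ["GB"] * 4,
        "vd_activity": ["Not Matched"] * 4,
        "number_36_months": [1, 4, 2, 1],
    }
).set_index("id")

aggregation_functions = {
    "number_36_months": "sum",
    "id": "nunique",  # count distinct ids per group
}

# Turn the 'id' index back into a regular column so the dict key can be found
df = df.reset_index()

final_df = df.groupby(
    ["buy_country", "buy_activity", "vd_country", "vd_activity"],
    as_index=False,
).aggregate(aggregation_functions)

print(final_df)
```

With this toy data there is a single group, so final_df has one row with the summed number_36_months and the number of distinct id values. If id is already a regular column in your df, reset_index will not help and the cause is more likely a label mismatch (extra whitespace or different casing in the column name).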