Currently my csv looks like this:
title | field1 | field2 | field3 |
---|---|---|---|
A | A1 | A11 | 553 |
A | A1 | A12 | 94 |
A | A1 | A13 | 30 |
A | A2 | A21 | 200 |
A | A3 | A31 | 35 |
But I am wanting it to look like this:
title | field1 | field2 | field3 |
---|---|---|---|
A | A1 | A11 | 553 |
A | A1 | A12 | 94 |
A | A1 | A13 | 30 |
A | A1 | total | 657 |
A | A2 | A21 | 200 |
A | A2 | total | 200 |
A | A3 | A31 | 35 |
A | A3 | total | 35 |
This is my code:
def fun(df, cols_to_aggregate, cols_order):
df = df.groupby(['field1', 'field2'], as_index=False)\
.agg(cols_to_aggregate)
df['title'] = 'A'
df = df[cols_order]
return df
def create_csv(df, month_date):
cols_to_aggregate = {'field3': 'sum'}
cols_order = ['title', 'field1', 'field2', 'field3']
funCSV = fun(df, cols_to_aggregate, cols_order)
return funCSV
Any help would be appreciated as I can't figure out how to add the new row to the table. I tried this:
total = df.groupby('field2')['field3'].sum()
But it just adds the numbers to the end of the table instead of integrating them in the table with the relevant other fields.
CodePudding user response:
Use concat
with sorting by both columns with aggregate DataFrame:
def fun(df, cols_to_aggregate, cols_order):
df = df.groupby(['field1', 'field2'], as_index=False)\
.agg(cols_to_aggregate)
total = df.groupby('field1', as_index=False)['field3'].sum().assign(field2='total')
df = pd.concat([df, total]).sort_values(['field1','field2'], ignore_index=True)
df['title'] = 'A'
print (df)
df = df[cols_order]
return df