I've got a dataframe build fromRecords
a django queryset, that I pivot by 2 columns to get a dashboard view of it.
I manage to have the global sums byr row and column of the whole table, but I'm trying to get the sum by the first pivot columns (a sub total by row for the first column of each group).
I know absolutely nothing of pandas, but I'm learning.
My dataFrame looks like :
type amount source fund
0 Ressource Humaine CDD -36470.36 Expense fund2
1 Mission -1686.47 Expense fund2
2 Fonctionnement -817465.91 Expense fund1
3 Fonctionnement 1118691.65 Budget fund1
4 Fonctionnement -6000 Expense fund3
5 Fonctionnement -23621.83 Expense fund2
6 Frais de Gestion -53499 Expense fund2
7 Fonctionnement 15000 Budget fund3
8 Frais de Gestion 53499 Budget fund2
9 Fonctionnement 186718.78 Budget fund2
10 Mission 1686.47 Budget fund2
1 Ressource Humaine CDD 38676.53 Budget fund2
To get an overview of availability by funds in my dash board I pivot it like :
piv=cpd.pivot_table(index="type", columns=["fund","source"], values="amount", aggfunc='sum', margins=True, margins_name='Sum')
to get :
fund fund1 fund2 fund3
source Budget Expense Budget Expense Budget Expense
type
Fonctionnement 1118691.65 -817465.91 186718.78 -23621.83 15000.00 -6000.00
Frais de Gestion NaN NaN 53499.00 -53499.00 NaN NaN
Mission NaN NaN 1686.47 -1686.47 NaN NaN
Ressource Humaine CDD NaN NaN 38676.53 -36470.36 NaN NaN
(it misses the total here, but I've got it)
I would like to land on something like :
fund fund1 fund2 fund3
source Budget Expense total fund1 Budget Expense total fund2 Budget Expense total fund3
type
Fonctionnement 1118691.65 -817465.91 301 226€ 186718.78 -23621.83 163 097€ 15000.00 -6000.00 9 000€
Frais de Gestion NaN NaN NaN 53499.00 -53499.00 0 NaN NaN NaN
Mission NaN NaN NaN 1686.47 -1686.47 0 NaN NaN NaN
Ressource Humaine CDD NaN NaN NaN 38676.53 -36470.36 2 207€ NaN NaN NaN
I've seen some tips with pandas concat for multi index pivot (eg : Pivot table subtotals in Pandas)
I'm trying to loop by columns or read header or... but I can go further as I'm a deep noob!
How could I insert/append a mid column with sum, and how calculate this sub sum?
CodePudding user response:
You can do normal pivot and then compute/append the total sum:
# do a normal pivot
df = df.pivot_table(
index="type",
columns=["fund", "source"],
values="amount",
aggfunc="sum",
)
# compute "sum" dataframes
dfs = []
for c in df.columns.get_level_values(0).unique():
s = df.loc[:, c].sum(axis=1, skipna=False)
dfs.append(pd.DataFrame(s, index=s.index, columns=[(c, f"Total {c}")]))
# concat them together, sort the columns:
out = pd.concat([df, pd.concat(dfs, axis=1)], axis=1)
out = out[sorted(out.columns)]
print(out)
Prints:
fund fund1 fund2 fund3
source Budget Expense Total fund1 Budget Expense Total fund2 Budget Expense Total fund3
type
Fonctionnement 1118691.65 -817465.91 301225.74 186718.78 -23621.83 163096.95 15000.0 -6000.0 9000.0
Frais de Gestion NaN NaN NaN 53499.00 -53499.00 0.00 NaN NaN NaN
Mission NaN NaN NaN 1686.47 -1686.47 0.00 NaN NaN NaN
Ressource Humaine CDD NaN NaN NaN 38676.53 -36470.36 2206.17 NaN NaN NaN