Let's propose I have a pivot table that looks like this:
pd.pivot_table(
data,
columns=['A','B','C'],
values='widgets',
aggfunc='count'
).T
[Column] Count
A B C 1
D 2
E F 3
G 4
H I J 5
K L 6
What I want is:
A 10 B 3 C 1
D 2
E 7 F 3
G 4
H 11 I 11 J 5
L 6
with intermediary sums of each category in between the final count.
CodePudding user response:
Make sure index levels are named:
df = pd.DataFrame(
{'Count': [1, 2, 3, 4, 5, 6]},
pd.MultiIndex.from_tuples([
('A', 'B', 'C'),
('A', 'B', 'D'),
('A', 'E', 'F'),
('A', 'E', 'G'),
('H', 'I', 'J'),
('H', 'K', 'L')
], names=['One', 'Two', 'Three'])
)
df
Count
One Two Three
A B C 1
D 2
E F 3
G 4
H I J 5
K L 6
from functools import reduce
import pandas as pd
names = df.index.names
reduce(
pd.DataFrame.join,
[df.groupby(level=names[:i 1]).sum().add_suffix(f'_{names[i]}')
for i in range(df.index.nlevels)]
)
Count_One Count_Two Count_Three
One Two Three
A B C 10 3 1
D 10 3 2
E F 10 7 3
G 10 7 4
H I J 11 5 5
K L 11 6 6