pd.pivot_table(df1,index=['Staff'],columns=['Products'],values=['Price'],margins=True)
Results
Price
Products A B C All
Staff
Staff_A 14472.000000 15877.777778 14982.352941 14890.196078
Staff_B 14775.000000 15620.000000 16330.000000 15815.789474
Staff_C 15293.333333 14262.500000 15214.285714 15000.000000
All 14779.545455 15231.818182 15426.470588 15099.000000
The result can be shown successfully however I do not understand how the values from the column and row All
come from so can anybody explain how the value of sum comes from because I thought it should be the sum of the first three columns data added together?
CodePudding user response:
To illustrate:
import pandas as pd
import numpy as np
Create dummy dataframe:
df = pd.DataFrame({
'id': ['a', 'b', 'c', 'a', 'b', 'c'],
'stat': ['col1', 'col1', 'col1', 'col2', 'col2', 'col2'],
'val': [2, 4, 6, 6, 8, 10],
})
This will give us:
>>> df
id stat val
0 a col1 2
1 b col1 4
2 c col1 6
3 a col2 6
4 b col2 8
5 c col2 10
Pivoting without giving aggfunc
argument
table = pd.pivot_table(df, index='id', columns='stat', values='val', margins=True)
The above will give us:
>>> table
stat col1 col2 All
id
a 2 6 4.0
b 4 8 6.0
c 6 10 8.0
All 4 8 6.0
Notice that column All and row All both giving us the mean of the columns and rows, respectively.
Pivoting and using aggfunc=np.sum
:
new_table = pd.pivot_table(df, index='id', columns='stat', values='val', margins=True, aggfunc=np.sum)
This will give us:
>>> new_table
stat col1 col2 All
id
a 2 6 8
b 4 8 12
c 6 10 16
All 12 24 36