I am getting the above error if I add 'column' in the pivot table function of pandas. Without 'Column' it works fine.
df1 = {'Material Code':['A','B','C','D','E','F','G'], 'Material Desc':['ABC','XYZ','QWE','ERT','MNH','LKJ','HGF'], 'Vendor':['Vendor_1','Vendor_2','Vendor_3','Vendor_4','Vendor_5','Vendor_6','Vendor_7'],'Currency':['INR','INR','INR','INR','USD','USD','USD'],'Months':['JAN','FEB','MAR','JAN','FEB','JAN','APR'], 'GRN Quantity':[100, 200, 500, 400, 100, 100, 500], 'Purchase Price':[10, 15, 20, 45, 45, 12, 15]}
index_data = ['Material Code', 'Material Desc', 'Vendor', 'Currency']
column_data = 'Months'
pf = pd.pivot_table(data=df1,index=index_data, columns=column_data,values=values_data,aggfunc={"GRN Quantity":np.sum, "Purchase Price":np.sum}, fill_value=0).reset_index()
It would be great help, I am struck mid of a project. Thanks You
CodePudding user response:
You forgot to create the dataframe and set values_data
. pivot_table
requires a dataframe as data not a dict.
df1 = pd.DataFrame(df1)
index_data = ['Material Code', 'Material Desc', 'Vendor', 'Currency']
column_data = 'Months'
values_data = df1.columns.difference(index_data [column_data])
months = dict(zip(['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN', 'JUL', 'AUG',
'SEP', 'OCT', 'NOV', 'DEC'], range(12)))
pf = pd.pivot_table(data=df1, index=index_data, columns=column_data,
values=values_data, fill_value=0,
aggfunc={"GRN Quantity":np.sum, "Purchase Price":np.sum}) \
.swaplevel(axis=1).sort_index(level=0, axis=1, key=lambda x: [months[i] for i in x]).reset_index()
Output:
>>> pf
Months Material Code Material Desc Vendor Currency JAN FEB MAR APR
GRN Quantity Purchase Price GRN Quantity Purchase Price GRN Quantity Purchase Price GRN Quantity Purchase Price
0 A ABC Vendor_1 INR 100 10 0 0 0 0 0 0
1 B XYZ Vendor_2 INR 0 0 200 15 0 0 0 0
2 C QWE Vendor_3 INR 0 0 0 0 500 20 0 0
3 D ERT Vendor_4 INR 400 45 0 0 0 0 0 0
4 E MNH Vendor_5 USD 0 0 100 45 0 0 0 0
5 F LKJ Vendor_6 USD 100 12 0 0 0 0 0 0
6 G HGF Vendor_7 USD 0 0 0 0 0 0 500 15