I am looking to multiply all the column values by the column header in pandas dataframe which is shown in pivot table. Does anyone have a good solution for this?
df = pd.DataFrame({'ID':['one2', 'one3', 'one3', 'one4' ],
'Requested Volume in Million':[5.0, 6.0, 7.0, 2.2],
'BOX':[1,2,3,4],
'Date':['2018-06-14', '2016-12-06', '2017-01-05', '2016-11-22']})
df2=pd.pivot_table(df, columns ='Requested Volume in Million',index =['ID','Date'], values ='BOX', aggfunc ='count', margins=True, fill_value=0)
df2 = df2.apply(lambda x : x * float((x.name)[0]) if type(x.name[0]) != float else x, axis=0)
CodePudding user response:
You can do this with apply and a lambda. Each column name, which we cast to a float, can be accessed with .name. We use axis 1 to operate on the columns.
df = pd.DataFrame({'0.5':[0.5, 0.2, -0.2, 0.5 ],
'2':[10, 5, -5, 10]})
df = df.apply(lambda x : x * float(x.name), axis=0)
Output df:
0.5 2
0 0.25 20.0
1 0.10 10.0
2 -0.10 -10.0
3 0.25 20.0
Edit Pivot Table Solution:
import numpy as np
import pandas as pd
df = pd.DataFrame({5: {0: 20130320, 1: 20130320, 2: 20130320, 3: 20130320, 4: 20130320},
10: {0: 8, 1: 12, 2: 16, 3: 20, 4: 24},
20: {0: 1, 1: 3, 2: 5, 3: 7, 4: 9},
30: {0: 20, 1: 30, 2: 10, 3: 40, 4: 30},
40: {0: 400, 1: 500, 2: 500, 3: 200, 4: 300},
50: {0: 1000, 1: 1100, 2: 900, 3: 1300, 4: 800}})
df = pd.pivot_table(df, values=[30, 40, 50], index=[5], columns=[10], aggfunc='mean', margins=True)
df = df.apply(lambda x : x * float((x.name)[1]) if type(x.name[1]) != str else x, axis=0)
Output df:
30 40 50
10 8 12 16 20 24 All 8 12 16 20 24 All 8 12 16 20 24 All
5
20130320 20 30 10 40 30 26.0 400 500 500 200 300 380.0 1000 1100 900 1300 800 1020.0
All 20 30 10 40 30 26.0 400 500 500 200 300 380.0 1000 1100 900 1300 800 1020.0
Edit with OP's data:
df = pd.DataFrame({'PID':['CIRC', 'CIRC0006', 'CIRC0054', 'CIRC9876' ],
'Requested Volume in Million':[5.0, 6.0, 7.0, 2.2],
'LiN2 Position in Box':[1,2,3,4],
'Freezing Date':['2018-06-14', '2016-12-06', '2017-01-05', '2016-11-22']})
df2=pd.pivot_table(df, columns ='Requested Volume in Million',index =['PID','Freezing Date'], values ='LiN2 Position in Box', aggfunc ='count', margins=True, fill_value=0)
df3 = df2.apply(lambda x : x * float((x.name)) if type(x.name) != str else x, axis=0)
Ouput df2:
Requested Volume in Million 2.2 5.0 6.0 7.0 All
PID Freezing Date
CIRC 2018-06-14 0 1 0 0 1
CIRC0006 2016-12-06 0 0 1 0 1
CIRC0054 2017-01-05 0 0 0 1 1
CIRC9876 2016-11-22 1 0 0 0 1
All 1 1 1 1 4
Output df3:
Requested Volume in Million 2.2 5.0 6.0 7.0 All
PID Freezing Date
CIRC 2018-06-14 0.0 5.0 0.0 0.0 1
CIRC0006 2016-12-06 0.0 0.0 6.0 0.0 1
CIRC0054 2017-01-05 0.0 0.0 0.0 7.0 1
CIRC9876 2016-11-22 2.2 0.0 0.0 0.0 1
All 2.2 5.0 6.0 7.0 4