I have a dataframe following this pattern:
Measure | Key | mai/21 | jun/21 | jul/21 | ... | mar/22 | apr/22 |
---|---|---|---|---|---|---|---|
Units | 4040 | 100 | 102 | 104 | ... | 200 | 204 |
Total Value | 4040 | 10500 | 10700 | 11000 | ... | 22000 | 22200 |
Units | 3230 | 120 | 130 | 140 | ... | 230 | 240 |
Total Value | 3230 | 120000 | 130000 | 140000 | ... | 345000 | 360000 |
I need to get a average value per unit and per key, so for every key I have in the dataframe, I need to get the total value row and divide by units row, so I get the average value per unit of each month, and the results needs to follow the same as the input. Just like that:
Measure | Key | mai/21 | jun/21 | jul/21 | ... | mar/22 | apr/22 |
---|---|---|---|---|---|---|---|
Value per Units | 4040 | 105 | 104.90 | 105.77 | ... | 110 | 108.82 |
Value per Units | 3230 | 1000 | 1000 | 1000 | ... | 1500 | 1500 |
How can I do it?
Thanks.
CodePudding user response:
here is one way to do it, revised after the question was updated
i=0
df2=pd.DataFrame(columns=df.columns)
while (i < len(df)):
df2.loc[len(df2)] = ['Value per unit', df.iloc[i,1]] df.iloc[i 1,2:].divide(df.iloc[i,2:] ).tolist()
i =2
df2
Measure Key mai/21 jun/21 jul/21 mar/22 apr/22
0 Value per unit 4040 105.0 104.901961 105.769231 110.0 108.823529
1 Value per unit 3230 1000.0 1000.000000 1000.000000 1500.0 1500.000000
CodePudding user response:
Assuming "Units" and "Total Value" are repeating for different keys, you can do:
df2 = df.groupby('Key').apply(lambda x: (x.T['Total Value'] / x.T['Units']).drop('Key')).reset_index()
df_out = pd.concat([
pd.DataFrame({'Measure': ['Value per units'] * df2.shape[0]}),
df2
], axis=1)