Home > database >  How to add Multilevel Columns and create new column?
How to add Multilevel Columns and create new column?

Time:07-19

I am trying to create a "total" column in my dataframe

idx = pd.MultiIndex.from_product([['Room 1','Room 2', 'Room 3'],['on','off']])
df = pd.DataFrame([[1,4,3,6,5,15], [3,2,1,5,1,7]], columns=idx)

My dataframe

    Room 1  Room 2  Room 3
    on  off on  off on  off
0   1   4   3   6   5   15
1   3   2   1   5   1   7

For each room, I want to create a total column and then a on% column.

I have tried the following, however, it does not work.

df.loc[:, slice(None), "total" ] = df.xs('on', axis=1,level=1)   df.xs('off', axis=1,level=1) 

CodePudding user response:

Oof this was a roughie, but you can do it like this if you want to avoid loops. Worth noting it redefines your df twice because i need the total columns. Sorry about that, but is the best i could do. Also if you have any questions just comment.

df = pd.concat([y.assign(**{'Total {0}'.format(x 1): y.iloc[:,0]   y.iloc[:,1]})for x , y in df.groupby(np.arange(df.shape[1])//2,axis=1)],axis=1)


df = pd.concat([y.assign(**{'Percentage_Total{0}'.format(x 1): (y.iloc[:,0] / y.iloc[:,2])*100})for x , y in df.groupby(np.arange(df.shape[1])//3,axis=1)],axis=1)
print(df)

CodePudding user response:

This groups by the column's first index (rooms) and then loops through each group to add the total and percent on. The final step is to reindex using the unique rooms:

import pandas as pd

idx = pd.MultiIndex.from_product([['Room 1','Room 2', 'Room 3'],['on','off']])
df = pd.DataFrame([[1,4,3,6,5,15], [3,2,1,5,1,7]], columns=idx)

for room, group in df.groupby(level=0, axis=1):
    df[(room, 'total')] = group.sum(axis=1)
    df[(room, 'pct_on')] = group[(room, 'on')] / df[(room, 'total')]

result = df.reindex(columns=df.columns.get_level_values(0).unique(), level=0)

Output:

      Room 1                  Room 2                     Room 3                 
      on off total pct_on     on off total    pct_on     on off total pct_on
0      1   4     5    0.2      3   6     9  0.333333      5  15    20  0.250
1      3   2     5    0.6      1   5     6  0.166667      1   7     8  0.125

CodePudding user response:

Let us try something fancy ~

df.stack(0).eval('total=on   off \n on_pct=on / total').stack().unstack([1, 2])

     Room 1                   Room 2                      Room 3                  
     off   on total on_pct    off   on total    on_pct    off   on total on_pct
0    4.0  1.0   5.0    0.2    6.0  3.0   9.0  0.333333   15.0  5.0  20.0  0.250
1    2.0  3.0   5.0    0.6    5.0  1.0   6.0  0.166667    7.0  1.0   8.0  0.125
  • Related