Home > Software engineering >  Perform a calculation as a new column on multiindex dataframe
Perform a calculation as a new column on multiindex dataframe

Time:05-05

I have the following multiindex dataframe:

                                            MSISDN_COUNT
NAME            DIM1    SEGMENT 
JAN22           N       Detractor           307
                        Passive             198
                        Promoter            1256
                Y       Detractor           237
                        Passive             161
                        Promoter            1096
FEB22           N       Detractor           313
                        Passive             222
                        Promoter            1260
                Y       Detractor           261
                        Passive             169
                        Promoter            1155

I want to create a new column, which will be calculated for each NAME and for each DIM1.

The calculation is: ((Promoter - Detractor) / (Promoter Passive Detractor) )* 100

So the expected output is

                                            MSISDN_COUNT    OUT
NAME            DIM1    SEGMENT 
JAN22           N       Detractor           307             53.88
                        Passive             198
                        Promoter            1256
                Y       Detractor           237             57.49
                        Passive             161
                        Promoter            1096
FEB22           N       Detractor           313             52.76
                        Passive             222
                        Promoter            1260
                Y       Detractor           261             56.4
                        Passive             169
                        Promoter            1155

I can't figure out how to do this with the multi-level index dataframe.

Any help is appreciated!

CodePudding user response:

Use DataFrame.xs for select DataFrames for 3 level, apply your formula, add third level SEGMENT with value Detractor and assign back:

Promoter = df.xs('Promoter', level=2)
Detractor = df.xs('Detractor', level=2)
Passive = df.xs('Passive', level=2)

s = ((Promoter - Detractor) / (Promoter   Passive   Detractor) )* 100

df['OUT'] = s.assign(SEGMENT='Detractor').set_index('SEGMENT', append=True)
print (df)
                      MSISDN_COUNT        OUT
NAME  DIM1 SEGMENT                           
JAN22 N    Detractor           307  53.889835
           Passive             198        NaN
           Promoter           1256        NaN
      Y    Detractor           237  57.496653
           Passive             161        NaN
           Promoter           1096        NaN
FEB22 N    Detractor           313  52.757660
           Passive             222        NaN
           Promoter           1260        NaN
      Y    Detractor           261  56.403785
           Passive             169        NaN
           Promoter           1155        NaN

Details:

print (s)
            MSISDN_COUNT
NAME  DIM1              
JAN22 N        53.889835
      Y        57.496653
FEB22 N        52.757660
      Y        56.403785
      
print (s.assign(SEGMENT='Detractor').set_index('SEGMENT', append=True))
                      MSISDN_COUNT
NAME  DIM1 SEGMENT                
JAN22 N    Detractor     53.889835
      Y    Detractor     57.496653
FEB22 N    Detractor     52.757660
      Y    Detractor     56.403785
  • Related