Home > database >  pandas multiindex previous cell reference
pandas multiindex previous cell reference

Time:07-19

Given the following dataframe:

import pandas as pd

scenarios = ["sc1","sc2"]
products = ['p1', 'p2']
vars = ["qty"]
years = ["yr1","yr2","yr3"]
quarters = ["q1","q2","q3","q4"]

rows = pd.MultiIndex.from_product([scenarios, products, vars], names=['scenarios', 'products', 'vars'])
cols = pd.MultiIndex.from_product([years, quarters], names=['years', 'quarters'])
df1 = pd.DataFrame(index=rows, columns=cols)

df1.loc[("sc1","p1"),("yr1","q1")] = 10

df1

enter image description here

How would I go about calculating the qty of sc1/p1 for yr1/q2, yr1/q3... yr3/q4 so that ex. each new value is with 10% bigger than the previous?

Ex. yr1/q2 = yr1/q1 * 1.10 = 10 * 1.10 = 11
    yr1/q3 = yr1/q2 * 1.10 = 11 * 1.10 = 12.1
    ....

CodePudding user response:

You can fillna the NaN with your factor, then compute the cumprod:

df1.loc[('sc1', 'p1', 'qty')] = df1.loc[('sc1', 'p1', 'qty')].fillna(1.1).cumprod()

output:

years                     yr1                        yr2                     \
quarters                   q1    q2    q3     q4      q1       q2        q3   
scenarios products vars                                                       
sc1       p1       qty   10.0  11.0  12.1  13.31  14.641  16.1051  17.71561   
          p2       qty    NaN   NaN   NaN    NaN     NaN      NaN       NaN   
sc2       p1       qty    NaN   NaN   NaN    NaN     NaN      NaN       NaN   
          p2       qty    NaN   NaN   NaN    NaN     NaN      NaN       NaN   

years                                     yr3                                   
quarters                        q4         q1         q2         q3         q4  
scenarios products vars                                                         
sc1       p1       qty   19.487171  21.435888  23.579477  25.937425  28.531167  
          p2       qty         NaN        NaN        NaN        NaN        NaN  
sc2       p1       qty         NaN        NaN        NaN        NaN        NaN  
          p2       qty         NaN        NaN        NaN        NaN        NaN  
generalization to all rows
# starting from empty dataframe
df1[('yr1', 'q1')] = [10, 20, 30, 40]

df1 = df1.fillna(1.1).cumprod(axis=1)

output:

years                     yr1                        yr2                     \
quarters                   q1    q2    q3     q4      q1       q2        q3   
scenarios products vars                                                       
sc1       p1       qty   10.0  11.0  12.1  13.31  14.641  16.1051  17.71561   
          p2       qty   20.0  22.0  24.2  26.62  29.282  32.2102  35.43122   
sc2       p1       qty   30.0  33.0  36.3  39.93  43.923  48.3153  53.14683   
          p2       qty   40.0  44.0  48.4  53.24  58.564  64.4204  70.86244   

years                                     yr3                         \
quarters                        q4         q1         q2          q3   
scenarios products vars                                                
sc1       p1       qty   19.487171  21.435888  23.579477   25.937425   
          p2       qty   38.974342  42.871776  47.158954   51.874849   
sc2       p1       qty   58.461513  64.307664  70.738431   77.812274   
          p2       qty   77.948684  85.743552  94.317908  103.749698   

years                                
quarters                         q4  
scenarios products vars              
sc1       p1       qty    28.531167  
          p2       qty    57.062334  
sc2       p1       qty    85.593501  
          p2       qty   114.124668  
  • Related