Home > Blockchain >  creating multi index from data grouped by month in Pandas
creating multi index from data grouped by month in Pandas

Time:12-24

Consider this sample data:

Month  Location  Products    Sales  Profit
JAN      1        43          32     20  
JAN      2        82          54     25
JAN      3        64          43     56
FEB      1        37          28     78
FEB      2        18          15     34
FEB      3        5           2      4
MAR      1        47          40     14

The multi-index transformation I am trying to achieve is this:

          JAN                     FEB                     MAR
Location  Products Sales Profit   Products Sales Profit   Products Sales Profit
1         43       32    29       37       28    78       47       40    14
2         82       54    25       18       15    34       null     null  null
3         64       43    56       5        2     4        null     null  null

I tried versions of this:

df.stack().to_frame().T

It put all the data into one row. So, that's not the goal.
I presume I am close in that it should be a stacking or unstacking, melting or unmelting, but my attempts have all resulted in data oatmeal at this point. Appreciate your time trying to solve this one.

CodePudding user response:

Use pivot:

>>> df.pivot('Location', 'Month').swaplevel(axis=1).sort_index(axis=1)

Month         FEB                   JAN                   MAR             
         Products Profit Sales Products Profit Sales Products Profit Sales
Location                                                                  
1            37.0   78.0  28.0     43.0   20.0  32.0     47.0   14.0  40.0
2            18.0   34.0  15.0     82.0   25.0  54.0      NaN    NaN   NaN
3             5.0    4.0   2.0     64.0   56.0  43.0      NaN    NaN   NaN

To preserve order, you have to transform your Month column as CategoricalDtype before:

df['Month'] = df['Month'].astype(pd.CategoricalDtype(df['Month'].unique(), ordered=True))
out = df.pivot('Location', 'Month').swaplevel(axis=1).sort_index(axis=1)
print(out)

# Output:
Month         JAN                   FEB                   MAR             
         Products Profit Sales Products Profit Sales Products Profit Sales
Location                                                                  
1            43.0   20.0  32.0     37.0   78.0  28.0     47.0   14.0  40.0
2            82.0   25.0  54.0     18.0   34.0  15.0      NaN    NaN   NaN
3            64.0   56.0  43.0      5.0    4.0   2.0      NaN    NaN   NaN

Update 2

Try to force the order of level 2 columns:

df1 = df.set_index(['Month', 'Location'])
df1.columns = pd.CategoricalIndex(df1.columns, ordered=True)
df1 = df1.unstack('Month').swaplevel(axis=1).sort_index(axis=1)

CodePudding user response:

You can use pivot with reorder_levels and sort_index():

df.pivot(index='Location',columns='Month').reorder_levels(order=[1,0],axis=1).sort_index(axis=1)

Month         FEB                   JAN                   MAR             
         Products Profit Sales Products Profit Sales Products Profit Sales
Location                                                                  
1            37.0   78.0  28.0     43.0   20.0  32.0     47.0   14.0  40.0
2            18.0   34.0  15.0     82.0   25.0  54.0      NaN    NaN   NaN
3             5.0    4.0   2.0     64.0   56.0  43.0      NaN    NaN   NaN

In case you are interested, this answer elaborates between swaplevel and reoder_levels.

  • Related