creating multi index from data grouped by month in Pandas-CodePudding

Consider this sample data:

Month  Location  Products    Sales  Profit
JAN      1        43          32     20  
JAN      2        82          54     25
JAN      3        64          43     56
FEB      1        37          28     78
FEB      2        18          15     34
FEB      3        5           2      4
MAR      1        47          40     14

The multi-index transformation I am trying to achieve is this:

          JAN                     FEB                     MAR
Location  Products Sales Profit   Products Sales Profit   Products Sales Profit
1         43       32    29       37       28    78       47       40    14
2         82       54    25       18       15    34       null     null  null
3         64       43    56       5        2     4        null     null  null

I tried versions of this:

df.stack().to_frame().T

It put all the data into one row. So, that's not the goal.
I presume I am close in that it should be a stacking or unstacking, melting or unmelting, but my attempts have all resulted in data oatmeal at this point. Appreciate your time trying to solve this one.

CodePudding user response：

Use pivot:

>>> df.pivot('Location', 'Month').swaplevel(axis=1).sort_index(axis=1)

Month         FEB                   JAN                   MAR             
         Products Profit Sales Products Profit Sales Products Profit Sales
Location                                                                  
1            37.0   78.0  28.0     43.0   20.0  32.0     47.0   14.0  40.0
2            18.0   34.0  15.0     82.0   25.0  54.0      NaN    NaN   NaN
3             5.0    4.0   2.0     64.0   56.0  43.0      NaN    NaN   NaN

To preserve order, you have to transform your Month column as CategoricalDtype before:

df['Month'] = df['Month'].astype(pd.CategoricalDtype(df['Month'].unique(), ordered=True))
out = df.pivot('Location', 'Month').swaplevel(axis=1).sort_index(axis=1)
print(out)

# Output:
Month         JAN                   FEB                   MAR             
         Products Profit Sales Products Profit Sales Products Profit Sales
Location                                                                  
1            43.0   20.0  32.0     37.0   78.0  28.0     47.0   14.0  40.0
2            82.0   25.0  54.0     18.0   34.0  15.0      NaN    NaN   NaN
3            64.0   56.0  43.0      5.0    4.0   2.0      NaN    NaN   NaN

Update 2

Try to force the order of level 2 columns:

df1 = df.set_index(['Month', 'Location'])
df1.columns = pd.CategoricalIndex(df1.columns, ordered=True)
df1 = df1.unstack('Month').swaplevel(axis=1).sort_index(axis=1)

CodePudding user response：

You can use pivot with reorder_levels and sort_index():

df.pivot(index='Location',columns='Month').reorder_levels(order=[1,0],axis=1).sort_index(axis=1)

Month         FEB                   JAN                   MAR             
         Products Profit Sales Products Profit Sales Products Profit Sales
Location                                                                  
1            37.0   78.0  28.0     43.0   20.0  32.0     47.0   14.0  40.0
2            18.0   34.0  15.0     82.0   25.0  54.0      NaN    NaN   NaN
3             5.0    4.0   2.0     64.0   56.0  43.0      NaN    NaN   NaN

In case you are interested, this answer elaborates between swaplevel and reoder_levels.