Home > Software design >  Pivot from tabular to matrix with row and column multiindex and given order
Pivot from tabular to matrix with row and column multiindex and given order

Time:07-16

Given the following dataframe containing a matrix in tabular format:

values = list(range(5))
var1 = ['01', '02', '03', '0', '0']
var2 = ['a', 'b', 'c', 'd', 'e']
var3 = ['01', '02', '03', '0', '0']
var4 = ['a', 'b', 'c', 'd', 'e']
var5 = ['S1', 'S1','S1', 'S3', 'S2']
var6 = ['P1', 'P1','P1', 'P3', 'P2']

df = pd.DataFrame({'var1': var1,
                   'var2': var2,
                   'var3': var3,
                   'var4': var4,
                   'var5': var5,
                   'var6': var6,
                   'value': values})

And the following imposed order for var5 and var6:

var5_order = ['S1', 'S2', 'S3', 'S4']
var6_order = ['P1', 'P2', 'P3', 'P4']

How to pivot the dataframe in a way that var6, var1, and var2 (in this order) define a row multiindex and var5, var3, and var4 (in this order) define a column multiindex? In addition, how to impose var5_order and var6_order in the pivoted dataframe?

CodePudding user response:

Yes, you can do pd.Categorical with ordered=True:

df['var5'] = pd.Categorical(df['var5'], categories=var5_order, ordered=True)
df['var6'] = pd.Categorical(df['var6'], categories=var6_order, ordered=True)

df.pivot_table(index=['var6','var1','var2'], 
               columns=['var5','var3','var4'], 
               values='value')

Output:

var5             S1             S2   S3
var3             01   02   03    0    0
var4              a    b    c    e    d
var6 var1 var2                         
P1   01   a     0.0  NaN  NaN  NaN  NaN
     02   b     NaN  1.0  NaN  NaN  NaN
     03   c     NaN  NaN  2.0  NaN  NaN
P2   0    e     NaN  NaN  NaN  4.0  NaN
P3   0    d     NaN  NaN  NaN  NaN  3.0
  • Related