Home > Software engineering >  Split second level multindex column to create three level column in Pandas
Split second level multindex column to create three level column in Pandas

Time:05-15

Given a multiindex df

       X                     
  E1_ex0 E1_ex2 E2_ex0 E4_ex0
0      3      4      1      1
1      4      3      2      0

I would like to split the second level at _ and push the second item as the third level as below

       X            
       E1      E2  E4
      ex0 ex2 ex0 ex0
    0      3      4      1      1
    1      4      3      2      0

May I know how to do this?

The original df can be created as per the code below

arr = np.random.randint(5, size=(2, 4))
tuples = [('X', 'E1_ex0'), ('X', 'E1_ex2'), ('X', 'E2_ex0'), ('X', 'E4_ex0')]
df = pd.DataFrame(data=arr, columns=pd.MultiIndex.from_tuples(tuples))

I try to find on the net, but unable to find relevant resource.

I think the closest example is as in this OP, but I am struggling to split only the second level

CodePudding user response:

You can loop the multiindex columns and construct a third level indexing by unpacking the split list.

tuples = [(index[0], *index[1].split('_')) for index in df.columns]
df2 = pd.DataFrame(data=df.values, columns=pd.MultiIndex.from_tuples(tuples))

# or without creating new DataFrame

df.columns = pd.MultiIndex.from_tuples([(index[0], *index[1].split('_')) for index in df.columns])
print(df2)

    X
   E1      E2  E4
  ex0 ex2 ex0 ex0
0   1   0   3   3
1   0   1   4   3
  • Related