Given a multiindex df
X
E1_ex0 E1_ex2 E2_ex0 E4_ex0
0 3 4 1 1
1 4 3 2 0
I would like to split the second level at _
and push the second item as the third level as below
X
E1 E2 E4
ex0 ex2 ex0 ex0
0 3 4 1 1
1 4 3 2 0
May I know how to do this?
The original df
can be created as per the code below
arr = np.random.randint(5, size=(2, 4))
tuples = [('X', 'E1_ex0'), ('X', 'E1_ex2'), ('X', 'E2_ex0'), ('X', 'E4_ex0')]
df = pd.DataFrame(data=arr, columns=pd.MultiIndex.from_tuples(tuples))
I try to find on the net, but unable to find relevant resource.
I think the closest example is as in this OP, but I am struggling to split only the second level
CodePudding user response:
You can loop the multiindex columns and construct a third level indexing by unpacking the split list.
tuples = [(index[0], *index[1].split('_')) for index in df.columns]
df2 = pd.DataFrame(data=df.values, columns=pd.MultiIndex.from_tuples(tuples))
# or without creating new DataFrame
df.columns = pd.MultiIndex.from_tuples([(index[0], *index[1].split('_')) for index in df.columns])
print(df2)
X
E1 E2 E4
ex0 ex2 ex0 ex0
0 1 0 3 3
1 0 1 4 3