Home > Mobile >  Create multiindex columns based on column names and positions
Create multiindex columns based on column names and positions

Time:10-08

I have a pandas dataframe with column names like:

id name ... class maths_marks1 maths_marks2 eng_marks1 eng_marks2

Now I want to create Multiindex columns like: first column is id so it remains as is, then maths and eng in level0 while marks1 and marks2 in level1, and the rest of the columns under student_data. So my final dataframe should look like:

   student_data   maths         eng
id name ... class marks1 marks2 marks1 marks2

Do note that ... indicate that there are a lot of columns between name and class, therefore I would like to specify them by position and not name (something like starting from the second column to the column just before the one which contains maths).

What's the best way to achieve this?

CodePudding user response:

Prepend student_data_ in each col which is not contain '_' and split columns to convert them as MultiIndex:

df.columns = ['student_data_'   col if '_' not in col else col
                  for col in df.rename(columns={'id': '_id'}).columns]
df.columns = df.columns.str.rsplit('_', 1).map(tuple)
# Before
>>> df
   id   name class  maths_marks1  maths_marks2  eng_marks1  eng_marks2
0   1  Louis    CP            20            19          18          17

# After
>>> df
     student_data        maths           eng       
  id         name class marks1 marks2 marks1 marks2
0  1        Louis    CP     20     19     18     17
  • Related