Home > Blockchain >  How to change the entry of a MultiIndex columns pandas DataFrame?
How to change the entry of a MultiIndex columns pandas DataFrame?

Time:11-17

I have a dataframe df such that df.columns returns

   MultiIndex([( 'a', 's1', 'm/s'),
            ( 'a', 's2',   '%'),
            ( 'a', 's3',  '°C'),
            ('b', 'z3',  '°C'),
            ('b', 'z4', 'kPa')],
           names=['kind', 'names', 'units'])

How to change ONLY the column name ('b', 'z3', '°C') into ('b', 'z3', 'degC')?

At the moment I am trying the following

        old_cols_name = ('b', 'z3',  '°C')
        new_cols_name = list(old_cols_name)
        new_cols_name[2] = "degC"
        df = df_temp.rename(
            columns={old_cols_name: new_cols_name},
        )

But it does not work in the sense that the columns names of df are left unchanged.

EDIT: Question slightly changed to add more generality.

CodePudding user response:

You can't modify a MultiIndex, so you will have to recreate it.

An handy way might be to transform back and forth to DataFrame.

Assuming idx the MultiIndex:

new_idx = pd.MultiIndex.from_frame(idx.to_frame().replace('°C', 'degC'))

Or use the DataFrame constructor:

new_idx = pd.DataFrame(index=idx).rename({'°C': 'degC'}).index

Note that you can limit to a given level with:

new_idx = pd.Series(index=idx).rename({'°C': 'degC'}, level='units').index

Output:

MultiIndex([('a', 's1',  'm/s'),
            ('a', 's2',    '%'),
            ('a', 's3', 'degC'),
            ('b', 'z3', 'degC'),
            ('b', 'z4',  'kPa')],
           names=['kind', 'names', 'units'])

CodePudding user response:

I believe changing the desired column alone would work:

df=pd.DataFrame(columns= pd.MultiIndex.from_tuples([( 'a', 's1', 'm/s'),
               ( 'a', 's2',   '%'),
               ( 'a', 's3',  'degC'),
               ('b', 'z3',  '°C'),
               ('b', 'z4', 'kPa')],
           names=['kind', 'names', 'units']))
MultiIndex([('a', 's1',  'm/s'),
            ('a', 's2',    '%'),
            ('a', 's3', 'degC'),
            ('b', 'z3',   '°C'),
            ('b', 'z4',  'kPa')],
           names=['kind', 'names', 'units'])

using:

df = df.rename(columns={'°C': 'degC'}, level=2)
MultiIndex([('a', 's1',  'm/s'),
            ('a', 's2',    '%'),
            ('a', 's3', 'degC'),
            ('b', 'z3', 'degC'),
            ('b', 'z4',  'kPa')],
           names=['kind', 'names', 'units'])

In the edited question you mentioned you only want to change one of the '°C' to 'degC'. This is not possible as in MultiIndex the two '°C' are considered as the same label. Essentially the structure of the MultiIndex would need to be changed, instead of just a name change. To do that you have to reconstruct the MultiIndex:

new_idx = df.columns.to_numpy()
new_idx[3] = ('b', 'z3',  'degC')
df.columns = pd.MultiIndex.from_tuples(new_idx, names = df.columns.names)
  • Related