Rename specific columns in MultiIndex dataframe

I have the following dataframe. It has two row index levels (samples and epochs) and two column index levels (kpi and model).

kpi            Accuracy             Precision            Recall             Training time (sec)                 Model memory (MB)               HE Memory (GB)         
model                M0    M1    M2        M0   M1   M2      M0    M1    M2                  M0      M1      M2                M0     M1     M2             M0       M1
samples epochs                                                                                                                                                         
675     3          0.96  0.52  1.00       1.0  0.0  1.0  0.9166  0.00  1.00              0.2124  0.2083  0.2080             0.417  0.417  0.417       0.553547   6.2009
        4          0.96  0.52  1.00       1.0  0.0  1.0  0.9166  0.00  1.00              0.2066  0.2123  0.2137             0.417  0.417  0.417       0.553547   6.2009
1950    3          0.98  0.96  0.98       1.0  1.0  1.0  0.9600  0.92  0.96              0.2132  0.2139  0.2136             0.417  0.417  0.417       1.664447  12.3319
        4          0.98  0.90  0.98       1.0  1.0  1.0  0.9600  0.80  0.96              0.2064  0.2166  0.2152             0.417  0.417  0.417       1.664447  12.3319

The code that builds it:

from itertools import zip_longest

import numpy as np
import pandas as pd

tuples = list(zip_longest(shape_ind, epoch_ind))
flat_list = flatten_list(kpi_values)
df = pd.DataFrame(np.reshape(flat_list, (len(kpi_values), -1)))
df.index = pd.MultiIndex.from_tuples(tuples, names=['samples', 'epochs'])

# Map each column position to a (kpi number, model number) pair
df.columns = pd.MultiIndex.from_arrays(np.divmod(df.columns, len(kpi_values[0][0])), names=['kpi', 'model'])

# Label the model level M0, M1, ...
df.rename(lambda x: f'M{x}', axis=1, level=1, inplace=True)

kpi = ['Accuracy', 'Precision', 'Recall', 'Training time (sec)', 'Model memory (MB)', 'HE Memory (GB)', 'HE gen. time (sec)']

# Replace the kpi numbers with their names
df.rename(mapper=lambda x: kpi[x], axis=1, level=0, inplace=True)

print(df)

I want to rename just the last 2 columns and create new groupings, so change from this:

HE Memory (GB)         
M0         M1                                                                                                                                                         
0.553547   6.2009
0.553547   6.2009
1.664447  12.3319
1.664447  12.3319

to this

HE Memory (GB)  HE gen. time (sec)      
                                   <--- note how M0 and M1 are gone                                                                                                                                    
0.553547        6.2009
0.553547        6.2009
1.664447        12.3319
1.664447        12.3319

How can I achieve this while retaining the structure of the original dataframe?

CodePudding user response：

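You can try the droplevel method: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.droplevel.html

df.droplevel(1, axis=1)

removes the model level from the columns. Note that it does so for every column, not just the last two, and that without axis=1 it drops the epochs row level instead.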
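
To see what droplevel actually does here, the following is a minimal sketch on a hypothetical four-column miniature of the frame in the question (the values are copied from the first sample, the reduced shape is an assumption for illustration). DataFrame.droplevel works on the row index by default, so the column axis has to be requested explicitly.

```python
import pandas as pd

# Hypothetical miniature of the frame from the question.
cols = pd.MultiIndex.from_tuples(
    [("Accuracy", "M0"), ("Accuracy", "M1"),
     ("HE Memory (GB)", "M0"), ("HE Memory (GB)", "M1")],
    names=["kpi", "model"],
)
rows = pd.MultiIndex.from_tuples([(675, 3), (675, 4)], names=["samples", "epochs"])
df = pd.DataFrame(
    [[0.96, 0.52, 0.553547, 6.2009],
     [0.96, 0.52, 0.553547, 6.2009]],
    index=rows,
    columns=cols,
)

# Default axis=0: removes the 'epochs' row level, columns are untouched.
rows_dropped = df.droplevel(1)

# axis=1: removes the 'model' level from every column, which leaves
# duplicate column labels ('Accuracy' twice, 'HE Memory (GB)' twice).
cols_dropped = df.droplevel(1, axis=1)
print(cols_dropped.columns.tolist())
```

Because the level is removed from all columns at once, this does not by itself give the mixed layout asked for, where only the last two columns lose their model label.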

CodePudding user response：

I ended up with a solution like this:

model_kpi = ['ACC', 'PRC', 'REC', 'TR_T', 'MM']#, 'HE_M', 'HE_GEN_TIME']
he_kpi = ['HE_M', 'HE_GEN_T']
kpi = [item for item in model_kpi for _ in range(len(kpi_values[0][0]))] + he_kpi
model = ['M' + str(i) for i in range(len(kpi_values[0][0]))] * len(model_kpi) + ['', '']
col_ind = list(zip(kpi, model))
row_ind = list(zip_longest(shape_ind, epoch_ind))
flat_list = flatten_list(kpi_values)
df = pd.DataFrame(np.reshape(flat_list, (len(kpi_values), -1)))
df.index = pd.MultiIndex.from_tuples(row_ind, names=['samples', 'epochs'])
df.columns = pd.MultiIndex.from_tuples(col_ind, names=['kpi', 'model'])
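
The solution above rebuilds the column index from scratch. If the frame already exists, a possible alternative is to remap only the column tuples that should change and leave the rest as they are. This is a sketch on a hypothetical four-column miniature of the original frame. The renames dictionary is an illustrative name, not something from the question, and the empty string is used as the model label, as in the solution above.

```python
import pandas as pd

# Hypothetical miniature of the original frame (values from the question).
cols = pd.MultiIndex.from_tuples(
    [("Accuracy", "M0"), ("Accuracy", "M1"),
     ("HE Memory (GB)", "M0"), ("HE Memory (GB)", "M1")],
    names=["kpi", "model"],
)
rows = pd.MultiIndex.from_tuples([(675, 3), (675, 4)], names=["samples", "epochs"])
df = pd.DataFrame(
    [[0.96, 0.52, 0.553547, 6.2009],
     [0.96, 0.52, 0.553547, 6.2009]],
    index=rows,
    columns=cols,
)

# Only the tuples listed here change; every other column passes through.
renames = {
    ("HE Memory (GB)", "M0"): ("HE Memory (GB)", ""),
    ("HE Memory (GB)", "M1"): ("HE gen. time (sec)", ""),
}

# Iterating over a MultiIndex yields one tuple per column.
df.columns = pd.MultiIndex.from_tuples(
    [renames.get(col, col) for col in df.columns],
    names=df.columns.names,
)
print(df)
```

The row index and the data are not touched, so the structure of the frame is kept; only the two column labels differ.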