I have the following dataframe. It has a two-level row index (samples, epochs) and a two-level column index (kpi, model).
kpi                Accuracy                   Precision                  Recall                     Training time (sec)        Model memory (MB)          HE Memory (GB)
model                  M0       M1       M2       M0       M1       M2       M0       M1       M2       M0       M1       M2       M0       M1       M2       M0       M1
samples epochs
675     3            0.96     0.52     1.00      1.0      0.0      1.0   0.9166     0.00     1.00   0.2124   0.2083   0.2080    0.417    0.417    0.417 0.553547   6.2009
        4            0.96     0.52     1.00      1.0      0.0      1.0   0.9166     0.00     1.00   0.2066   0.2123   0.2137    0.417    0.417    0.417 0.553547   6.2009
1950    3            0.98     0.96     0.98      1.0      1.0      1.0   0.9600     0.92     0.96   0.2132   0.2139   0.2136    0.417    0.417    0.417 1.664447  12.3319
        4            0.98     0.90     0.98      1.0      1.0      1.0   0.9600     0.80     0.96   0.2064   0.2166   0.2152    0.417    0.417    0.417 1.664447  12.3319
The code that builds it is:
from itertools import zip_longest
import numpy as np
import pandas as pd

# shape_ind, epoch_ind, kpi_values and flatten_list are not shown in this post
tuples = list(zip_longest(shape_ind, epoch_ind))
flat_list = flatten_list(kpi_values)
df = pd.DataFrame(np.reshape(flat_list, (len(kpi_values), -1)))
df.index = pd.MultiIndex.from_tuples(tuples, names=['samples', 'epochs'])
# divmod splits each flat column number into (kpi number, model number)
df.columns = pd.MultiIndex.from_arrays(np.divmod(df.columns, len(kpi_values[0][0])), names=['kpi', 'model'])
# level 1: 0, 1, 2 -> M0, M1, M2
df.rename(lambda x: f'M{x}', axis=1, level=1, inplace=True)
kpi = ['Accuracy', 'Precision', 'Recall', 'Training time (sec)', 'Model memory (MB)', 'HE Memory (GB)', 'HE gen. time (sec)']
# level 0: kpi number -> kpi name
df.rename(mapper=lambda x: kpi[x], axis=1, level=0, inplace=True)
print(df)
I want to rename only the last 2 columns so that each one becomes its own kpi group with no model label, i.e. change from this:
HE Memory (GB)
M0 M1
0.553547 6.2009
0.553547 6.2009
1.664447 12.3319
1.664447 12.3319
to this:
HE Memory (GB) HE gen. time (sec)
<--- note how M0 and M1 are gone
0.553547 6.2009
0.553547 6.2009
1.664447 12.3319
1.664447 12.3319
How can I achieve this while retaining the structure of the original dataframe?
CodePudding user response:
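You can try the droplevel method: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.droplevel.html
df.droplevel(1, axis=1)
drops the model level from the columns (without axis=1 it drops the epochs level from the row index instead). Note that this removes M0/M1/M2 from every column, not only the last two.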
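A small sketch of how droplevel behaves on a frame with two-level columns may help here. The toy frame and its values are made up for illustration and are not the data from the question.

```python
import pandas as pd

# Toy frame (hypothetical data) with two-level columns: kpi / model.
cols = pd.MultiIndex.from_tuples(
    [("Accuracy", "M0"), ("Accuracy", "M1"), ("HE Memory (GB)", "M0")],
    names=["kpi", "model"],
)
rows = pd.MultiIndex.from_tuples([(675, 3), (675, 4)], names=["samples", "epochs"])
toy = pd.DataFrame([[0.96, 0.52, 0.55], [0.96, 0.52, 0.55]], index=rows, columns=cols)

# With the default axis=0, a level is dropped from the row index.
rows_dropped = toy.droplevel(1)

# With axis=1, a level is dropped from the column index, for every column.
cols_dropped = toy.droplevel("model", axis=1)

print(rows_dropped)
print(cols_dropped)
```

Because the level is removed from all columns at once, the kpi labels end up repeated (one per model), so this is not a selective rename of the last two columns.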
CodePudding user response:
I ended up with a solution like this:
model_kpi = ['ACC', 'PRC', 'REC', 'TR_T', 'MM']
he_kpi = ['HE_M', 'HE_GEN_T']
n_models = len(kpi_values[0][0])
# repeat each model kpi once per model, then append the HE kpis
kpi = [item for item in model_kpi for _ in range(n_models)] + he_kpi
# M0..Mn under each model kpi, and an empty model label under each HE kpi
model = ['M' + str(i) for i in range(n_models)] * len(model_kpi) + ['', '']
col_ind = list(zip(kpi, model))
row_ind = list(zip_longest(shape_ind, epoch_ind))
flat_list = flatten_list(kpi_values)
df = pd.DataFrame(np.reshape(flat_list, (len(kpi_values), -1)))
df.index = pd.MultiIndex.from_tuples(row_ind, names=['samples', 'epochs'])
df.columns = pd.MultiIndex.from_tuples(col_ind, names=['kpi', 'model'])
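For anyone who wants to try the idea without the original inputs, here is a minimal self-contained sketch of the same column layout. The data is a made-up np.arange grid, only two model kpis are used to keep it short, and n_models = 3 is an assumption standing in for len(kpi_values[0][0]).

```python
import numpy as np
import pandas as pd

n_models = 3                    # assumption: stands in for len(kpi_values[0][0])
model_kpi = ["ACC", "PRC"]      # shortened list, for illustration only
he_kpi = ["HE_M", "HE_GEN_T"]

# One (kpi, model) pair per column; the HE kpis get an empty model label.
kpi = [k for k in model_kpi for _ in range(n_models)] + he_kpi
model = [f"M{i}" for i in range(n_models)] * len(model_kpi) + [""] * len(he_kpi)
columns = pd.MultiIndex.from_tuples(list(zip(kpi, model)), names=["kpi", "model"])

index = pd.MultiIndex.from_product([[675, 1950], [3, 4]], names=["samples", "epochs"])

# Dummy values, one row per (samples, epochs) pair.
data = np.arange(len(index) * len(columns), dtype=float).reshape(len(index), -1)
df = pd.DataFrame(data, index=index, columns=columns)

acc = df["ACC"]             # sub-frame with columns M0, M1, M2
he_m = df[("HE_M", "")]     # a single column, selected by its full (kpi, model) key

print(df)
```

The row index and the kpi/model grouping of the other columns stay as they were; only the last two columns carry an empty string at the model level.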