python pandas: change a column index into a multiindex based on a dictionary of tuples

I would like to change a column index of a dataframe into a multiindex as follows.

Assuming the data:

import numpy as np
import pandas as pd

arrays = [np.array(["2021_01", "2021_01", "2021_02", "2021_02", "2021_03", "2021_03", "2021_04", "2021_04"]),
          np.array(["ALLO", "EPQ", "ALLO", "EPQ", "ALLO", "EPQ", "ALLO", "EPQ"])]
tuples_i = list(zip(*arrays))
index_rows = pd.MultiIndex.from_tuples(tuples_i, names=["city", "number"])

person_names = ['mike','manu','ana','analia','anomalia','fer']

df = pd.DataFrame(np.random.randn(8, 6), index=index_rows, columns = person_names)
df

Given an arbitrary dictionary whose keys are the column names (which are unique) and whose values are tuples:

{k: (k[0:2], k[0:2] + k[0:2]) for k in df.columns}
{'mike': ('mi', 'mimi'),
 'manu': ('ma', 'mama'),
 'ana': ('an', 'anan'),
 'analia': ('an', 'anan'),
 'anomalia': ('an', 'anan'),
 'fer': ('fe', 'fefe')}

How would I create a multiindex in the columns with three levels: the current one plus the two values of the corresponding tuple in the dictionary?

How should I handle columns that are not present in the dictionary, for example by giving them a default value of ('missing', 'missing')?

CodePudding user response：

Assuming df is the DataFrame and d is the dictionary:

df.columns = pd.MultiIndex.from_arrays([df.columns, *zip(*df.columns.map(d))])

If the dictionary keys match df.columns exactly and are in the same order, this simplifies to:

df.columns = pd.MultiIndex.from_arrays([df.columns, *zip(*d.values())])

Output:


                    mike      manu       ana    analia  anomalia       fer
                      mi        ma        an        an        an        fe
                    mimi      mama      anan      anan      anan      fefe
city    number                                                            
2021_01 ALLO    2.024601  0.217844  2.049117  0.070271  1.089704  1.640760
        EPQ    -0.532718  0.672654 -0.629328  0.079538  0.490475 -0.750072
2021_02 ALLO   -0.035613  1.334336 -1.202566  0.913976  0.043350  0.265392
        EPQ     0.981459 -0.077180  1.737501 -0.986365  0.966194  0.002310
2021_03 ALLO    0.053481 -0.243616 -0.993706  0.549090  1.207434 -0.340995
        EPQ     0.325375 -1.022405 -0.758095  0.315423  0.371075  2.300323
2021_04 ALLO    0.988653  0.368673  0.866023 -0.114965  1.951587  0.371901
        EPQ     0.665541  0.641732 -0.737723 -0.748781  1.056270 -0.546598
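
The one-liners above do not supply a default for column names that are absent from d, which was the second part of the question. A minimal sketch of one way to handle that is to look each column up with dict.get and a fallback tuple. The small stand-in frame, the column 'zoe' and the names default and extra_levels are made up for illustration:

```python
import numpy as np
import pandas as pd

# Small stand-in frame; 'zoe' is a hypothetical column with no entry in d.
df = pd.DataFrame(np.zeros((2, 3)), columns=['mike', 'ana', 'zoe'])
d = {'mike': ('mi', 'mimi'), 'ana': ('an', 'anan')}

# Fallback used for any column name that is not a key of d.
default = ('missing', 'missing')

# One tuple per column, in column order.
extra_levels = [d.get(col, default) for col in df.columns]

# zip(*...) transposes the list of pairs into one sequence per new level.
df.columns = pd.MultiIndex.from_arrays([df.columns, *zip(*extra_levels)])

print(df.columns.tolist())
```

This assumes every tuple in d has the same length as the default. An equivalent form builds the full tuples directly: pd.MultiIndex.from_tuples([(c, *d.get(c, default)) for c in df.columns]).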