I found how to add levels to all the columns but not on specific subsets.
I explain my problem: say that I have a pandas dataframe like this
a b c d
myIndex
0.0 0.1 0.2 -2.0 -0.8
0.1 0.7 1.1 9.0 0.8
0.2 -0.3 1.0 2.3 -0.6
and that I want to add an additional column index level based on columns subsets, e.g. I want something like this:
A B
a b c d
myIndex
0.0 0.1 0.2 -2.0 -0.8
0.1 0.7 1.1 9.0 0.8
0.2 -0.3 1.0 2.3 -0.6
How to do that?
CodePudding user response:
Maybe something like Multiindex can help:
df = pd.DataFrame(np.random.randn(8, 4), columns=pd.MultiIndex.from_tuples([
('A', 'a'), ('A', 'b'), ('B', 'c'), ('B', 'd')], names=['1st', '2nd']))
OUTPUT
1st A B
2nd a b c d
0 1.187084 0.286291 0.151277 0.111996
1 0.383498 0.282514 -2.037380 -0.792589
2 0.193105 -0.421383 -0.956383 -0.667760
3 -1.646689 0.836364 0.270066 -1.667103
4 0.315428 -1.478783 1.442501 0.362000
5 0.763970 0.028874 0.149660 0.726050
6 0.617114 -0.142764 -1.179998 -0.240495
7 0.437346 -0.545978 -0.800455 -0.117596
CodePudding user response:
General Solution
Define a mapping dictionary which maps the level zero column values to level 1 column values, then flatten the dictionary into tuples and create a multiindex
d = {'A': ('a', 'b'), 'B': ('c', 'd')}
df.columns = pd.MultiIndex.from_tuples((k, c) for k, v in d.items() for c in v)
A B
a b c d
myIndex
0.0 0.1 0.2 -2.0 -0.8
0.1 0.7 1.1 9.0 0.8
0.2 -0.3 1.0 2.3 -0.6
CodePudding user response:
In case someone needs it, I solved in this way. I searched quite a lot and mine does not seem an answered question.
More elegant answers than this are welcomed.
col_subset_1 = ['a', 'b']
col_subset_2 = ['c', 'd']
col_subset_1_label = list(zip(['A']*len(col_subset_1), col_subset_1))
col_subset_2_label = list(zip(['B']*len(col_subset_2), col_subset_2))
df.columns = pd.MultiIndex.from_tuples([*col_subset_1_label,*col_subset_1_label])
This solution should be pretty flexible and it should work with every columns subsets (provided that the selected columns exist in the current dataframe).