I have a Pandas DataFrame with a column index like the one below:
---- ---- ---- ---- ---- ----
| A1| A2| A3| B1| B2| B3|
---- ---- ---- ---- ---- ----
...the data
What I would like to do is change the column index of this DataFrame to a multi-index, as shown in the format below, without modifying the data, simply by adding an upper level to the index (with values A and B).
-------------- --------------
|     A     |     B     |
---- ---- ---- ---- ---- ----
| A1| A2| A3| B1| B2| B3|
---- ---- ---- ---- ---- ----
...the data
I have tried to use pandas.MultiIndex, but with no luck. How can this be solved?
CodePudding user response:
You could extract the first letter of each column name separately and create a MultiIndex from the two arrays:
multi_index_level_0 = [c[0] for c in df.columns]
multi_index = [multi_index_level_0, df.columns.values]
df.columns = pd.MultiIndex.from_arrays(multi_index)
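For reference, here is a self-contained sketch of the same approach that also checks the data is left untouched. The sample values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6]],
    columns=["A1", "A2", "A3", "B1", "B2", "B3"],
)
values_before = df.to_numpy().copy()

# Upper level: first character of each label.
# Lower level: the original labels, unchanged.
upper = [c[0] for c in df.columns]
df.columns = pd.MultiIndex.from_arrays([upper, df.columns])

# Only the column labels changed; the underlying values are the same.
assert (df.to_numpy() == values_before).all()

# Selecting by the new upper level returns the matching sub-columns.
print(df["A"].columns.tolist())
```

Note that this assumes every column name starts with a single-character group label.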
CodePudding user response:
The simplest way is to extract the first character of each column name and assign a nested list back to df.columns:
df = pd.DataFrame(0, columns=['A1','A2','A3','B1','B2','B3'], index=[0])
df.columns = [df.columns.str[0], df.columns]
print (df)
   A        B
  A1 A2 A3 B1 B2 B3
0  0  0  0  0  0  0
If you need to extract all leading uppercase letters instead:
df = pd.DataFrame(0, columns=['ADa1','ADs2','AD3','B1','B2','B3'], index=[0])
df.columns = [df.columns.str.extract('(^[A-Z]+)', expand=False), df.columns]
print (df)
    AD           B
  ADa1 ADs2 AD3 B1 B2 B3
0    0    0   0  0  0  0
If you also need to set the level names, use MultiIndex.from_arrays:
df = pd.DataFrame(0, columns=['ADa1','ADs2','AD3','B1','B2','B3'], index=[0])
df.columns = pd.MultiIndex.from_arrays([df.columns.str.extract('(^[A-Z]+)', expand=False),
                                        df.columns],
                                       names=('a','b'))
print (df)
a   AD           B
b ADa1 ADs2 AD3 B1 B2 B3
0    0    0   0  0  0  0
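Once the levels are named, they can be referenced by name. The following is a minimal sketch, reusing the column names and level names ('a', 'b') from the example above; the variable names group_ad, top_labels and flat are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(0, columns=["ADa1", "ADs2", "AD3", "B1", "B2", "B3"], index=[0])

# Leading run of uppercase letters becomes the upper level.
upper = df.columns.str.extract(r"(^[A-Z]+)", expand=False)
df.columns = pd.MultiIndex.from_arrays([upper, df.columns], names=("a", "b"))

# Select every column under one upper-level label.
group_ad = df["AD"]

# List the distinct upper-level labels by level name.
top_labels = df.columns.get_level_values("a").unique().tolist()

# Drop the added level again to get back the original flat columns.
flat = df.droplevel("a", axis=1)

print(top_labels)
print(group_ad.columns.tolist())
print(flat.columns.tolist())
```

Because the upper level is only added on top of the existing labels, dropping it with droplevel restores the original column index.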