The objective is to select every column whose level-0 label is 'a'.
However, when I slice the MultiIndex columns as below,
a_cols = df.loc[:, ('a', slice(None))]
Python raises
TypeError: unhashable type: 'slice'
import pandas as pd
import numpy as np
np.random.seed(0)
arr = np.random.randint(5, size=(2, 12))
df = pd.DataFrame(arr, columns=[('a', 'E1_g1'), ('a', 'E1_g2'), ('a', 'E1_g3'),
                                ('a', 'E2_g1'), ('a', 'E2_g2'), ('a', 'E2_g3'),
                                ('a', 'E3_g1'), ('a', 'E3_g2'), ('a', 'E3_g3'),
                                ('b', 'E1'), ('b', 'E1'), ('b', 'E13')])
What did I do wrong here?
I also tried
df.loc[:, df.columns.get_level_values(0) == 'a']
but that produces an empty DataFrame instead.
Answer:
That's because df.columns is not a MultiIndex; it is a flat Index whose elements are tuples. You can select the columns with a list built by filtering df.columns:
cols = [(i,j) for (i,j) in df.columns if i=='a']
out = df[cols]
Output:
   (a, E1_g1)  (a, E1_g2)  (a, E1_g3)  (a, E2_g1)  (a, E2_g2)  (a, E2_g3)  (a, E3_g1)  (a, E3_g2)  (a, E3_g3)
0           4           0           3           3           3           1           3           2           4
1           2           1           0           1           1           0           1           4           3
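If you want to check the diagnosis yourself, here is a minimal sketch that rebuilds the frame from the question and inspects df.columns. The names tuples, is_multi and mask are only illustrative, and the behavior is what I would expect from recent pandas versions when a plain list of tuples is passed as columns.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
arr = np.random.randint(5, size=(2, 12))

# Same column labels as in the question (illustrative variable name)
tuples = [('a', 'E1_g1'), ('a', 'E1_g2'), ('a', 'E1_g3'),
          ('a', 'E2_g1'), ('a', 'E2_g2'), ('a', 'E2_g3'),
          ('a', 'E3_g1'), ('a', 'E3_g2'), ('a', 'E3_g3'),
          ('b', 'E1'), ('b', 'E1'), ('b', 'E13')]
df = pd.DataFrame(arr, columns=tuples)

# Check whether the columns are a real MultiIndex or a flat Index of tuples
is_multi = isinstance(df.columns, pd.MultiIndex)
print(type(df.columns).__name__, df.columns.nlevels)

# With a single level, get_level_values(0) hands back the tuples themselves,
# so comparing them with the string 'a' never matches
mask = df.columns.get_level_values(0) == 'a'
print(mask.any())

# Filtering the tuples directly, as in the list comprehension above
cols = [c for c in df.columns if c[0] == 'a']
out = df[cols]
print(out.shape)
```

This would explain both symptoms in the question: .loc has no levels to slice, and the boolean mask is False for every column.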
Alternatively, convert df.columns to a MultiIndex with pd.MultiIndex.from_tuples. Then your original slicing works:
df.columns = pd.MultiIndex.from_tuples(df.columns)
a_cols = df.loc[:,('a',slice(None))]
Output:
      a
  E1_g1 E1_g2 E1_g3 E2_g1 E2_g2 E2_g3 E3_g1 E3_g2 E3_g3
0     4     0     3     3     3     1     3     2     4
1     2     1     0     1     1     0     1     4     3
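Once the columns really are a MultiIndex, several other selections should give the same columns. The sketch below builds the MultiIndex when the frame is constructed, rather than converting afterwards, and compares a few of them. The names tuples, a_mask, a_xs and a_only are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
arr = np.random.randint(5, size=(2, 12))

# Same column labels as in the question (illustrative variable name)
tuples = [('a', 'E1_g1'), ('a', 'E1_g2'), ('a', 'E1_g3'),
          ('a', 'E2_g1'), ('a', 'E2_g2'), ('a', 'E2_g3'),
          ('a', 'E3_g1'), ('a', 'E3_g2'), ('a', 'E3_g3'),
          ('b', 'E1'), ('b', 'E1'), ('b', 'E13')]

# Build the MultiIndex up front instead of passing the raw list of tuples
df = pd.DataFrame(arr, columns=pd.MultiIndex.from_tuples(tuples))

# The slicing from the question; both column levels are kept
a_cols = df.loc[:, ('a', slice(None))]

# Boolean mask on level 0, which now holds 'a' and 'b' rather than tuples
a_mask = df.loc[:, df.columns.get_level_values(0) == 'a']

# Cross-section on level 0 that keeps that level in the result
a_xs = df.xs('a', axis=1, level=0, drop_level=False)

# Plain key lookup; this one drops level 0 from the result
a_only = df['a']

print(a_cols.shape, a_mask.shape, a_xs.shape, a_only.shape)
```

Which one to use is mostly a matter of taste: df['a'] is the shortest if you do not need the 'a' level in the result, while the other three keep both levels.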