Home > Enterprise >  Filter a column which has multilevel column header in Pandas
Filter a column which has multilevel column header in Pandas

Time:06-18

I have an excel like below

enter image description here

A   B   C   
        x   y
1   abc 3   5
2   abc 4   6
1   def 5   7
1   qrs 6   8
2   def 7   9

When I read this with pandas using the code df= pd.read_excel(r'C:\Users\mys2kor\Desktop\Mappe1.xlsx',engine='openpyxl',header=[0,1])

I get the dataframe like so

A   B   C
Unnamed: 0_level_1  Unnamed: 1_level_1  x   y
1   abc     3   5
2   abc     4   6
1   def     5   7
1   qrs     6   8
2   def     7   9

enter image description here

I want to filter Column a for 1 and column B for 'def' & get just one row as output.

When I do df["A"].filter(like="Unnamed")==1 and df["B"].filter(like="Unnamed")=='def' separately, I get the proper boolean values

Unnamed: 0_level_1
True
False
True
True
False

and

Unnamed: 1_level_1
False
False
True
False
True

But when I put them together like so (df["A"].filter(like="Unnamed")==1) & (df["B"].filter(like="Unnamed")=='def'), I get like below!

Unnamed: 0_level_1  Unnamed: 1_level_1
False   False
False   False
False   False
False   False
False   False

How do I filter such that the output is

A   B   C   
        x   y
1   def 5   7

CodePudding user response:

It looks like you could use the columns A and B as index while reading the excel file, then use indexing with loc to query the index values:

df = pd.read_excel('...', header=[0, 1], index_col=[0, 1])
df.loc[[(1, 'def')]]
  • Related