I have an excel like below
A B C
x y
1 abc 3 5
2 abc 4 6
1 def 5 7
1 qrs 6 8
2 def 7 9
When I read this with pandas using the code df= pd.read_excel(r'C:\Users\mys2kor\Desktop\Mappe1.xlsx',engine='openpyxl',header=[0,1])
I get the dataframe like so
A B C
Unnamed: 0_level_1 Unnamed: 1_level_1 x y
1 abc 3 5
2 abc 4 6
1 def 5 7
1 qrs 6 8
2 def 7 9
I want to filter Column a for 1 and column B for 'def' & get just one row as output.
When I do df["A"].filter(like="Unnamed")==1
and df["B"].filter(like="Unnamed")=='def'
separately, I get the proper boolean values
Unnamed: 0_level_1
True
False
True
True
False
and
Unnamed: 1_level_1
False
False
True
False
True
But when I put them together like so (df["A"].filter(like="Unnamed")==1) & (df["B"].filter(like="Unnamed")=='def')
, I get like below!
Unnamed: 0_level_1 Unnamed: 1_level_1
False False
False False
False False
False False
False False
How do I filter such that the output is
A B C
x y
1 def 5 7
CodePudding user response:
It looks like you could use the columns A
and B
as index while reading the excel file, then use indexing with loc
to query the index values:
df = pd.read_excel('...', header=[0, 1], index_col=[0, 1])
df.loc[[(1, 'def')]]