Pandas dataframe selecting with index and condition on a column

Time:12-16

I have been trying to solve this problem for a while:

I have a DataFrame like this:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['A', 2, 3], ['B', 5, 6], ['C', 8, 9]]), columns=['a', 'b', 'c'])
j = [0, 2]

But when I try to select just a part of it, filtering by a list of index labels and a condition on a column, I get an error:

df[df.loc[j]['a']=='A']

Something is wrong, but I don't understand what the problem is. Can you help me?

This is the error message:

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

CodePudding user response:

The boolean mask is built from the filtered DataFrame (df.loc[j]) but applied to the original df. Their indices are different, so the error is raised.
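To see the mismatch directly, here is a minimal sketch that reuses the question's data. The names mask and raised are introduced here only for illustration, and the broad except Exception is an assumption made because the import location of IndexingError differs between pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['A', 2, 3], ['B', 5, 6], ['C', 8, 9]]), columns=['a', 'b', 'c'])
j = [0, 2]

# The mask is computed on the filtered rows only, so it carries their index.
mask = df.loc[j]['a'] == 'A'
print(mask.index.tolist())  # [0, 2]
print(df.index.tolist())    # [0, 1, 2]

# Applying the 2-row mask to the 3-row frame fails: label 1 has no mask value.
raised = False
try:
    df[mask]
except Exception as exc:  # broad catch on purpose, see the note above
    raised = True
    print(type(exc).__name__)
```

The mask has no entry for label 1, so pandas cannot decide whether to keep that row.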

You need to apply the condition to the filtered DataFrame instead:

df1 = df.loc[j]
print (df1)
   a  b  c
0  A  2  3
2  C  8  9

out = df1[df1['a']=='A']
print(out)
   a  b  c
0  A  2  3

Your original approach also works if you align the index of the filtered mask with the original index using Series.reindex:

out = df[(df.loc[j, 'a']=='A').reindex(df.index, fill_value=False)]
print(out)
   a  b  c
0  A  2  3

Or, a nicer solution:

out = df[(df['a'] == 'A') & (df.index.isin(j))]
print(out)
   a  b  c
0  A  2  3
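As a quick check, the three approaches above can be run side by side on the question's data. This is only a sketch; the variable names out_filtered, out_reindex and out_isin are made up for this comparison.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['A', 2, 3], ['B', 5, 6], ['C', 8, 9]]), columns=['a', 'b', 'c'])
j = [0, 2]

# 1) Filter by label first, then apply the condition to the filtered frame.
df1 = df.loc[j]
out_filtered = df1[df1['a'] == 'A']

# 2) Build the mask on the filtered rows, then align it to the full index.
out_reindex = df[(df.loc[j, 'a'] == 'A').reindex(df.index, fill_value=False)]

# 3) Combine two masks that both already span the full index.
out_isin = df[(df['a'] == 'A') & (df.index.isin(j))]

# All three select the same single row.
same = out_filtered.equals(out_reindex) and out_reindex.equals(out_isin)
print(same)
print(out_isin)
```

One difference to keep in mind: df.loc[j] raises a KeyError if j contains a label that is not in the index, while df.index.isin(j) simply ignores such labels.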

CodePudding user response:

A boolean Series used as an indexer must align with the DataFrame it is applied to. Here df has 3 rows, but the boolean Series df.loc[j]['a']=='A' has only 2.

You should do:

>>> df.loc[j][df.loc[j]['a']=='A']
   a  b  c
0  A  2  3
