How do I select rows from a pandas df without returning False values?

Time:05-28

I have a df and I need to select rows based on some conditions in multiple columns.

Here is what I have

import pandas as pd
dat = [('p','q', 5), ('k','j', 2), ('p','-', 5), ('-','p', 4), ('q','pkjq', 3), ('pkjq','q', 2)]
df = pd.DataFrame(dat, columns = ['a', 'b', 'c'])
df_dat = df[(df[['a','b']].isin(['k','p','q','j']) & df['c'] > 3)] | df[(~df[['a','b']].isin(['k','p','q','j']) & df['c'] > 2 )]

Expected result = [('p','q', 5), ('p','-', 5), ('-','p', 4), ('q','pkjq', 3)]

The result I am getting is a DataFrame in which every value is False.

CodePudding user response:

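When you have a complicated condition, I recommend building each condition outside the slice. Reduce the per-cell isin result to one boolean per row with any, and use .gt() (or parenthesize the comparison), because & binds more tightly than >.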

cond1 = df[['a','b']].isin(['k','p','q','j']).any(axis=1) & df['c'].gt(3)
cond2 = (~df[['a','b']].isin(['k','p','q','j'])).any(axis=1) & df['c'].gt(2)

out = df.loc[cond1 | cond2]
Out[305]: 
   a     b  c
0  p     q  5
2  p     -  5
3  -     p  4
4  q  pkjq  3
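
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes pandas is installed. The names allowed, in_allowed, unparenthesized, parenthesized and result are only illustrative and do not come from the question. The two plain-Python lines show the precedence issue on scalars: `&` is applied before `>`, so the comparison has to be parenthesized (or written with .gt()).

```python
import pandas as pd

dat = [('p', 'q', 5), ('k', 'j', 2), ('p', '-', 5),
       ('-', 'p', 4), ('q', 'pkjq', 3), ('pkjq', 'q', 2)]
df = pd.DataFrame(dat, columns=['a', 'b', 'c'])

# Illustrative name: the values that count as a "match" in columns a and b.
allowed = ['k', 'p', 'q', 'j']

# `&` binds tighter than `>`, so the first line is (True & 5) > 3, i.e. 1 > 3.
unparenthesized = True & 5 > 3
parenthesized = True & (5 > 3)

# isin gives one boolean per cell; any(axis=1) reduces that to one per row.
in_allowed = df[['a', 'b']].isin(allowed)
cond1 = in_allowed.any(axis=1) & (df['c'] > 3)
cond2 = (~in_allowed).any(axis=1) & (df['c'] > 2)

# Combine the row masks first, then index the frame once.
out = df.loc[cond1 | cond2]
result = list(out.itertuples(index=False, name=None))
print(result)
```

Combining the boolean row masks with | and indexing once also avoids applying | to two already-filtered DataFrames, which is what the expression in the question does.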