Home > Back-end >  Run multiple query conditions from a list in python pandas
Run multiple query conditions from a list in python pandas

Time:11-24

I am making a python PyQt5 CSV comparison tool project and the user can add conditions for querying the pandas dataframe one by one before they are executed.

At the moment I have a nested list of conditions with each element containing the field, operation (==,!=,>,<), and value for comparison as strings. With just one condition I can use .query as it takes a string condition:

data.query('{} {} {}'.format(field,operation,value))

But as far as I can tell the formatting for multiple queries would use loc similar to this:

data.loc[(data.query('{} {} {}'.format(field[0],operation[0],value[0]))) & (data.query('{} {} {}'.format(field[1],operation[1],value[1]))) & ...]

Firstly I wanted to make sure my understanding of the loc function was correct (do I need a primary key at the end maybe?).

And secondly, how would I represent this multiple condition query with an unknown number of conditions set?

Thanks

CodePudding user response:

Would this work?

conds = [
    f'{f} {o} {v}' for f, o, v in zip(field, operation, value)
]
data.query(' and '.join(conds))

CodePudding user response:

Warning: Not tested, more like a comment but put here for proper format:

data.query returns a dataframe, you can't just do dataframe1 & dataframe2. You would do something like

data.query(' AND '.join(['{} {} {}'.format(f, o, v) 
                         for f, o, v in zip(fields, operations, values)
                       ])
          )
  • Related