Home > Enterprise >  Filter on list of string values in column using pandas df.query()
Filter on list of string values in column using pandas df.query()

Time:01-03

I want to filter a dataframe in which column having list of values. I want to filter on multiple conditions. How can I do that?

>>> my_df
        col   values
0         c1  [1,2,3]
1         c2  ['a', 'b', 'c']
2         c3  [11,12,13]


>>> my_df.query(df.query(" `c2` in '['a','b','c']' "))

I expect the output to be

       col   values
1         c2  ['a', 'b', 'c']

CodePudding user response:

This is another work around.

def list_equals(lst, element):
    return lst == element
    
my_df["equals"] = my_df["values"].apply(list_equals, element=['a', 'b', 'c'])

filtered_df = my_df.query("equals == True")
filtered_df = filtered_df.drop("equals", axis=1)

print(filtered_df)

Output:

   col     values
1   c2  [a, b, c]

To produce the values output as list of strings you can use the apply() function one more time.

filtered_df["values"] = filtered_df["values"].apply(str)
print(filtered_df)

Output:

  col           values
1  c2  ['a', 'b', 'c']

CodePudding user response:

You should convert the lists to tuple

print(my_df[my_df['values'].apply(tuple) == ('a', 'b', 'c')])

To use query change values column to tuple first

my_df['values'] = my_df['values'].apply(tuple)
t = ('a', 'b', 'c')
print(my_df.query('values == @t'))

Output:

  col     values
1  c2  [a, b, c]
  • Related