I want to filter a dataframe in which column having list of values. I want to filter on multiple conditions. How can I do that?
>>> my_df
col values
0 c1 [1,2,3]
1 c2 ['a', 'b', 'c']
2 c3 [11,12,13]
>>> my_df.query(df.query(" `c2` in '['a','b','c']' "))
I expect the output to be
col values
1 c2 ['a', 'b', 'c']
CodePudding user response:
This is another work around.
def list_equals(lst, element):
return lst == element
my_df["equals"] = my_df["values"].apply(list_equals, element=['a', 'b', 'c'])
filtered_df = my_df.query("equals == True")
filtered_df = filtered_df.drop("equals", axis=1)
print(filtered_df)
Output:
col values
1 c2 [a, b, c]
To produce the values
output as list of strings you can use the apply()
function one more time.
filtered_df["values"] = filtered_df["values"].apply(str)
print(filtered_df)
Output:
col values
1 c2 ['a', 'b', 'c']
CodePudding user response:
You should convert the lists to tuple
print(my_df[my_df['values'].apply(tuple) == ('a', 'b', 'c')])
To use query
change values
column to tuple
first
my_df['values'] = my_df['values'].apply(tuple)
t = ('a', 'b', 'c')
print(my_df.query('values == @t'))
Output:
col values
1 c2 [a, b, c]