How to filter a vaex dataset by a list of numbers/categories

Time:06-19

As an example, I have the following dataset (fake random data):

Index  category             value
1      dog                  5
2      cat                  22
3      Tasselled Wobbegong  44
4      cat                  66
5      Tasselled Wobbegong  5
6      dog                  23

I have this in a vaex dataframe. Now imagine I have 10,000 categories, not only 3. I want to filter my vaex dataframe by a list of categories, like so:

filter_category_list = ['cat','dog']
df = df[df.category in filter_category_list ]

(The code above doesn't work, but I imagine the solution would look similar.) I expect my output to be:

Index  category  value
1      dog       5
2      cat       22
4      cat       66
6      dog       23

Any idea how to achieve that with vaex?

Thanks for taking the time to read!

CodePudding user response:

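Here are some options. The isin form (last line) is the most direct one for a vaex dataframe. The apply form calls a Python function for each value and is usually slower. The query form is the pandas idiom, so it may not be available on a vaex dataframe.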

df.query("category in @filter_category_list")
df[df['category'].apply(lambda x: x in filter_category_list)]
df[df['category'].isin(filter_category_list)]