As an example, I have the following dataset (fake random data):
Index | category | value |
---|---|---|
1 | dog | 5 |
2 | cat | 22 |
3 | Tasselled Wobbegong | 44 |
4 | cat | 66 |
5 | Tasselled Wobbegong | 5 |
6 | dog | 23 |
I have this in a vaex dataframe. Now imagine I have 10,000 categories, not just 3. I want to filter my vaex dataframe by a list of categories, like so:
filter_category_list = ['cat','dog']
df = df[df.category in filter_category_list ]
(The code above doesn't work; I imagine the solution would look similar to it.) I expect my output to be:
Index | category | value |
---|---|---|
1 | dog | 5 |
2 | cat | 22 |
4 | cat | 66 |
6 | dog | 23 |
Any idea how to achieve that with vaex?
Thanks for taking the time to read!
CodePudding user response:
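Here are some options. Note that the first one is pandas syntax, and as far as I know vaex does not have a query method on its dataframe. The apply version calls a Python function per value, so it will be slow with many rows. The isin version is the one I would use with vaex: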
df.query("category in @filter_category_list")
df[df['category'].apply(lambda x: x in filter_category_list)]
df[df['category'].isin(filter_category_list)]