I have a large dataframe (named P) with product numbers, and I want to filter it using another dataframe (named Numb) that contains the numbers to keep.
Productnumb Price Store
1000 2 A
1001 9 D
1002 1 B
1003 3 A
...
9999 8 D
Numb could also be a list. The Numb dataframe is shown below:
Good_products
1003
1009
...
9999
Expected output
Good_products Price Store
1003 3 A
1009 11 G
...
9999 8 D
I tried P <- P %>% filter(Productnumb == Numb$Good_products)
I know this works for one or two conditions, but I need to filter on many values. Is it possible to filter on a large list or dataframe of numbers (as the condition)?
CodePudding user response:
semi_join might be a quick and easy way to "filter" by a list of good products:
library(dplyr)
# Some simulated data
P <- tibble(
  ProductNum = 1000:1999,
  Price = sample(1:20, 1000, replace = TRUE),
  Store = sample(LETTERS, 1000, replace = TRUE)
)
# A sample of "good products"
Numb <- tibble(GoodProducts = sample(P$ProductNum, 400))
P %>%
  semi_join(Numb, by = c("ProductNum" = "GoodProducts"))
#> # A tibble: 400 x 3
#> ProductNum Price Store
#> <int> <int> <chr>
#> 1 1001 14 H
#> 2 1004 9 Q
#> 3 1008 6 B
#> 4 1014 17 Z
#> 5 1018 18 Q
#> 6 1020 3 P
#> 7 1026 14 G
#> 8 1031 16 L
#> 9 1034 2 W
#> 10 1037 14 P
#> # ... with 390 more rows
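For comparison, the attempt in the question uses ==, which compares the two vectors element by element (recycling the shorter one) rather than testing membership. A minimal sketch with %in% is below; it should select the same rows as semi_join. The small table and the names P_small and Numb_small are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Small illustrative data (hypothetical values)
P_small <- tibble(
  ProductNum = c(1000L, 1001L, 1002L, 1003L),
  Price = c(2L, 9L, 1L, 3L),
  Store = c("A", "D", "B", "A")
)
Numb_small <- tibble(GoodProducts = c(1003L, 1001L))

# Keep rows whose ProductNum appears anywhere in Numb_small$GoodProducts
by_in <- P_small %>%
  filter(ProductNum %in% Numb_small$GoodProducts)

# The semi_join version, for comparison
by_join <- P_small %>%
  semi_join(Numb_small, by = c("ProductNum" = "GoodProducts"))

# Compare the two results
all.equal(by_in, by_join)
```

If the good products are held in a plain vector or list rather than a dataframe, the %in% form is probably the more direct choice, since it needs no join key.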
The next steps depend on how you want to filter. You can pipe a filter (such as Price < 10) as a next step, or chain further semi_joins. Admittedly, this is just another way of doing what you noted above, so I'm not sure it solves your problem. Adding lots of conditions to filter should work fine as well.
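As a rough sketch of chaining these steps, the example below pipes a semi_join into an ordinary filter condition and then into a second semi_join. The tables P_demo, good and good_stores and their values are hypothetical and only there to make the example self-contained.

```r
library(dplyr)

# Hypothetical product table
P_demo <- tibble(
  ProductNum = 1000:1005,
  Price = c(2L, 9L, 1L, 3L, 12L, 8L),
  Store = c("A", "D", "B", "A", "G", "D")
)

# Hypothetical lookup tables
good <- tibble(GoodProducts = c(1001L, 1002L, 1004L))
good_stores <- tibble(Store = c("B", "G"))

cheap_good <- P_demo %>%
  # keep only the listed products
  semi_join(good, by = c("ProductNum" = "GoodProducts")) %>%
  # then apply an ordinary condition
  filter(Price < 10) %>%
  # then restrict by a second lookup table
  semi_join(good_stores, by = "Store")

cheap_good
```

Each step only removes rows, so the order of the steps should not change which rows survive; it mainly affects readability.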
Created on 2022-04-20 by the reprex package (v2.0.1)