Create a function that selects rows in a data frame based on whether the elements of two columns belong to


I have a dataframe s1


  no    col car_mod
1  1    red    car2
2  2  green    car4
3  3   blue    car1
4  4 yellow    car5
5  5   blue    car7
6  6  black    car3
7  7  white    car1

and a list l


[1] "green" "blue"  "red"  

[1] "car1" "car2" "car5"

I want to create a function that selects only the rows in which both the element in column "col" and the element in column "car_mod" are present in the list (the element in col should be present in l[1][1], while the element in car_mod should be present in l[1][2]).

The output should look something like this


  no  col car_mod
1  1  red    car2
2  3 blue    car1

Note that the actual data frame and list are very large. I tried something like this:

for(i in l[1]){
  for(j in l[2]){ 
    if(i %in% s1$col & j %in% s1$car_mod){

But I'm not sure how to proceed, or whether loops are the best approach given the size of the data frame.

CodePudding user response:

You can use subset (or dplyr::filter):

> subset(s1, col %in% l[[1]][[1]] & car_mod %in% l[[1]][[2]])

  no  col car_mod
1  1  red    car2
3  3 blue    car1
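
Since the question asks for a function, the subset call above can be wrapped in one. The following is a minimal base R sketch, not a definitive implementation. It assumes `l` is a nested list whose first element holds the two lookup vectors (which is what the `l[[1]][[1]]` indexing implies), and the name `select_rows` is hypothetical. Because `%in%` is vectorised, no explicit loop over rows is needed.

```r
# Rebuild the example data from the question.
s1 <- data.frame(
  no      = 1:7,
  col     = c("red", "green", "blue", "yellow", "blue", "black", "white"),
  car_mod = c("car2", "car4", "car1", "car5", "car7", "car3", "car1"),
  stringsAsFactors = FALSE
)

# Assumed shape of `l`: a list whose first element holds both lookup vectors.
l <- list(list(c("green", "blue", "red"), c("car1", "car2", "car5")))

# Hypothetical helper: keep the rows where `col` is in the first lookup
# vector AND `car_mod` is in the second one.
select_rows <- function(df, lookup) {
  keep <- df$col %in% lookup[[1]][[1]] & df$car_mod %in% lookup[[1]][[2]]
  df[keep, , drop = FALSE]
}

result <- select_rows(s1, l)
print(result)
```

With the example data this keeps the rows with no equal to 1 and 3, matching the output shown in the answer.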

CodePudding user response:

A possible solution with dplyr::filter:

library(dplyr)

s_new <- s1 %>% filter(col %in% l[[1]][1][[1]] & car_mod %in% l[[1]][2][[1]])


  no  col car_mod
1  1  red    car2
2  3 blue    car1
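
The question writes `l[1][1]`, while the answers index with `[[`. The short base R sketch below illustrates the difference, assuming `l` is the nested list that the answers imply: single brackets return a sub-list, double brackets extract the element itself. The variable names `colours`, `models`, and `same` are only illustrative.

```r
# Assumed shape of `l`: a list whose first element holds both lookup vectors.
l <- list(list(c("green", "blue", "red"), c("car1", "car2", "car5")))

# `[` keeps the list wrapper, so l[1][1] is still a list of length 1
# (in fact it is identical to l[1]).
still_a_list <- is.list(l[1][1])

# `[[` extracts the element itself: here, the character vectors.
colours <- l[[1]][[1]]
models  <- l[[1]][[2]]

# The longer form l[[1]][1][[1]] takes a one-element sub-list and then
# extracts from it, so it yields the same vector as l[[1]][[1]].
same <- identical(l[[1]][1][[1]], colours)
```

Only the `[[` forms give a character vector that `%in%` can match the column values against.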

CodePudding user response:

To avoid the nested [[ you can also use pluck() from the purrr package, which accepts several indices in one call:

library(dplyr)
library(purrr)

s_new <- s1 %>% filter(col %in% pluck(l, 1, 1) & car_mod %in% pluck(l, 1, 2))