Create a function that selects rows in a data frame based on whether the elements of two columns belong to

I have a dataframe s1

s1=data.frame(no=c(1,2,3,4,5,6,7),col=c("red","green","blue","yellow","blue","black","white"),car_mod=c("car2","car4","car1","car5","car7","car3","car1"))

  no    col car_mod
1  1    red    car2
2  2  green    car4
3  3   blue    car1
4  4 yellow    car5
5  5   blue    car7
6  6  black    car3
7  7  white    car1

and a list l

l=list(list(c("green","blue","red"),c("car1","car2","car5")))

[[1]]
[[1]][[1]]
[1] "green" "blue"  "red"  

[[1]][[2]]
[1] "car1" "car2" "car5"

I want to create a function that selects only the rows in which the element in column "col" and the element in column "car_mod" are both present in the list (the element in col should be present in l[[1]][[1]], while the element in car_mod should be present in l[[1]][[2]]).

The output should look something like this

s_new=data.frame(no=c(1,3),col=c("red","blue"),car_mod=c("car2","car1"))

  no  col car_mod
1  1  red    car2
2  3 blue    car1

Note that the actual data frame and list are very large. I tried something like this:

for(i in l[1]){
  for(j in l[2]){
    if(i %in% s1$col & j %in% s1$car_mod){
      select()
    }
  }
}

But I'm not sure how to proceed, or whether loops are the best approach given the size of the data frame.

CodePudding user response：

You can use subset (or dplyr::filter):

> subset(s1, col %in% l[[1]][[1]] & car_mod %in% l[[1]][[2]])

  no  col car_mod
1  1  red    car2
3  3 blue    car1
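
Since the question asks for a function, the same condition can be wrapped in one. This is only a minimal base R sketch: it assumes the lookup is a list whose first element holds the allowed colours and whose second holds the allowed car models, and select_rows and its argument names are made up for illustration.

```r
# Data from the question
s1 <- data.frame(
  no = c(1, 2, 3, 4, 5, 6, 7),
  col = c("red", "green", "blue", "yellow", "blue", "black", "white"),
  car_mod = c("car2", "car4", "car1", "car5", "car7", "car3", "car1")
)
l <- list(list(c("green", "blue", "red"), c("car1", "car2", "car5")))

# Hypothetical helper (not from the original answer):
# keep rows whose col AND car_mod both appear in the lookup vectors
select_rows <- function(df, lookup) {
  keep <- df$col %in% lookup[[1]] & df$car_mod %in% lookup[[2]]
  df[keep, , drop = FALSE]
}

s_new <- select_rows(s1, l[[1]])
s_new
```

Because %in% is vectorised, the whole column is tested in one call and no explicit loop is needed, which should help with a large data frame.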

CodePudding user response：

A possible solution with dplyr::filter:

library(dplyr)

s_new <- s1 %>% filter(col %in% l[[1]][1][[1]] & car_mod %in% l[[1]][2][[1]])

s_new

  no  col car_mod
1  1  red    car2
2  3 blue    car1
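
A note on the indexing, as a small base R check rather than anything from the answer itself: l[[1]][1][[1]] should give the same vector as the shorter l[[1]][[1]], whereas single brackets alone (such as l[1][1]) only return a list. The variable names below are illustrative.

```r
l <- list(list(c("green", "blue", "red"), c("car1", "car2", "car5")))

# l[[1]][1] is still a list of length one; the trailing [[1]] extracts the vector
short_form <- l[[1]][[1]]
long_form <- l[[1]][1][[1]]
same_vector <- identical(short_form, long_form)

# Single brackets never reach the character vector; the result stays a list
single_bracket <- l[1][1]
is_still_list <- is.list(single_bracket)

same_vector
is_still_list
```

This is why %in% needs the double-bracket form on the right-hand side: it compares against the elements of a character vector, not against a list wrapper.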

CodePudding user response：

To get rid of the [[ you can also use pluck() from the purrr package:

library(tidyverse)

s_new <- s1 %>% filter(col %in% pluck(l, 1, 1) & car_mod %in% pluck(l, 1, 2))