Subset a data frame based on a match in a list

Time:06-20

I want to subset my dataframe by matching against string values in a list. My df looks like this:

df3:
     functions                         main         
1    burger_function  (desc)           c("burger", "fries", "coke", "onion rings", "cheese")    
2    steak_function  (desc)            c("steak", "mash", "jack", "gravy", "cajun_fries")          
3    chicken_function (desc)           c("chicken", "salad", "sprite", "soup")       
4    fish_function (desc)              c("fish", "rice", "water", "garlic_bread")      
   

My first column contains functions, each with a description. I want to search for a "main" value and subset the data frame to show which function it belongs to. So far I have tried the code below, but I get wrong values mixed in with the right ones. Is there a better way to accomplish this?

func_sub <- df3[sapply(df3$main, function(x) x %in% "fries"),]

To end up with something like this:

df3:
     functions                         main         
1    burger_function  (desc)           "fries" 

CodePudding user response:

We need "fries" %in% x instead of x %in% "fries". The former returns a single TRUE/FALSE for each row, whereas the OP's version returns a vector of TRUE/FALSE values for each row (one per element of that row's character vector), which cannot be used directly as a row index.

df3[sapply(df3$main, function(x) "fries" %in% x),]
           functions                                     main
1 burger_function  (desc) burger, fries, coke, onion rings, cheese
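
To make the difference between the two forms concrete, here is a minimal sketch that works on the list column alone. The names main, per_element and per_row are only illustrative, and vapply is swapped in for sapply so the result type is guaranteed to be logical.

```r
# The 'main' list column from the question, taken on its own
main <- list(
  c("burger", "fries", "coke", "onion rings", "cheese"),
  c("steak", "mash", "jack", "gravy", "cajun_fries"),
  c("chicken", "salad", "sprite", "soup"),
  c("fish", "rice", "water", "garlic_bread")
)

# x %in% "fries" yields one logical per element of x,
# so each row produces a vector as long as that row's entry
per_element <- lapply(main, function(x) x %in% "fries")
lengths(per_element)
# [1] 5 5 4 4

# "fries" %in% x yields exactly one logical per row,
# which is what row indexing needs
per_row <- vapply(main, function(x) "fries" %in% x, logical(1))
per_row
# [1]  TRUE FALSE FALSE FALSE
```

Note that %in% tests for exact equality, so "cajun_fries" in the second row does not count as a match for "fries".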

With the OP's code, we may also wrap the comparison in any() to return a single TRUE/FALSE per row

df3[sapply(df3$main, function(x) any(x %in% "fries")),]
         functions                                     main
1 burger_function  (desc) burger, fries, coke, onion rings, cheese

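Note: This subsets only the rows of the original data, not the elements of the list. If we need to subset 'main' as well, store the row subset in out first and then filter each list element

out <- df3[sapply(df3$main, function(x) "fries" %in% x),]
out$main <- lapply(out$main, function(x) x[x %in% "fries"])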
out
                functions  main
1 burger_function  (desc) fries
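
The two steps above (keep the matching rows, then trim each list entry to the matched value) can be combined into one function. This is only a sketch: subset_by_match and its arguments are hypothetical names, not part of the original answer, and it assumes the searched column is a list column of character vectors.

```r
# Hypothetical helper: keep rows whose list column `col` contains `value`,
# then reduce each kept list entry to the matching element(s)
subset_by_match <- function(df, value, col = "main") {
  keep <- vapply(df[[col]], function(x) any(x %in% value), logical(1))
  out <- df[keep, , drop = FALSE]
  out[[col]] <- lapply(out[[col]], function(x) x[x %in% value])
  out
}

# Same data as in the question
df3 <- structure(list(functions = c("burger_function  (desc)", "steak_function  (desc)",
"chicken_function (desc)", "fish_function (desc)"), main = list(
    c("burger", "fries", "coke", "onion rings", "cheese"), c("steak",
    "mash", "jack", "gravy", "cajun_fries"), c("chicken", "salad",
    "sprite", "soup"), c("fish", "rice", "water", "garlic_bread"
    ))), row.names = c(NA, -4L), class = "data.frame")

res <- subset_by_match(df3, "fries")
res$functions
# [1] "burger_function  (desc)"
res$main[[1]]
# [1] "fries"
```

Because value is passed to %in%, it may also be a vector of several search terms, in which case a row is kept if it contains any of them.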

data

df3 <- structure(list(functions = c("burger_function  (desc)", "steak_function  (desc)", 
"chicken_function (desc)", "fish_function (desc)"), main = list(
    c("burger", "fries", "coke", "onion rings", "cheese"), c("steak", 
    "mash", "jack", "gravy", "cajun_fries"), c("chicken", "salad", 
    "sprite", "soup"), c("fish", "rice", "water", "garlic_bread"
    ))), row.names = c(NA, -4L), class = "data.frame")