Dplyr logical membership test of nested data

Time:09-30

I'm trying to create a logical test for membership of a dataframe variable in a nested column. Using mtcars as a stand-in, I can generally replicate what I'm trying to do (though the process used here may seem inefficient or circuitous, since it's not my real data):

library(dplyr)
library(tidyr)  # provides nest()
m <- mtcars %>%
  group_by(cyl) %>%
  summarize(grz = unique(gear)) %>%
  nest(data = c(cyl))

This produces a nested column of cylinders (data) for each value of the grz variable:

# A tibble: 3 x 2
    grz data                
  <dbl> <list>              
1     4 <grouped_df [2 x 1]>
2     3 <grouped_df [3 x 1]>
3     5 <grouped_df [3 x 1]>

I want to add a column testing if the value of grz is present in the nested data column, and can't seem to figure out why this doesn't work:

library(purrr)
m %>% mutate(test = map2_lgl(.x = data, .y = grz, ~ .y %in% .x))

# A tibble: 3 x 3
    grz data                 test 
  <dbl> <list>               <lgl>
1     4 <grouped_df [2 x 1]> FALSE
2     3 <grouped_df [3 x 1]> FALSE
3     5 <grouped_df [3 x 1]> FALSE

The first row of grz (value of 4) should produce a TRUE boolean, while the other two should be FALSE.

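Answer:

We need to extract the column, because the right-hand side of %in% (the table argument of match()) should be a vector. Each element of data is a one-column data frame, which is a list of columns, so .y %in% .x compares grz against the column as a whole rather than against the values inside it. Pulling the column out with .x$cyl gives a plain vector to match against: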

library(dplyr)
library(purrr)
m %>%
   mutate(test = map2_lgl(data, grz, ~ .y %in% .x$cyl))

Output:

# A tibble: 3 × 3
    grz data                 test 
  <dbl> <list>               <lgl>
1     4 <grouped_df [2 × 1]> TRUE 
2     3 <grouped_df [3 × 1]> FALSE
3     5 <grouped_df [3 × 1]> FALSE
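
To see the difference outside of the pipeline, here is a minimal base R sketch. The data frame `nested` is a hypothetical stand-in for the first element of the data column (cylinders 4 and 6), not the actual grouped tibble from the question, so treat it as an illustration of the behaviour rather than an exact reproduction:

```r
# Hypothetical stand-in for one element of the nested `data` column
nested <- data.frame(cyl = c(4, 6))

# Table is the whole data frame: the value is matched against
# the column as a single element, not against 4 and 6 individually
in_frame <- 4 %in% nested

# Table is the extracted numeric vector c(4, 6)
in_vector <- 4 %in% nested$cyl

print(c(in_frame = in_frame, in_vector = in_vector))
```

Only the second comparison looks at the individual cylinder values, which is why the answer uses .x$cyl inside map2_lgl().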