using R to mutate and map conditional on value of grouping variable

Time:02-15

Consider the following example workflow. It nests the data by the grouping variables and then maps a function over each nested data frame:

library(tidyverse)

df <- tibble(group1 = rep(letters[1:10], 100),
             group2 = rep(letters[1:10], 100),
             var1 = rnorm(1000),
             var2 = rnorm(1000)) %>%
  group_by(group1, group2) %>%
  nest() %>%
  mutate(model = map(data, ~lm(var1 ~ var2, .)))

What I want to do is mutate and map conditionally on the value of the grouping variable. That is, for example:

  mutate(model = map(data, ~lm(var1 ~ var2, .))) 

when group2 %in% c("a","b","c") and

  mutate(model = map(data, ~lm(var1 ~ 1, .))) 

when group2 is NOT in c("a","b","c").

CodePudding user response:

You can use purrr::map_if() to accomplish this. It takes a predicate (.p) and applies .f to the elements for which the predicate is TRUE and .else to the elements for which it is FALSE, like this:

purrr::map_if(
  .x = data,
  .p = ~ group2 %in% c("a", "b", "c"),
  .f = ~lm(var1 ~ var2, .x),
  .else = ~lm(var1 ~ 1, .x)
)
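To see how the predicate is applied, here is a minimal sketch on a toy list that has nothing to do with your data (the names x, is_even, by_function and by_vector are made up for illustration). As far as I know, .p can be either a predicate function/formula that is called once per element, or a logical vector of the same length as .x:

```r
library(purrr)

# Toy input, only for illustration
x <- list(1, 2, 3, 4)

# Predicate given as a formula: evaluated once per element and
# expected to return a single TRUE or FALSE each time
by_function <- map_if(x, .p = ~ .x %% 2 == 0, .f = ~ .x * 10, .else = ~ -.x)

# Predicate given as a logical vector with one entry per element of `x`
is_even <- c(FALSE, TRUE, FALSE, TRUE)
by_vector <- map_if(x, .p = is_even, .f = ~ .x * 10, .else = ~ -.x)

# Even elements are multiplied by 10, odd elements are negated
str(by_function)
```

Both calls should give the same list; the logical-vector form is handy when the condition depends on another column rather than on the element itself.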

Full reprex

Here is a reprex based on your data (I added a formula column to verify that the logic is correct):

library(dplyr, warn.conflicts = FALSE)

tibble(
  group1 = rep(letters[1:10],100),
  group2 = rep(letters[1:10],100),
  var1 = rnorm(1000),
  var2 = rnorm(1000)
) %>% 
  group_by(group1, group2) %>% 
  tidyr::nest() %>% 
  mutate(
    model = purrr::map_if(
      .x = data, 
      .p = ~ group2 %in% c("a", "b", "c"),
      .f = ~lm(var1 ~ var2, .x), 
      .else = ~lm(var1 ~ 1, .x)
    )
  ) %>%
  # Note: I add this column to verify the logic
  mutate(
    formula = purrr::map_chr(.x = model, ~.x$call %>% rlang::as_label())
  )
#> # A tibble: 10 x 5
#> # Groups:   group1, group2 [10]
#>    group1 group2 data               model  formula                             
#>    <chr>  <chr>  <list>             <list> <chr>                               
#>  1 a      a      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ var2, data = .x)
#>  2 b      b      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ var2, data = .x)
#>  3 c      c      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ var2, data = .x)
#>  4 d      d      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ 1, data = .x)   
#>  5 e      e      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ 1, data = .x)   
#>  6 f      f      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ 1, data = .x)   
#>  7 g      g      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ 1, data = .x)   
#>  8 h      h      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ 1, data = .x)   
#>  9 i      i      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ 1, data = .x)   
#> 10 j      j      <tibble [100 x 2]> <lm>   lm(formula = var1 ~ 1, data = .x)
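One thing to be aware of: the formula predicate in the reprex returns a single TRUE/FALSE only because the tibble is still grouped, so mutate() sees one row of group2 at a time. If the data were ungrouped, that formula would return one value per row instead. A sketch of a variant that I would expect to work without grouping is below; it passes a logical vector as .p. The set.seed() call, the nested name and the n_coef check column are my own additions for illustration:

```r
library(dplyr, warn.conflicts = FALSE)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

nested <- tibble(
  group1 = rep(letters[1:10], 100),
  group2 = rep(letters[1:10], 100),
  var1 = rnorm(1000),
  var2 = rnorm(1000)
) %>%
  # Nest var1/var2 directly; the result is an ungrouped tibble with 10 rows
  tidyr::nest(data = c(var1, var2)) %>%
  mutate(
    model = purrr::map_if(
      .x = data,
      # Logical vector: one TRUE/FALSE per row of the nested tibble
      .p = group2 %in% c("a", "b", "c"),
      .f = ~ lm(var1 ~ var2, data = .x),
      .else = ~ lm(var1 ~ 1, data = .x)
    ),
    # Check column: 2 coefficients (intercept + slope) vs. 1 (intercept only)
    n_coef = purrr::map_int(model, ~ length(coef(.x)))
  )

nested %>% select(group1, group2, n_coef)
```

Groups a, b and c should show two coefficients and the remaining groups one, which mirrors the formula column in the reprex above.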