Work with quoted variable names in an R summarize ifelse statement (programming in dplyr)

Time:06-22

I am assigning variable names programmatically in a custom function I am building in R. This works fine. However, if I then want to access that same variable inside an ifelse() statement, it doesn't work. Basically, how can I translate this code:

library(tidyverse)
df %>% 
  group_by(ID)  %>%
  summarise(
    first_adopted_test_1 =  ifelse(never_adopted , "never adopted",  date[first_adopted  == TRUE])
  ) %>% 
  distinct() %>%
  mutate(
    first_adopted_test_1 =  ifelse(
      first_adopted_test_1== '2019_01',
      "before 2019",
      first_adopted_test_1
    )
  ) 
# A tibble: 4 x 2
# Groups:   ID [4]
     ID first_adopted_test_1
  <dbl> <chr>               
1     1 2021_06             
2     2 never adopted       
3     3 2020_05             
4     4 before 2019    

to this:

my_cat <- "test_1"

df %>% 
  group_by(ID)  %>%
  summarise(
    !!paste0("first_adopted_", quo_name(my_cat)):= ifelse(never_adopted , "never adopted",  date[first_adopted  == TRUE])
  ) %>% 
  distinct() %>%
  ## this bottom part is what does not work
  mutate(
    !!paste0("first_adopted_", quo_name(my_cat)):=  ifelse(
      !!paste0("first_adopted_", quo_name(my_cat))== '2019_01',
      "before 2019",
      !!paste0("first_adopted_", quo_name(my_cat))
    )
  )

Example data:

df <- structure(list(ID = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 
3, 3, 4, 4, 4, 4, 4), date = c("2019_01", "2019_02", "2019_03", 
"2020_05", "2021_06", "2019_01", "2019_02", "2019_03", "2020_05", 
"2021_06", "2019_01", "2019_02", "2019_03", "2020_05", "2021_06", 
"2019_01", "2019_02", "2019_03", "2020_05", "2021_06"), value = c(0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), first_adopted = c(FALSE, FALSE, FALSE, FALSE, TRUE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, 
FALSE, TRUE, FALSE, FALSE, FALSE, FALSE), never_adopted = c(FALSE, 
FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-20L))

CodePudding user response:

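This seems to work. On the left-hand side of :=, !!paste0(...) supplies the new column's name, which is what you want. Inside ifelse(), however, the same expression unquotes to a plain character string, so the comparison tests the literal text "first_adopted_test_1" against '2019_01' instead of the column's values. Wrapping the name in get() looks the column up by name: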

df %>% 
  group_by(ID)  %>%
  summarise(
    !!paste0("first_adopted_", quo_name(my_cat)):= ifelse(never_adopted , "never adopted",  date[first_adopted  == TRUE])
  ) %>% 
  distinct() %>%
  mutate(
    !!paste0("first_adopted_", quo_name(my_cat)):=  ifelse(
      get(paste0("first_adopted_", quo_name(my_cat)))== '2019_01',
      "before 2019",
      get(paste0("first_adopted_", quo_name(my_cat)))
    )
  )
#> `summarise()` has grouped output by 'ID'. You can override using the `.groups`
#> argument.
#> # A tibble: 4 × 2
#> # Groups:   ID [4]
#>      ID first_adopted_test_1
#>   <dbl> <chr>               
#> 1     1 2021_06             
#> 2     2 never adopted       
#> 3     3 2020_05             
#> 4     4 before 2019

Created on 2022-06-21 by the reprex package (v2.0.1)
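
An alternative to get() is dplyr's .data pronoun, which looks a column up by a string name and only ever searches the data frame. The sketch below is not from the original answer: the helper name summarise_first_adoption, its argument cat_name, and the cut-down toy data are all made up for illustration. It also swaps the vectorised ifelse() inside summarise() for a scalar if/else so that each group yields a single row and distinct() is no longer needed.

```r
library(dplyr)

# Toy data (illustrative only): a cut-down version of the question's df.
toy <- data.frame(
  ID            = c(1, 1, 2, 2, 3, 3),
  date          = c("2019_01", "2020_05", "2019_01", "2020_05", "2019_01", "2020_05"),
  first_adopted = c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE),
  never_adopted = c(FALSE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

# Hypothetical helper: builds the column name once, then reuses it.
summarise_first_adoption <- function(data, cat_name) {
  col <- paste0("first_adopted_", cat_name)  # e.g. "first_adopted_test_1"

  data %>%
    group_by(ID) %>%
    summarise(
      # `!!col :=` sets the output column's name from the string.
      # One value per group: "never adopted", or the first adoption date.
      !!col := if (any(never_adopted)) "never adopted" else date[first_adopted][1],
      .groups = "drop"
    ) %>%
    mutate(
      # `.data[[col]]` reads the column whose name is stored in `col`.
      !!col := if_else(.data[[col]] == "2019_01", "before 2019", .data[[col]])
    )
}

result <- summarise_first_adoption(toy, "test_1")
print(result)
```

Compared with get(), .data[[col]] cannot accidentally pick up an object of the same name from the calling environment, which makes it a safer choice inside functions.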
