How to refer to an argument as character in dplyr filter inside a function

Time:03-22

I am trying to build a function that calculates percentages for certain variables, but I am struggling to refer to an argument as a character value (a quoted string), which I need inside a filter verb. I have the dataset below.

e1_done <- structure(list(koen_new = c("Kvinde", "Kvinde", "Mand", "Kvinde", 
                                "Mand", "Mand", "Kvinde", "Kvinde", "Mand", "Mand", "Kvinde", 
                                "Kvinde", "Kvinde", "Mand", "Mand", "Mand", "Kvinde", "Kvinde", 
                                "Mand", "Kvinde", "Mand", "Mand", "Kvinde", "Kvinde", "Mand", 
                                "Mand", "Kvinde", "Mand", "Kvinde", "Kvinde", "Mand", "Kvinde", 
                                "Kvinde", "Mand", "Mand", "Kvinde", "Kvinde", "Mand", "Mand", 
                                "Mand", "Mand", "Mand", "Mand", "Mand", "Mand", "Kvinde", "Mand", 
                                "Kvinde", "Kvinde", "Kvinde"), 
frvlg_1 = structure(c(0, 0, 0, 
                                                                                     0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                                                                                     0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 
                                                                                     0, 0, 0, 0, 0))), row.names = c(NA, -50L), class = c("tbl_df", "tbl", "data.frame"))

    # A tibble: 50 × 2
       koen_new frvlg_1
       <chr>      <dbl>
     1 Kvinde         0
     2 Kvinde         0
     3 Mand           0
     4 Kvinde         0
     5 Mand           0
     6 Mand           0
     7 Kvinde         1
     8 Kvinde         0
     9 Mand           0
    10 Mand           0
    # … with 40 more rows

I have built the following function:

per.gender <- function(x) {
  e1_done %>% 
    group_by(koen_new) %>% 
    mutate(total_n_gender = n()) %>% 
    group_by(koen_new,{{x}}) %>% 
    mutate(n_frvl = n()) %>% 
    dplyr::select(n_frvl, total_n_gender) %>% 
    mutate(procentandel = n_frvl/total_n_gender) %>% 
    distinct(koen_new, {{x}}, procentandel,.keep_all = TRUE) %>% 
    filter({{x}} == 1) %>% 
    ungroup() %>% 
    select(koen_new, procentandel) 
}

Which produces what I want:

per.gender(frvlg_1) 

# A tibble: 2 × 2
  koen_new procentandel
  <chr>           <dbl>
1 Kvinde         0.0417
2 Mand           0.115 

However, I also want to rename the column procentandel to a specific value for each variable the function is run on. Specifically, I want to look up the variable in a codebook stored in another tibble, shown below:

codebook_e1 <- structure(list(Label = c("Frvlg: Kultur (Fx Museer, Lokalhistoriske Arkiver, Sangkor, Teater)", 
"Frvlg: Idræt (Fx Sportsklubber, Danseforeninger, Svømmehaller)", 
"Frvlg: Fritid i Øvrigt (Fx Hobbyforeninger, Slægtsforskning, Spejder)"
), Variable = c("frvlg_1", "frvlg_2", "frvlg_3")), row.names = c(NA, 
-3L), class = c("tbl_df", "tbl", "data.frame"))


# A tibble: 3 × 2
  Label                                                                 Variable
  <chr>                                                                 <chr>   
1 Frvlg: Kultur (Fx Museer, Lokalhistoriske Arkiver, Sangkor, Teater)   frvlg_1 
2 Frvlg: Idræt (Fx Sportsklubber, Danseforeninger, Svømmehaller)        frvlg_2 
3 Frvlg: Fritid i Øvrigt (Fx Hobbyforeninger, Slægtsforskning, Spejder) frvlg_3 

I can look up this value with the following, which returns the character value I want to rename the column procentandel to:

codebook_e1 %>% filter(Variable == "frvlg_1") %>% select(Label) %>% pull()
[1] "Frvlg: Kultur (Fx Museer, Lokalhistoriske Arkiver, Sangkor, Teater)"

However, I don't know how to refer to x as a character value in the filter verb inside a function in order to look it up in the codebook. I have tried various eval functions and the like, but none of them worked.

It works if I add a second argument that is x in quotation marks, but I want the function to take only one argument.

I hope this question is clear enough!

Answer:

Use rlang::ensym() to capture x as a symbol, which you can then convert using as.character():

library(tidyverse)

per.gender <- function(x) {
  new_name <- codebook_e1 %>% 
    filter(Variable == as.character(ensym(x))) %>% 
    select(Label) %>% 
    pull()

  e1_done %>% 
    group_by(koen_new) %>% 
    mutate(total_n_gender = n()) %>% 
    group_by(koen_new,{{x}}) %>% 
    mutate(n_frvl = n()) %>% 
    select(n_frvl, total_n_gender) %>% 
    mutate(procentandel = n_frvl/total_n_gender) %>% 
    distinct(koen_new, {{x}}, procentandel,.keep_all = TRUE) %>% 
    filter({{x}} == 1) %>% 
    ungroup() %>% 
    select(koen_new, !!new_name := procentandel) 
}

per.gender(frvlg_1) 

Result:

# A tibble: 2 x 2
  koen_new `Frvlg: Kultur (Fx Museer, Lokalhistoriske Arkiver, Sangkor, Teater)`
  <chr>                                                                    <dbl>
1 Kvinde                                                                  0.0417
2 Mand                                                                    0.115
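To see the capture step in isolation, here is a minimal sketch. The helper name arg_name is hypothetical and not part of the answer above, and it uses rlang::as_name() in place of as.character(); for a plain symbol both should give the same string.

```r
library(rlang)

# Hypothetical helper (illustration only): return the name of the bare
# column passed as `x`, as a character string.
arg_name <- function(x) {
  # ensym() captures the argument as a symbol without evaluating it;
  # as_name() converts that symbol to a string.
  as_name(ensym(x))
}

arg_name(frvlg_1)    # "frvlg_1"; no object called frvlg_1 needs to exist
arg_name("frvlg_1")  # ensym() also accepts a string, so this gives "frvlg_1" too
```

Because ensym() only accepts a symbol or a string, passing a more complex expression such as frvlg_1 + 1 raises an error rather than silently producing an odd label.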

Also note the use of the !! and := operators in the final select() statement. Together they make select() use the value stored in new_name as the column name; with a plain new_name = procentandel, the column would simply be named "new_name".
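The difference is easiest to see on a toy example. The sketch below uses made-up data and a shortened label (both are assumptions for illustration, not the real e1_done or codebook_e1), and it also shows one possible alternative that renames from a named character vector with all_of().

```r
library(dplyr)

# Toy data and label, made up for this sketch
df <- tibble(koen_new = c("Kvinde", "Mand"), procentandel = c(0.25, 0.5))
new_name <- "Frvlg: Kultur"

# Plain `=`: the left-hand side is taken literally
literal <- df %>% select(koen_new, new_name = procentandel)
names(literal)   # column names: koen_new, new_name

# `!!` unquotes the string, and `:=` permits an unquoted left-hand side
unquoted <- df %>% select(koen_new, !!new_name := procentandel)
names(unquoted)  # column names: koen_new, Frvlg: Kultur

# Alternative: rename from a named vector of the form c(new = "old")
renamed <- df %>% rename(all_of(setNames("procentandel", new_name)))
names(renamed)   # column names: koen_new, Frvlg: Kultur
```

The named-vector form can be convenient when several columns need labels from a lookup table at once, since the whole vector can be built from the codebook in one step.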
