Correct way to use variable name with subset within purrr::pmap in R?

I have a tibble called description:

description <- structure(list(col1 = "age", col2 = "> 7 months", col3 = "<= 7 months"), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

> description
# A tibble: 1 × 3
  col1  col2       col3       
  <chr> <chr>      <chr>      
1 age   > 7 months <= 7 months

And a data frame called my_df:

my_df <- structure(list(ID = c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6"
), age = structure(c(1L, 2L, 1L, 2L, 2L, 2L), .Label = c("<= 7 months", 
"> 7 months"), class = "factor")), row.names = c("ID1", "ID2", 
"ID3", "ID4", "ID5", "ID6"), class = "data.frame")

> my_df
     ID         age
ID1 ID1 <= 7 months
ID2 ID2  > 7 months
ID3 ID3 <= 7 months
ID4 ID4  > 7 months
ID5 ID5  > 7 months
ID6 ID6  > 7 months

I currently have the following function:

updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            age == col2
        )
})

This produces:

[[1]]
     ID        age
ID2 ID2 > 7 months
ID4 ID4 > 7 months
ID5 ID5 > 7 months
ID6 ID6 > 7 months

But I would like to use the variable col1 instead of hard-coding age. I have tried the following, but none of them work:

updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            col1 == col2
        )
})


updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            !!as.name(col1) == col2
        )
})

updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            !!col1 == col2
        )
})

Where am I going wrong / what is the correct way to use col1 instead of age?

Answer:

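subset is a base R function and does not understand tidy-eval operators: there, !! is just double logical negation, and col1 == col2 compares the two strings themselves rather than the column named by col1. If you want to stay within subset, you need to build the full expression and evaluate it, e.g.: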

purrr::pmap(description, function(col1, col2, col3) {
  subset(
    my_df,
    eval(rlang::expr(!!rlang::ensym(col1) == col2))
  )
})
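
As a minimal base R sketch, assuming col1 always holds the name of an existing column as a string, the column can also be looked up with [[ ]], so no expression has to be built at all. The data is rebuilt from the question as plain data frames to keep the example self-contained; the result names by_index and by_subset are only illustrative.

```r
# Example data rebuilt from the question (plain data frames, no tibble needed)
description <- data.frame(
  col1 = "age", col2 = "> 7 months", col3 = "<= 7 months",
  stringsAsFactors = FALSE
)
my_df <- data.frame(
  ID  = paste0("ID", 1:6),
  age = factor(c("<= 7 months", "> 7 months", "<= 7 months",
                 "> 7 months", "> 7 months", "> 7 months")),
  row.names = paste0("ID", 1:6),
  stringsAsFactors = FALSE
)

# col1 is a string, so fetch the column by name and index the rows directly
by_index <- purrr::pmap(description, function(col1, col2, col3) {
  my_df[my_df[[col1]] == col2, ]
})

# The same lookup also works as the condition passed to subset()
by_subset <- purrr::pmap(description, function(col1, col2, col3) {
  subset(my_df, my_df[[col1]] == col2)
})

# Both keep the rows ID2, ID4, ID5 and ID6
by_index[[1]]$ID
by_subset[[1]]$ID
```

One caveat: my_df[cond, ] keeps rows where the condition is NA (as all-NA rows), while subset drops them, so the two forms can differ if the column contains missing values.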

If you're open to using dplyr::filter instead of subset, you can do:

purrr::pmap(description, function(col1, col2, col3) {
  dplyr::filter(
    my_df,
    !!rlang::ensym(col1) == col2
  )
})
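
Since col1 already holds the column name as a string, two other spellings may read more directly in dplyr: the .data pronoun, or rlang::sym, which converts a string to a symbol (ensym is mainly meant for capturing an argument as the caller typed it). The following is a sketch rather than the only way to do it; the data is rebuilt from the question, and the names with_pronoun and with_sym are only illustrative.

```r
# Example data rebuilt from the question (plain data frames, no tibble needed)
description <- data.frame(
  col1 = "age", col2 = "> 7 months", col3 = "<= 7 months",
  stringsAsFactors = FALSE
)
my_df <- data.frame(
  ID  = paste0("ID", 1:6),
  age = factor(c("<= 7 months", "> 7 months", "<= 7 months",
                 "> 7 months", "> 7 months", "> 7 months")),
  row.names = paste0("ID", 1:6),
  stringsAsFactors = FALSE
)

# Option 1: the .data pronoun looks a column up by its string name
with_pronoun <- purrr::pmap(description, function(col1, col2, col3) {
  dplyr::filter(my_df, .data[[col1]] == col2)
})

# Option 2: convert the string to a symbol with sym() and unquote it with !!
with_sym <- purrr::pmap(description, function(col1, col2, col3) {
  dplyr::filter(my_df, !!rlang::sym(col1) == col2)
})

# Both keep the rows with ID2, ID4, ID5 and ID6
with_pronoun[[1]]$ID
with_sym[[1]]$ID
```

The .data form has the advantage that it fails with an error if col1 names a column that does not exist, instead of silently picking up a variable of the same name from the surrounding environment.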