Correct way to use variable name with subset within purrr::pmap in R?

Time:06-11

I have a tibble called description:

description <- structure(list(col1 = "age", col2 = "> 7 months", col3 = "<= 7 months"), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

> description
# A tibble: 1 × 3
  col1  col2       col3       
  <chr> <chr>      <chr>      
1 age   > 7 months <= 7 months

And a data frame called my_df:

my_df <- structure(list(ID = c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6"
), age = structure(c(1L, 2L, 1L, 2L, 2L, 2L), .Label = c("<= 7 months", 
"> 7 months"), class = "factor")), row.names = c("ID1", "ID2", 
"ID3", "ID4", "ID5", "ID6"), class = "data.frame")

> my_df
     ID         age
ID1 ID1 <= 7 months
ID2 ID2  > 7 months
ID3 ID3 <= 7 months
ID4 ID4  > 7 months
ID5 ID5  > 7 months
ID6 ID6  > 7 months

I currently have the following code:

updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            age == col2
        )
})

This produces:

[[1]]
     ID        age
ID2 ID2 > 7 months
ID4 ID4 > 7 months
ID5 ID5 > 7 months
ID6 ID6 > 7 months

But I would like to refer to the column through the variable col1 instead of hard-coding age. I have tried the following, but none of these attempts work:

updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            col1 == col2
        )
})


updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            !!as.name(col1) == col2
        )
})

updated_df <- purrr::pmap(description, function(col1, col2, col3) {
        subset(
            my_df,
            !!col1 == col2
        )
})

Where am I going wrong / what is the correct way to use col1 instead of age?

Answer:

If you want to keep using subset, you need to build the full expression first and then evaluate it, for example:

purrr::pmap(description, function(col1, col2, col3) {
  subset(
    my_df,
    eval(rlang::expr(!!rlang::ensym(col1) == col2))
  )
})
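For context on why the original attempts return no matching rows: subset evaluates its condition inside my_df, where col1 is not a column, so col1 == col2 ends up comparing the string "age" with "> 7 months". Base R also treats !! as plain double negation rather than rlang unquoting. A minimal sketch that avoids non-standard evaluation altogether is to look the column up by name with [[. The helper name filter_by_name is hypothetical, the data is rebuilt from the question, and base Map stands in for purrr::pmap only so the snippet runs without extra packages:

```r
# Example data rebuilt from the question (plain data frames are enough here).
description <- data.frame(col1 = "age", col2 = "> 7 months",
                          col3 = "<= 7 months", stringsAsFactors = FALSE)
my_df <- data.frame(
  ID  = paste0("ID", 1:6),
  age = factor(c("<= 7 months", "> 7 months", "<= 7 months",
                 "> 7 months", "> 7 months", "> 7 months")),
  row.names = paste0("ID", 1:6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: col1 is a column name (string), col2 the value to keep.
# my_df[[col1]] extracts the column by name, so no quoting tricks are needed.
filter_by_name <- function(col1, col2, col3) {
  my_df[my_df[[col1]] == col2, , drop = FALSE]
}

# Base R stand-in for purrr::pmap(description, filter_by_name).
updated_df <- unname(Map(filter_by_name,
                         description$col1, description$col2, description$col3))
```

The same filter_by_name function can be passed to purrr::pmap(description, filter_by_name) unchanged, since its arguments match the column names of description.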

If you're open to using dplyr::filter instead of subset, you can unquote the symbol directly, because filter supports !!:

purrr::pmap(description, function(col1, col2, col3) {
  dplyr::filter(
    my_df,
    !!rlang::ensym(col1) == col2
  )
})
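As a possible variation, when the column name is already held as a string, dplyr's .data pronoun can look it up by name, so no conversion to a symbol is needed. The sketch below assumes the dplyr and purrr packages are installed and rebuilds the example data from the question:

```r
# Example data rebuilt from the question.
description <- data.frame(col1 = "age", col2 = "> 7 months",
                          col3 = "<= 7 months", stringsAsFactors = FALSE)
my_df <- data.frame(
  ID  = paste0("ID", 1:6),
  age = factor(c("<= 7 months", "> 7 months", "<= 7 months",
                 "> 7 months", "> 7 months", "> 7 months")),
  stringsAsFactors = FALSE
)

# .data[[col1]] refers to the column of my_df whose name is stored in col1.
updated_df <- purrr::pmap(description, function(col1, col2, col3) {
  dplyr::filter(my_df, .data[[col1]] == col2)
})
```

Each element of updated_df is the filtered data frame for one row of description.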