Column names as variables in dplyr: select vs. filter

Time:06-20

I am passing a variable to a function; the variable holds the name of the column to filter on. My understanding is that the embrace operator {{}} resolves a variable to the actual column name.

As such, the select statements below work as expected.

select(data,{{var}})

But filter statements such as the following do not:

filter(data, {{var}} == "STRING")

Reprex:

The example reflects the data I am dealing with, where the column name also appears as a value in the column. Note the last line and its error message, which suggests that colName does get resolved.

suppressMessages(library(tidyverse))

data <- tribble(
  ~NAME, ~Value,
  "NAME", 1,
  "NOTNAME", 2,
  "NAME", 3,
  "NOTNAME", 4,
  
)

colName = "NAME"

# both give same result
select(data,NAME)
select(data,{{colName}})
select(data,NAME) == select(data,{{colName}})

#these give different results
filter(data,NAME == colName)
filter(data, {{colName}} == colName)
filter(data, {{colName}} == "NAME")

# Error message suggests the {{colName}} gets resolved ok
filter(data,NAME == colName) == filter(data, {{colName}} == colName)

Many thanks

CodePudding user response:

Two potential solutions

  1. Using get
colName = "NAME"
filter(data, get({{colName}}) == colName)
  2. Using rlang::sym
colName = "NAME"
colSym <- rlang::sym(colName)
filter(data, !!colSym == colName)

Not sure which is best.
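A third option that may be worth considering is the `.data` pronoun, which dplyr provides for looking up a column by a name held in a string. The sketch below rebuilds the data from the question and also shows why the embraced version returns every row: as far as I can tell, `{{ colName }}` only injects the string, so filter ends up comparing "NAME" with "NAME". The result names (`all_rows`, `by_pronoun`, `selected`) are made up for illustration.

```r
library(dplyr)

# Same data as in the question
data <- tibble::tribble(
  ~NAME,     ~Value,
  "NAME",    1,
  "NOTNAME", 2,
  "NAME",    3,
  "NOTNAME", 4
)

colName <- "NAME"

# colName is not a column, so the embraced expression evaluates to the
# string "NAME"; "NAME" == "NAME" is TRUE and every row is kept
all_rows <- filter(data, {{ colName }} == "NAME")

# .data[[colName]] fetches the column whose name is stored in colName
by_pronoun <- filter(data, .data[[colName]] == "NAME")

# all_of() is the tidy-select counterpart for select()
selected <- select(data, all_of(colName))
```

Here `by_pronoun` should contain only the two rows where NAME equals "NAME", while `all_rows` keeps all four.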

CodePudding user response:

A clunky but simple workaround is to select the column under a new, known name and compare against that. Note that this keeps only that one column:

colName = "NAME"
data %>%
  select(compare_col = {{ colName }}) %>%
  filter(compare_col == "NAME")


# A tibble: 2 × 1
  compare_col
  <chr>  
1 NAME   
2 NAME 
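If the column can be passed to the function unquoted rather than as a string, embracing does work inside filter, since that is the case `{{ }}` is designed for. A minimal sketch, assuming a hypothetical wrapper called `filter_equal` (not part of dplyr) and the same data as in the question:

```r
library(dplyr)

data <- tibble::tribble(
  ~NAME,     ~Value,
  "NAME",    1,
  "NOTNAME", 2,
  "NAME",    3,
  "NOTNAME", 4
)

# Hypothetical helper: `col` is supplied as a bare column name, so
# {{ col }} forwards the caller's expression to filter()
filter_equal <- function(df, col, value) {
  filter(df, {{ col }} == value)
}

matched <- filter_equal(data, NAME, "NAME")
```

Unlike the select-and-rename workaround, this keeps the Value column as well, but it only helps when the caller can write the column name directly instead of holding it in a string.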