I am working with a large dataset named "BRFSS2015", and my assignment requires that I find the average "Number of Days Mental Health Not Good" for respondents in Pennsylvania who have numeric data. I started with the following code, but an error keeps appearing because the "_" in _STATE will not work:

  filter(_STATE == "Pennsylvania")

Error: unexpected input in: "BRFSS2015%>% filter(_"

I have also tried backticks and gsub() on it but to no avail. Please help.

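The issue is the non-syntactic column name, e.g. a name that starts with a digit or an underscore, or that contains characters other than letters, digits, dots, and underscores. R cannot parse such a name when it is typed bare, so wrap the column name in backticks: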

  filter(`_STATE` == "Pennsylvania")
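For the averaging step in the question, here is a minimal sketch on a toy data frame. The column name MENTHLTH, the toy values, and reading "numeric data" as "not missing" are all assumptions made for illustration; the real BRFSS2015 columns and codes may differ.

```r
library(dplyr)

# Toy stand-in for BRFSS2015 (hypothetical values; MENTHLTH is an assumed column name)
brfss_toy <- tibble(
  `_STATE` = c("Pennsylvania", "Pennsylvania", "Pennsylvania", "Ohio"),
  MENTHLTH = c(2, NA, 10, 30)
)

# Keep Pennsylvania rows with a non-missing value, then average them
pa_mean <- brfss_toy %>%
  filter(`_STATE` == "Pennsylvania", !is.na(MENTHLTH)) %>%
  summarise(mean_days = mean(MENTHLTH))

pa_mean
```

If the real data uses special codes for "refused" or "don't know", those would need to be filtered out as well before taking the mean.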

You have to surround the variable name with backticks. Here is a small example:


library(dplyr)  # provides tibble(), filter(), rename(), and %>%

tb = tibble(value = rnorm(5), "_state" = letters[1:5])
tb

#> # A tibble: 5 x 2
#>     value `_state`
#>     <dbl> <chr>   
#> 1 -0.769  a       
#> 2  0.273  b       
#> 3 -1.32   c       
#> 4 -0.0795 d       
#> 5 -0.701  e

# Without backticks
tb %>% 
  filter(_state == "a")

#> Error: unexpected input in:
#> "tb %>% 
#>   filter(_"

# With backticks
tb %>% 
  filter(`_state` == "a")
#> # A tibble: 1 x 2
#>    value `_state`
#>    <dbl> <chr>   
#> 1 -0.769 a
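If the column name is only available as a string, backticks cannot be used directly. A small sketch of two alternatives follows: the .data pronoun inside filter(), and plain base R indexing with [[ ]]. The toy tibble and the variable name col are made up for illustration.

```r
library(dplyr)

tb <- tibble(value = c(0.1, 0.2, 0.3), `_state` = c("a", "b", "c"))

# Column name held in a string (hypothetical variable)
col <- "_state"

# dplyr: look the column up by name through the .data pronoun
by_pronoun <- tb %>%
  filter(.data[[col]] == "a")

# Base R: [[ ]] takes the name as a string, so no backticks are needed
by_base <- tb[tb[["_state"]] == "a", ]
```

Both approaches should select the same single row here; which one to prefer is mostly a matter of whether the rest of the code is already a dplyr pipeline.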

Or avoid this issue by renaming the variable, e.g. with rename.

tb %>% 
  rename(state = `_state`)
#> # A tibble: 5 x 2
#>    value state
#>    <dbl> <chr>
#> 1 -0.573 a    
#> 2 -1.43  b    
#> 3 -1.26  c    
#> 4 -0.313 d    
#> 5  1.06  e

Created on 2022-04-10 by the reprex package (v0.3.0)
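If a data set has many columns that start with an underscore, renaming them one by one gets tedious. A possible sketch, assuming dplyr 1.0.0 or later (for rename_with()) and a made-up tibble, strips the leading underscore from every such column in one step:

```r
library(dplyr)

# Toy tibble with two non-syntactic names (hypothetical columns)
tb <- tibble(value = c(1.5, 2.5), `_state` = c("a", "b"), `_id` = 1:2)

# Apply sub() only to the columns whose names start with "_"
cleaned <- tb %>%
  rename_with(~ sub("^_", "", .x), starts_with("_"))

names(cleaned)
#> [1] "value" "state" "id"
```

This only works cleanly when stripping the underscore does not produce duplicate names or names that start with a digit, so it is worth checking names(cleaned) afterwards.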

