In R, is there a way to filter based on a variable that has "_" in its name?

Time:04-10

I am working with a large dataset named "BRFSS2015", and my assignment requires that I find the average "Number of Days Mental Health Not Good" for those in Pennsylvania who have numeric data. I started with the following code, but an error keeps appearing because the "_" in _STATE will not work:

BRFSS2015%>%
  filter(_STATE == "Pennsylvania")

Error: unexpected input in: "BRFSS2015%>% filter(_"

I have also tried backticks and gsub() on it but to no avail. Please help.

CodePudding user response:

The issue is the non-syntactic column name, i.e. a name that starts with a digit, an underscore, or another punctuation character. R cannot parse such a name as a bare symbol, so wrap the column name in backticks:

library(dplyr)
BRFSS2015%>%
  filter(`_STATE` == "Pennsylvania")
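
If typing backticks is awkward, the column can also be looked up by its name as a string. The following is only a minimal sketch: `df` and its `days` column are a made-up stand-in for BRFSS2015, not the real data.

```r
library(dplyr)

# Hypothetical stand-in for BRFSS2015; the values are invented for illustration.
df <- tibble("_STATE" = c("Pennsylvania", "Ohio", "Pennsylvania"),
             days = c(2, 5, 4))

# .data[["..."]] looks the column up by its name as a string,
# so no backticks are needed even for a non-syntactic name.
pa <- df %>%
  filter(.data[["_STATE"]] == "Pennsylvania")

# Base R equivalent: [[ ]] accepts any column name as a string.
pa_base <- df[df[["_STATE"]] == "Pennsylvania", ]
```

Both `pa` and `pa_base` keep only the two Pennsylvania rows of the toy data.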

CodePudding user response:

You have to surround the variable name with backticks. Here is a small example:

library(dplyr)

tb = tibble(value = rnorm(5), "_state" = letters[1:5])

tb
#> # A tibble: 5 x 2
#>     value `_state`
#>     <dbl> <chr>   
#> 1 -0.769  a       
#> 2  0.273  b       
#> 3 -1.32   c       
#> 4 -0.0795 d       
#> 5 -0.701  e

# Without backticks
tb %>% 
  filter(_state == "a")

#> Error: unexpected input in:
#> "tb %>% 
#>   filter(_"

# With backticks
tb %>% 
  filter(`_state` == "a")
#> # A tibble: 1 x 2
#>    value `_state`
#>    <dbl> <chr>   
#> 1 -0.769 a

Or avoid the issue altogether by renaming the variable, e.g. with rename():

tb %>% 
  rename(state = `_state`)
#> # A tibble: 5 x 2
#>    value state
#>    <dbl> <chr>
#> 1 -0.573 a    
#> 2 -1.43  b    
#> 3 -1.26  c    
#> 4 -0.313 d    
#> 5  1.06  e

Created on 2022-04-10 by the reprex package (v0.3.0)
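
If several columns start with an underscore, renaming them one at a time gets tedious. Here is a rough sketch that strips the leading underscore from all of them at once with rename_with() (available in dplyr 1.0.0 and later). The `_year` column is made up for the example, and the approach assumes the stripped names do not collide with existing column names.

```r
library(dplyr)

# Toy data; "_year" is a hypothetical second column with a leading underscore.
tb <- tibble(value = 1:3, "_state" = c("a", "b", "c"), "_year" = 2015L)

# Strip a leading underscore from every column name that starts with one.
clean <- tb %>%
  rename_with(~ sub("^_", "", .x), .cols = starts_with("_"))

names(clean)
#> [1] "value" "state" "year"

# The renamed columns can now be used without backticks.
only_a <- clean %>%
  filter(state == "a")
```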
