Numbers as column name in tibbles: problem when using select()

Time:09-26

I am trying to select some columns by name, and the names are numbers. This is the code:

df2 <- df1 %>% select(`Year`, all_of(append(list1, list2)))

I get this error:

Error: Can't subset columns that don't exist.
x Locations 61927, 169014, 75671, 27059, 225963, etc. don't exist.
i There are only 5312 columns.

I think the error is due to the column names being numbers. How do I solve it? (I want to keep the column names as numbers.)

CodePudding user response:

If you pass a number to select(), it is treated as a column position, but you can pass the number as a character string instead, and then it is matched against the column names.

Example

library(dplyr)

df <-  tibble(`2020` = NA,`2021` = NA, "var" = NA)

df

# A tibble: 1 x 3
  `2020` `2021` var  
  <lgl>  <lgl>  <lgl>
1 NA     NA     NA 

Using a number inside select()

This gives an error: the tibble has only 3 columns, and select(2020) looks for the 2020th column.

df %>% 
  select(2020)

Error: Can't subset columns that don't exist.
x Location 2020 doesn't exist.
i There are only 3 columns.

Using the number as a string inside select()

df %>% 
  select("2020")

# A tibble: 1 x 1
  `2020`
  <lgl> 
1 NA  
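
The same idea can be carried over to the code in the question. This is only a minimal sketch: it assumes list1 and list2 hold the wanted column names as numeric vectors, and the data and values below are made up for illustration. Converting them with as.character() before all_of() makes select() match by name rather than by position.

```r
library(dplyr)

# Hypothetical data: column names that look like numbers
df1 <- tibble(Year = 2000, `61927` = 1, `169014` = 2, `75671` = 3)

# Assumed shape of the question's inputs: numeric vectors of column names
list1 <- c(61927, 169014)
list2 <- c(75671)

# as.character() turns the numbers into strings, so all_of() matches names
wanted <- as.character(c(list1, list2))
df2 <- df1 %>% select(Year, all_of(wanted))

names(df2)
```

If list1 and list2 already contain strings, the as.character() call is harmless and can stay.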

CodePudding user response:

We can use any_of() with paste(): the numeric values are converted to character, so they are matched against column names, and any names that are missing from the data are skipped instead of throwing an error.

library(dplyr)
df1 %>% 
    select(Year, any_of(paste(c(list1, list2))))
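
To illustrate the difference in behaviour, here is a small sketch with made-up data, again assuming list1 and list2 are numeric vectors. One of the requested names (2022) has no matching column: any_of() drops it quietly, while all_of() would stop with an error.

```r
library(dplyr)

# Hypothetical data with numeric-looking column names
df1 <- tibble(Year = 2000, `2020` = 1, `2021` = 2)

# Assumed inputs; there is no column named 2022
list1 <- c(2020, 2021)
list2 <- c(2022)

# paste() converts the numbers to strings; any_of() skips the missing "2022"
res <- df1 %>%
  select(Year, any_of(paste(c(list1, list2))))

names(res)

# all_of() is strict: the same names raise an error because "2022" is missing
strict <- tryCatch(
  df1 %>% select(Year, all_of(paste(c(list1, list2)))),
  error = function(e) "error"
)
```

Which helper fits depends on whether a missing column should be treated as a mistake (all_of) or simply ignored (any_of).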

CodePudding user response:

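You can clean the column names using the janitor package. Note that clean_names() prefixes names that start with a digit (2020 becomes x2020), so the columns will no longer have purely numeric names.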

df1 <- janitor::clean_names(df1)