Referencing numerical column names as variables in R

Time:11-09

I have a data frame whose column names are numeric:

asd <- data.frame(`2021`=rnorm(3), `2`=head(letters,3), check.names=FALSE)

But when I reference a column name through a variable, it returns an error:

x = 2021
asd[x]
Error in `[.data.frame`(asd, x) : undefined columns selected

Expected output

x = 2021
asd[x]
        2021 
1  1.5570860
2 -0.8807877
3 -0.7627930

CodePudding user response:

Reference it as a string:

x = "2021"
asd[,x]
[1] -0.2317928 -0.1895905  1.2514369
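
To illustrate why the string works, here is a minimal sketch, assuming the same asd as in the question (set.seed is added only to make the run reproducible). A numeric index is treated as a column position, while a character index is matched against the column names; as.character() is one way to convert an existing numeric variable.

```r
set.seed(1)  # assumption: seed added only for reproducibility
asd <- data.frame(`2021` = rnorm(3), `2` = head(letters, 3), check.names = FALSE)

x <- 2021

# A numeric index means "column number 2021", which does not exist,
# so the call fails; tryCatch captures the error message instead of stopping.
by_position <- tryCatch(asd[x], error = function(e) conditionMessage(e))

# Position 1 is the column that happens to be named "2021".
first_col_name <- names(asd[1])

# A character index is looked up in names(asd).
by_name <- asd[as.character(x)]
names(by_name)
```

Note that asd[2] would succeed, but only because it selects the second column by position, not because the column is named 2.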

CodePudding user response:

Use deparse to convert the number to a string (with x = 2021, as in the question):

asd[,deparse(x)]
[1]  1.3445921 -0.3509493  0.5028844
asd[deparse(x)]
        2021
1  1.3445921
2 -0.3509493
3  0.5028844
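
One caveat with deparse, shown as a small sketch: it returns the R source representation of the value, so the result depends on how x was created. The variable names below are illustrative only. If x may be an integer, as.character() is probably the safer conversion.

```r
x_double  <- 2021   # plain numeric (double)
x_integer <- 2021L  # integer literal

name_from_double  <- deparse(x_double)        # "2021"
name_from_integer <- deparse(x_integer)       # "2021L", which is not a column name in asd
name_safe         <- as.character(x_integer)  # "2021"
```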

CodePudding user response:

A bit more detail: a name such as 2021 is not syntactically valid in R, because an unquoted 2021 is parsed as a number rather than as a name. You therefore cannot refer to such a column by name without quoting it, either as a string or with backticks.

You can force R to interpret a number as a column name by wrapping it in backticks, i.e. asd$`2021` rather than asd$2021:

> asd$`2021`
[1] -0.634175 -1.612425  1.164135
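
The backtick form needs the name written out literally. When the name is held in a variable, as in the question, a sketch of the equivalent lookup uses [[ ]] with a string (same asd as above; the seed is an assumption for reproducibility):

```r
set.seed(1)  # assumption: seed added only for reproducibility
asd <- data.frame(`2021` = rnorm(3), `2` = head(letters, 3), check.names = FALSE)

x <- 2021

# Literal name, protected by backticks.
col_backtick <- asd$`2021`

# Name taken from a variable: convert to a string and use [[ ]].
col_variable <- asd[[as.character(x)]]

identical(col_backtick, col_variable)  # TRUE
```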

Generally, you can protect yourself against syntactically invalid column names by repairing them:

#(in base R)
names(asd) <- make.names(names(asd))
names(asd)
[1] "X2021" "X2"

#(or in tidyverse, starting again from the original asd)
library(tibble)
asd <- as_tibble(asd, .name_repair="universal")
New names:
* `2021` -> ...2021
* `2` -> ...2
# A tibble: 3 x 2
  ...2021 ...2
    <dbl> <chr>
1  -0.634 a
2  -1.61  b
3   1.16  c
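
Once the names have been repaired with the base R make.names() approach, the column is called X2021 rather than 2021, so a lookup through a variable has to apply the same transformation. A minimal sketch, assuming the asd from the question (col_name is an illustrative variable name):

```r
set.seed(1)  # assumption: seed added only for reproducibility
asd <- data.frame(`2021` = rnorm(3), `2` = head(letters, 3), check.names = FALSE)
names(asd) <- make.names(names(asd))

x <- 2021

# Apply the same repair to the lookup key so it matches the new name.
col_name <- make.names(as.character(x))  # "X2021"

by_variable <- asd[[col_name]]
by_dollar   <- asd$X2021  # now a syntactic name, so no backticks are needed
```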