Data frames and data.tables behave differently when columns are selected with a variable that holds the column names. Is there a single expression that works for both data structures? Why do I care? I have some user-defined functions originally written for data frames, and I would like them to work for both data frames and data.tables.
Here is an example. Make a data frame and a data.table.
> fr <- data.frame(a = 1, b = 1)
> tb <- data.table::data.table(a = 1, b = 1)
Here is a vector with a column name.
> v <- 'a'
It can be used to extract a column from the data frame like this:
> fr[, v]
[1] 1
But data.tables require something else.
> tb[, v]
Error in `[.data.table`(tb, , v) :
j (the 2nd argument inside [...]) is a single symbol but column name 'v' is not found. Perhaps you
intended DT[, ..v]. This difference to data.frame is deliberate and explained in FAQ 1.1.
> tb[, ..v]
a
1: 1
> tb[, v, with = FALSE]
a
1: 1
Neither of the options that work with data.tables works with data frames.
> fr[, ..v]
Error in `[.data.frame`(fr, , ..v) : object '..v' not found
> fr[, v, with = FALSE]
Error in `[.data.frame`(fr, , v, with = FALSE) :
unused argument (with = FALSE)
Is there an approach that works for both data frames and data.tables?
I know I can use this list-style indexing for both:
> fr[[v]]
[1] 1
> tb[[v]]
[1] 1
But that only works if I don't need to include a row index as well.
From FAQ 1.5 I would think I could change an option to get the desired behavior. That could be a solution, but setting the option does not change the result as I expected.
> options(datatable.WhenJisSymbolThenCallingScope=TRUE)
> options()$datatable.WhenJisSymbolThenCallingScope
[1] TRUE
> tb[, v]
Error in `[.data.table`(tb, , v) :
j (the 2nd argument inside [...]) is a single symbol but column name 'v' is not found. Perhaps you
intended DT[, ..v]. This difference to data.frame is deliberate and explained in FAQ 1.1.
Am I confused?
> packageVersion('data.table')
[1] ‘1.14.2’
User response:
fr <- data.frame(a = 1, b = 1)
tb <- data.table::data.table(a = 1, b = 1)
v <- 'a'
subset(fr, , get(v))
#> a
#> 1 1
subset(tb, , get(v))
#> a
#> 1: 1
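If rows need to be filtered too, the same call takes a row condition as its second argument. Here is a minimal sketch, assuming the character vector is passed straight to the `select` argument (so `get()` isn't needed) and using made-up two-row data; the data.table part only runs if the package is installed:

```r
fr <- data.frame(a = 1:2, b = 3:4)
v <- "a"

# Row condition goes in the 2nd argument, column names (as strings) in `select`.
res_fr <- subset(fr, b > 3, select = v)

# The same call on a data.table dispatches to data.table's own subset method.
if (requireNamespace("data.table", quietly = TRUE)) {
  tb <- data.table::data.table(a = 1:2, b = 3:4)
  res_tb <- subset(tb, b > 3, select = v)
}
```

One caveat: `select` is evaluated with the column names in scope, so a variable that shares its name with a column would be resolved to that column rather than to the variable's contents.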
User response:
.subset2 accepts a string:
.subset2(tb, v)
.subset2(fr, v)
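Since `.subset2()` (like `[[`) returns the column as a plain vector, a row index can be applied to the result afterwards. A small sketch with a hypothetical helper `col_at()`, which is not part of base R or data.table, and made-up data:

```r
# Hypothetical helper: pull one column by name, then index its rows.
col_at <- function(x, col, rows) {
  .subset2(x, col)[rows]
}

fr <- data.frame(a = c(10, 20, 30), b = 1:3)
res_fr <- col_at(fr, "a", 2:3)

# A data.table is also a list of columns, so the same helper applies.
if (requireNamespace("data.table", quietly = TRUE)) {
  tb <- data.table::data.table(a = c(10, 20, 30), b = 1:3)
  res_tb <- col_at(tb, "a", 2:3)
}
```

Note that this returns a vector rather than a one-column data frame or data.table, so it fits best when only the values are needed.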