Consider the standard data.table syntax DT[i, j, ...]. Since .SD is only defined in j and is NULL in i, is there any way to implicitly (desired) or explicitly (via something like .SD) refer to the current data.table in a function in i?
Use Case
I would like to write a function that filters on standard columns. The column names are the same across multiple tables and somewhat verbose. To save typing, I would like to write a function like this:
library(data.table)
dt <- data.table(postal_code = c("USA123", "SPEEDO", "USA421"),
                 customer_name = c("Taylor", "Walker", "Thompson"))
dt
#> postal_code customer_name
#> 1: USA123 Taylor
#> 2: SPEEDO Walker
#> 3: USA421 Thompson
# Filter all customers with a common postal code prefix
# whose surname starts with specific letters
extract <- function(x, y, DT) {
  DT[, startsWith(postal_code, x) & startsWith(customer_name, y)]
}
# does not work
dt[extract("USA", "T", .SD)]
#> Error in .checkTypos(e, names_x): Object 'postal_code' not found.
#> Perhaps you intended postal_code
# works, but requires naming the data.table explicitly;
# as a further drawback, it cannot be applied to e.g. a grouped .SD
# in a nested call
dt[extract("USA", "T", dt)]
#> postal_code customer_name
#> 1: USA123 Taylor
#> 2: USA421 Thompson
Desired (pseudo code)
dt[extract("USA", "T")]
#> postal_code customer_name
#> 1: USA123 Taylor
#> 2: USA421 Thompson
# but also
# subsequent steps in j
dt[extract("USA", "T"), relevant := TRUE][]
#> postal_code customer_name relevant
#> 1: USA123 Taylor TRUE
#> 2: SPEEDO Walker NA
#> 3: USA421 Thompson TRUE
# using other data.tables
another_dt[extract("USA", "T")]
yet_another_dt[extract("USA", "T")]
CodePudding user response:
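I'm not a data.table expert, but you can try the following workaround:
dt[, .SD[extract("USA", "T", .SD)]]
#> postal_code customer_name
#> 1: USA123 Taylor
#> 2: USA421 Thompson
This works because .SD is defined in j: the function receives the current data.table there, and the logical vector it returns is used to subset .SD itself.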
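If the goal is to flag rows rather than to subset, the same idea of passing .SD from j might be combined with :=. This is only a sketch, reusing dt and extract() from the question; note that it fills relevant with TRUE/FALSE, not the TRUE/NA of the desired output, because no rows are filtered in i.

```r
library(data.table)

dt <- data.table(postal_code   = c("USA123", "SPEEDO", "USA421"),
                 customer_name = c("Taylor", "Walker", "Thompson"))

# extract() as defined in the question: returns a logical vector
extract <- function(x, y, DT) {
  DT[, startsWith(postal_code, x) & startsWith(customer_name, y)]
}

# .SD is available in j, so the current data.table can be passed from there;
# the logical vector is assigned directly instead of being used as a filter
dt[, relevant := extract("USA", "T", .SD)][]
```

The data.table is named only once, so the same call should carry over to another_dt or yet_another_dt as long as they have the same column names.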
CodePudding user response:
Here is a possible approach...
#create named vector
mystr <- c(postal_code = "USA", customer_name = "T")
#build query text
query <- paste0("grepl(\"^", mystr, "\", ", names(mystr), ")", collapse = " & ")
#eval/parse dynamic text
dt[eval(parse(text = query)), ]
# postal_code customer_name
# 1: USA123 Taylor
# 2: USA421 Thompson
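The same named vector can also be used without building and parsing code as text. The following is a sketch of that variant; filter_prefix() is a hypothetical helper name, not a data.table function, and it uses startsWith(), which matches the prefix literally instead of as a regular expression as grepl() does.

```r
library(data.table)

dt <- data.table(postal_code   = c("USA123", "SPEEDO", "USA421"),
                 customer_name = c("Taylor", "Walker", "Thompson"))

# named vector: column name -> required prefix
mystr <- c(postal_code = "USA", customer_name = "T")

# hypothetical helper: keep rows where every named column starts with its prefix
filter_prefix <- function(DT, prefixes) {
  # one logical vector per column
  hits <- Map(function(col, prefix) startsWith(DT[[col]], prefix),
              names(prefixes), prefixes)
  # combine the per-column results with AND and subset the rows
  DT[Reduce(`&`, hits)]
}

res <- filter_prefix(dt, mystr)
res
```

The data.table still has to be passed explicitly, so this does not answer the question of referring to it implicitly in i; it only avoids eval(parse()).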