I have two data.tables: dt_main is the main one, to which I need to append information of only one column from dt_add:
dt_main <- data.table(id = c(1:5)
, name = c("a", "b", NA,NA,NA)
, stuff = c(11:15))
dt_add <- data.table(id = c(4:5)
, name = c("aaa", "bbb"))
I got the job correctly done by first joining then coalescing:
dt_main_final <- dt_add[dt_main, on = "id"]
dt_main_final[, name := fcoalesce(name, i.name)][, i.name:=NULL]
The provided output is as expected:
id name stuff
1: 1 a 11
2: 2 b 12
3: 3 <NA> 13
4: 4 aaa 14
5: 5 bbb 15
I wander whether there is a more direct way to have it done, any suggestions? Thanks.
PS> I also tried melting then dcasting:
dt <- dt_add[dt_main, on = "id"]
setnames(dt, "i.name", "name")
dt_melt <- melt(dt
, measure.vars = patterns("name")
)
dt_main_final <- dcast(dt_melt
, id stuff ~ variable
, fun.aggregate = fcoalesce
, value.var = "value")
I got the error:
Error: Aggregating function(s) should take vector inputs and return a single value (length=1). However, function(s) returns length!=1. This value will have to be used to fill any missing combinations, and therefore must be length=1. Either override by setting the 'fill' argument explicitly or modify your function to handle this case appropriately.
Any ideas for this one also?
CodePudding user response:
We can do the join and coalesce in one step:
dt_main[dt_add, name := fcoalesce(name, i.name), on = .(id)]
dt_main
# id name stuff
# <int> <char> <int>
# 1: 1 a 11
# 2: 2 b 12
# 3: 3 <NA> 13
# 4: 4 aaa 14
# 5: 5 bbb 15