Effective appending data to existing data.table by joining and coalescing-CodePudding

I have two data.tables: dt_main is the main one, to which I need to append information of only one column from dt_add:

dt_main <- data.table(id    = c(1:5)
                  , name = c("a", "b", NA,NA,NA)
                  , stuff = c(11:15))

dt_add <- data.table(id = c(4:5)
                  , name = c("aaa", "bbb"))

I got the job correctly done by first joining then coalescing:

dt_main_final <- dt_add[dt_main, on = "id"]

dt_main_final[, name := fcoalesce(name, i.name)][, i.name:=NULL]

The provided output is as expected:

   id name stuff
1:  1    a    11
2:  2    b    12
3:  3 <NA>    13
4:  4  aaa    14
5:  5  bbb    15

I wander whether there is a more direct way to have it done, any suggestions? Thanks.

PS> I also tried melting then dcasting:

dt <- dt_add[dt_main, on = "id"]
setnames(dt, "i.name", "name")

dt_melt <- melt(dt
                , measure.vars = patterns("name")
                )

dt_main_final <- dcast(dt_melt
                      , id   stuff ~ variable
                      , fun.aggregate =  fcoalesce                      
                      , value.var = "value")

I got the error:

Error: Aggregating function(s) should take vector inputs and return a single value (length=1). However, function(s) returns length!=1. This value will have to be used to fill any missing combinations, and therefore must be length=1. Either override by setting the 'fill' argument explicitly or modify your function to handle this case appropriately.

Any ideas for this one also?

CodePudding user response：

We can do the join and coalesce in one step:

dt_main[dt_add, name := fcoalesce(name, i.name), on = .(id)]
dt_main
#       id   name stuff
#    <int> <char> <int>
# 1:     1      a    11
# 2:     2      b    12
# 3:     3   <NA>    13
# 4:     4    aaa    14
# 5:     5    bbb    15