Get elements randomly retrieve from another table B with data table-CodePudding

I have a table A of data.table type which contains a set of elements, I would like to add a new mass column to this table, randomly retrieve from another table B. I would like to do this with the data.table package, but I don't know how to optimze it.

Table A

element
----------
silver
chlorine
silver
chlorine
chlorine
chlorine
silver

Table B

mass    element
------ 
0.3     silver
0.5     silver
1.6     silver
1.2     chlorine
5.3     chlorine
0.1     chlorine

Code to build tables:

tableA <- data.table(
  element = c("silver","chlorine","silver", "chlorine","chlorine","chlorine","silver")
)

tableB <- data.table(

  mass = c(0.3,0.5,1.6,1.2,5.3,0.1),
  element = c("silver","silver","silver", "chlorine","chlorine","chlorine")
)

The volume of data being very important I would like to use on the data.table package

Expected result:

    element     mass
-----------------------
    silver      1.6
    chlorine    5.3
    silver      0.3
    chlorine    1.2
    chlorine    1.2
    chlorine    0.1
    silver      1.6

This code returns me an error that can't fix, but do you think the approach is good and optimized?

tableA[, mass := tableB[sample(mass), on = .(element)]$mass]

CodePudding user response：

Here's one way,

set.seed(42)
tableA[tableB[, list(mass = list(mass)), by = element], mass := sapply(i.mass, sample, size = 1), on = .(element)]
tableA
#     element  mass
#      <char> <num>
# 1:   silver   0.3
# 2: chlorine   1.2
# 3:   silver   0.3
# 4: chlorine   5.3
# 5: chlorine   5.3
# 6: chlorine   5.3
# 7:   silver   0.3

CodePudding user response：

Here is an alternative data.table approach, if you are interested:

tableA[, mass:=sample(tableB[element==.BY$element,mass], .N, replace=T), by=element]

Output:

    element mass
1:   silver  0.3
2: chlorine  1.2
3:   silver  1.6
4: chlorine  5.3
5: chlorine  5.3
6: chlorine  5.3
7:   silver  0.5