include list elements in column R

Time:03-07

I would like to add the elements of a vector to a data.table as a new column, one value per group. A reproducible example is below. How can I achieve this?

library(data.table)
a <- c("A","A","A","A","B","B","C","C","C","D") 
b <- seq(1,10)

dt <- data.table(a,b)

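list <- c(15,10,9,120)  # one value per group, in the order A, B, C, D

# Desired result: each group's value repeated on every row of that group
dt <- data.table(a,b, c(15,15,15,15,10,10,9,9,9,120))
View(dt)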

CodePudding user response:

One way is to use match:

library(data.table)

dt[, new := list[match(a, unique(a))]]
dt

#    a  b new
# 1: A  1  15
# 2: A  2  15
# 3: A  3  15
# 4: A  4  15
# 5: B  5  10
# 6: B  6  10
# 7: C  7   9
# 8: C  8   9
# 9: C  9   9
#10: D 10 120
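
To see why this works, the indexing step can be run on its own in base R. This is a minimal sketch on the question's column; the name vals is hypothetical and stands in for the vector the question calls list.

```r
# Base R only: what match(a, unique(a)) computes for the example column.
a <- c("A","A","A","A","B","B","C","C","C","D")
vals <- c(15, 10, 9, 120)   # hypothetical name, used instead of `list`

# Position of each element of a among the distinct values of a,
# in order of first appearance.
idx <- match(a, unique(a))
idx
# 1 1 1 1 2 2 3 3 3 4

# Indexing vals by that position repeats each value across its group.
vals[idx]
# 15 15 15 15 10 10 9 9 9 120
```

Because the index is derived from the values of a rather than from their position, this approach does not require the rows of a group to be adjacent.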

CodePudding user response:

Another way, using rep and rle:

library(data.table)
a <- c("A","A","A","A","B","B","C","C","C","D") 
b <- seq(1,10)
dt <- data.table(a,b)
list <- c(15,10,9,120)

dt[, c := rep(list, rle(dt$a)$lengths)]

#    a  b   c
# 1: A  1  15
# 2: A  2  15
# 3: A  3  15
# 4: A  4  15
# 5: B  5  10
# 6: B  6  10
# 7: C  7   9
# 8: C  8   9
# 9: C  9   9
#10: D 10 120
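
One caveat with this approach: rle counts consecutive runs, not distinct groups, so it matches the question's data only because each group's rows are adjacent. The sketch below illustrates the difference in base R; a2 is a hypothetical vector that is not part of the question.

```r
# Run lengths for the question's column: one run per group.
a <- c("A","A","A","A","B","B","C","C","C","D")
runs <- rle(a)$lengths
runs
# 4 2 3 1

# Hypothetical input where group A appears in two separate runs.
a2 <- c("A","B","A")
runs2 <- rle(a2)$lengths
runs2
# 1 1 1   (three runs, although there are only two groups)

# A lookup by value still gives one value per group.
by_match <- c(15, 10)[match(a2, unique(a2))]
by_match
# 15 10 15
```

If the table might not be ordered by group, sorting it first or using a value-based lookup is probably the safer choice.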

CodePudding user response:

Maybe this one, which indexes by the group counter .GRP?

> dt[, new := list[.GRP],a][]
    a  b new
 1: A  1  15
 2: A  2  15
 3: A  3  15
 4: A  4  15
 5: B  5  10
 6: B  6  10
 7: C  7   9
 8: C  8   9
 9: C  9   9
10: D 10 120
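
For readers unfamiliar with .GRP: within a grouped data.table call it holds the 1-based number of the current group. A small sketch follows, using a hypothetical table dt2 and a hypothetical vector name vals rather than the question's objects.

```r
library(data.table)

# Hypothetical small table with three groups.
dt2 <- data.table(a = c("A","A","B","C","C"))
vals <- c(15, 10, 9)   # one value per group, in order of first appearance

# Store the group counter itself to make it visible.
dt2[, grp := .GRP, by = a]

# Index vals by the counter; the single value is recycled within each group.
dt2[, new := vals[.GRP], by = a]

dt2$grp
# 1 1 2 3 3
dt2$new
# 15 15 10 9 9
```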

CodePudding user response:

With rleid:

dt[,new:=list[rleid(a)]][]

         a     b    V3   new
    <char> <int> <num> <num>
 1:      A     1    15    15
 2:      A     2    15    15
 3:      A     3    15    15
 4:      A     4    15    15
 5:      B     5    10    10
 6:      B     6    10    10
 7:      C     7     9     9
 8:      C     8     9     9
 9:      C     9     9     9
10:      D    10   120   120

Note that list is the name of a base R function (it is not a reserved word, so the assignment is allowed). Try not to use it as a variable name, to avoid confusion.
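
The code in this thread still runs with a variable called list because R skips non-function objects when it looks up a name that is being called. A minimal base R sketch of that behaviour:

```r
# Bind the name `list` to a numeric vector, as the question does.
list <- c(15, 10, 9, 120)

# A call still resolves to the base function, because R looks for a
# function when the name is used in call position.
x <- list(1, 2)
is.list(x)
# TRUE

# Plain indexing uses the vector.
list[2]
# 10
```

It works, but a reader has to stop and work out which list is meant, so a name such as vals (hypothetical) is easier to follow.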

Performance comparison:

# Here l is assumed to hold the same values as list, i.e. l <- c(15, 10, 9, 120)
microbenchmark::microbenchmark(
  dt[, new := list[match(a, unique(a))]],
  dt[,new:=l[rleid(a)]],
  dt[,new:=l[.GRP],a],
  dt[, c := rep(list, rle(dt$a)$l)]
)
Unit: microseconds
                                       expr   min     lq    mean median     uq    max neval
 dt[, `:=`(new, list[match(a, unique(a))])] 186.0 200.65 269.513 223.90 298.15  700.3   100
               dt[, `:=`(new, l[rleid(a)])] 184.9 202.00 248.979 221.85 267.50  650.4   100
                dt[, `:=`(new, l[.GRP]), a] 347.4 365.70 466.369 396.35 540.40 1186.4   100
      dt[, `:=`(c, rep(list, rle(dt$a)$l))] 204.8 223.85 300.685 257.50 326.95  903.5   100
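
On a 10-row table these timings probably reflect call overhead more than the methods themselves, so it may be worth first confirming that all four approaches produce the same column. The sketch below does that on the question's data; vals is a hypothetical name for the value vector, and the r_* names are introduced only for illustration.

```r
library(data.table)

a <- c("A","A","A","A","B","B","C","C","C","D")
vals <- c(15, 10, 9, 120)   # hypothetical name, used instead of `list`
dt <- data.table(a, b = seq_along(a))

# Compute the new column with each of the four approaches.
r_match <- vals[match(dt$a, unique(dt$a))]
r_rleid <- vals[rleid(dt$a)]
r_rle   <- rep(vals, rle(dt$a)$lengths)
r_grp   <- copy(dt)[, new := vals[.GRP], by = a]$new   # copy() leaves dt unchanged

# TRUE when every approach yields exactly the same vector.
all_equal <- identical(r_match, r_rleid) &&
  identical(r_match, r_rle) &&
  identical(r_match, r_grp)
all_equal
```

The same expressions could then be passed to microbenchmark on a larger table to get timings that say more about the methods.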