I have a table like follows:
library(data.table)
dt <- data.table(t0.id=1:2,t0.V1=c("a","e"),t0.V2=c("b","f"),t1.id=3:4,t1.V1=c("c","g"),t1.V2=c("d","h"))
dt
t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
1: 1 a b 3 c d
2: 2 e f 4 g h
and I want to copy data from row one to a new row, like this:
t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
1: 1 a b
2: 3 c d
3: 2 e f 4 g h
I know how to duplicate the row (I've seen this post), but I don't know how to clear columns by condition (eg. t0.id==1) since both rows would be equal.
I guess it could be done by row index, but my real table has thousands of rows and I don't think that's the best way to do it.
Thank you
EDIT:
- The final order of rows doesn't matter, that is, final rows no 1 and 2 don't need to be next to each other.
- I 'manually' (by looking at some variables) identify which rows need to be split. So, the only condition to be applied is based on 't0.id'.
CodePudding user response:
library(data.table)
splitids <- 1L # t0.id
out <- rbindlist(list(
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t0")],
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t1")],
dt[!t0.id %in% splitids,]),
use.names = TRUE, fill = TRUE)
out
# t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
# <int> <char> <char> <int> <char> <char>
# 1: 1 a b NA <NA> <NA>
# 2: NA <NA> <NA> 3 c d
# 3: 2 e f 4 g h
It may make more sense if you look at each of them:
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t0")]
# t0.id t0.V1 t0.V2
# <int> <char> <char>
# 1: 1 a b
dt[t0.id %in% splitids, .SD, .SDcols = patterns("^t1")]
# t1.id t1.V1 t1.V2
# <int> <char> <char>
# 1: 3 c d
dt[!t0.id %in% splitids,]
# t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
# <int> <char> <char> <int> <char> <char>
# 1: 2 e f 4 g h
If you need the blank ""
instead of NA
, then that can be done to your character
columns, but not to the t*.id
columns since that would convert them from integer
to character
.
ischr <- which(sapply(dt, inherits, "character"))
ischr
# t0.V1 t0.V2 t1.V1 t1.V2
# 2 3 5 6
out[, (ischr) := lapply(.SD, fcoalesce, ""), .SDcols = ischr][]
# t0.id t0.V1 t0.V2 t1.id t1.V1 t1.V2
# <int> <char> <char> <int> <char> <char>
# 1: 1 a b NA
# 2: NA 3 c d
# 3: 2 e f 4 g h