Home > OS >  How can I use dcast based on multiple columns?
How can I use dcast based on multiple columns?

Time:12-04

I have data on this form in a data.table DT:

DT = data.table(
 year=c('1981', '1981', '1981', '2005', '2005', '2005'),
    value=c(2, 8, 16, 3, 9, 27),
    order =c(1,2,3,1,2,3))
year value order
'1981' 2 1
'1981' 8 2
'1981' 16 3
'2005' 3 1
'2005' 9 2
'2005' 27 3

And I want to create new columns based first on the order within a specific year, but then sequentially on the order if I shift it. As you can see value=16 which starts as order=3 on row 1, is logged as order = 2 on row 2, etc.

year order1 order2 order3
'1981' 2 8 16
'1981' 8 16 NA
'1981' 16 NA NA
'2005' 3 9 27
'2005' 9 27 NA
'2005' 27 NA NA

If I wanted it just by order, and get rows 1 and 4 as output, I could do:

dcast(DT, year ~ order, value.var = c('value'))

But how can I cast based on order while incorporating this reordering?

I could perhaps create new columns indicating the new shifted order, using:

DT[,order_2:= c(NA,1,2,NA,1,2)]
DT[,order_3:= c(NA,NA,1,NA,NA,1)]

But then how do I do casting on all three columns? Is there a more elegant way than just casting 3 times and then joining the results?

CodePudding user response:

You don't necessarily need dcast, try this:

DT[, lapply(seq_along(value), \(v) {l <- length(value); `length<-`(value[v:l], l)}), by=year]
#    year V1 V2 V3
# 1: 1981  2  8 16
# 2: 1981  8 16 NA
# 3: 1981 16 NA NA
# 4: 2005  3  9 27
# 5: 2005  9 27 NA
# 6: 2005 27 NA NA

CodePudding user response:

We could use shift with transpose

library(data.table)
DT[, setNames(transpose(shift(value, n = seq_len(.N)-1, type = "lead")), 
      paste0("order", order)), year]

-output

   year order1 order2 order3
1: 1981      2      8     16
2: 1981      8     16     NA
3: 1981     16     NA     NA
4: 2005      3      9     27
5: 2005      9     27     NA
6: 2005     27     NA     NA
  • Related