I have some data, like so
library(data.table)
data1 <- data.table(
id = c(seq(1:5)),
date = c(as.Date("2021-05-05"),as.Date("2021-05-04"),as.Date("2021-05-03"),as.Date("2021-05-02"),as.Date("2021-05-01")),
var = c(rep("var1",5))
)
data2 <- data.table(
id = c(seq(1:5)),
date = c(as.Date("2021-05-05"),as.Date("2020-05-04"),as.Date("2020-05-03"),as.Date("2020-05-02"),as.Date("2020-05-01")),
var = c(rep("var2",5))
)
data3 <- data.table(
id = c(seq(1:5)),
date = c(as.Date("2021-05-05"),as.Date("2020-07-04"),as.Date("2020-07-03"),as.Date("2020-07-02"),as.Date("2020-07-01")),
var = c(rep("var3",5))
)
data4 <- data.table(
id = c(seq(1:5)),
date = c(as.Date("2021-05-05"),as.Date("2020-07-04"),as.Date("2020-07-03"),as.Date("2020-07-02"),as.Date("2020-07-01")),
var = c(rep("var4",5))
)
data <- rbind(data1,data2,data3,data4)
data <- data[order(id)]
I would like to find and keep those unique IDs (including keeping the dates), where date for var1, var2, and var3 is equal to one another. I dont care about var4. In the above data, only id 1 would match these criteria, so I want to end up with something like this:
data_final <- data.table(
id = 1,
date_var1 = as.Date("2021-05-05"),
date_var2 = as.Date("2021-05-05"),
date_var3 = as.Date("2021-05-05")
)
CodePudding user response:
Use dcast
to reshape to wide format after filtering - grouped by 'id', filter the 'id's having all values in 'date' are same. Then do the dcast
on the filtered data to reshape back to wide format
data1 <- data[data[, .I[uniqueN(date[var %in% paste0("var", 1:3)]) == 1],id]$V1
][var != 'var4']
out1 <- dcast(data1, id ~ paste0('date_', var), value.var = 'date')
-output
> out1
id date_var1 date_var2 date_var3
1: 1 2021-05-05 2021-05-05 2021-05-05