I have a table in R (see my table here) that records two observations per day: observ1 and observ2. I would like to add a third column called 'check'. The check column should be TRUE when observ1 equals 1 and, 5 to 10 days later, observ2 also equals 1.
As you can see in the table, the check value on row 14 is TRUE. The reason is that observ1 was 1 on row 6 and then, after 9 days, observ2 was also 1.
I do not know how to code this in R to produce the 'check' column. I would appreciate any assistance!
CodePudding user response:
Linking to a picture of a table is not considered a good way to ask a question. Most posters run dput() on their data.frame and paste the console output into the question, in the format I have used below (see Data), so that others can reproduce the sample. It is good practice to do this in future questions. At any rate, I hope this solution helps:
Base R solution:
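df1$check <- with(
  df1,
  vapply(
    seq_along(observ2),
    function(i){
      # The first 5 rows have no row at least 5 days earlier to look back on
      if(i - 5 <= 0){
        NA
      }else{
        # Look-back window: from 10 rows before row i (clipped at row 1) to 5 rows before
        ir <- max(i-10, 1)
        ir2 <- (any(observ1[ir:(i-5)] == 1) & observ2[i] == 1)
        # TRUE when the condition holds, NA otherwise
        ifelse(ir2, ir2, NA)
      }
    },
    logical(1)
  )
)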
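The same look-back logic can also be wrapped in a small function so the 5 and 10 day bounds are easy to change. This is only a sketch: the name check_window and the arguments min_lag and max_lag are made up for illustration, and it assumes one row per consecutive day with no missing values. The sample data is rebuilt as a plain data.frame so the block runs on its own.

```r
# Sample data from this answer, rebuilt as a plain data.frame
df1 <- data.frame(
  day = 1:20,
  observ1 = c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
  observ2 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1)
)

# Hypothetical helper: TRUE where observ2 is 1 and observ1 was 1
# between min_lag and max_lag rows earlier; NA everywhere else.
# Assumes rows are consecutive days and the inputs contain no NA.
check_window <- function(observ1, observ2, min_lag = 5, max_lag = 10) {
  vapply(seq_along(observ2), function(i) {
    # Not enough history yet to look min_lag rows back
    if (i <= min_lag) return(NA)
    # Rows from max_lag back (clipped at row 1) to min_lag back
    window <- max(i - max_lag, 1):(i - min_lag)
    hit <- any(observ1[window] == 1) && observ2[i] == 1
    if (hit) TRUE else NA
  }, logical(1))
}

df1$check <- check_window(df1$observ1, df1$observ2)

# Rows flagged TRUE
which(df1$check)
```

On this sample, rows 14 and 20 are flagged: row 14 because observ1 was 1 on row 6, and row 20 because observ1 was 1 on row 14. Every other row is NA rather than FALSE, matching the behaviour of the code above.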
Data:
df1 <- structure(list(day = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), observ1 = c(1, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), observ2 = c(0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -20L))