I have a data frame that looks similar to this:
set.seed(42)
start <- Sys.Date() + sort(sample(1:10, 5))
set.seed(43)
end <- Sys.Date() + sort(sample(1:10, 5))
end[4] <- NA
A <- c("10", "15", "NA", "4", "NA")
B <- rpois(n = 5, lambda = 10)
df <- data.frame(start, end, A, B)
When there is an NA in column A, I would like to calculate the hours between start and end and store the result in A. Nothing should happen when either start or end is NA.
I tried something like this:
df[, df$A [is.na(df[, df$A])]] <- difftime(df$end, df$start, units = "hours")
but this gives me the error: undefined columns selected.
Does anyone have an idea? Thanks.
CodePudding user response:
Convert 'A' to numeric, create a logical index of the NA values in 'A', subset 'start' and 'end' with that index, compute the difftime, and assign the result back:
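# the "NA" strings become real NA values (as.numeric() warns about the coercion)
df$A <- as.numeric(df$A)
i1 <- is.na(df$A)
# end minus start, in hours; a missing start or end leaves A as NA
df$A[i1] <- with(df, as.numeric(difftime(end[i1], start[i1], units = "hours")))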
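To check the idea on data where the result is known in advance, here is a minimal sketch using fixed, made-up dates (not the random data from the question). It assumes the wanted value is end minus start, and it adds the NA check on start and end to the index explicitly, which is optional because difftime() would return NA for those rows anyway.

```r
# Hypothetical fixed dates so the result is reproducible
df <- data.frame(
  start = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05")),
  end   = as.Date(c("2024-01-02", "2024-01-04", "2024-01-05", "2024-01-06", NA)),
  A     = c("10", "15", "NA", "4", "NA")
)

# Turn the "NA" strings into real NA values; the coercion warning is expected
df$A <- suppressWarnings(as.numeric(df$A))

# Fill only rows where A is missing and both dates are present
i1 <- is.na(df$A) & !is.na(df$start) & !is.na(df$end)
df$A[i1] <- as.numeric(difftime(df$end[i1], df$start[i1], units = "hours"))

df$A  # 10 15 48 4 NA
```

Row 3 is filled with 48 hours (two days), while row 5 keeps its NA because end is missing there.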