I have a data frame such as:
structure(list(id = c(15305, 15305, 15305, 6224, 6224), transfer = c(0,
1, 0, 1, 0), hosp = c(2182, 2452, 2846, 1474, 1476), out = c(2183,
NA, 2857, NA, 1486), Insti = c(NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA,
-5L))
I want to insert NA in the "hosp" column of a row if, in the previous row, both "out" and "Insti" are NA AND "transfer" == 1. I want the df to look like this:
structure(list(id2 = c(15305, 15305, 15305, 6224, 6224), transfer2 = c(0,
1, 0, 1, 0), hosp2 = c(2182, 2452, NA, 1474, NA), out2 = c(2183,
NA, 2857, NA, 1486), Insti2 = c(NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA,
-5L))
CodePudding user response:
You can use the following solution:
library(dplyr)
df %>%
  mutate(hosp = case_when(
    is.na(lag(out)) & is.na(lag(Insti)) & lag(transfer) == 1 ~ NA_real_,
    TRUE ~ hosp
  ))
     id transfer hosp  out Insti
1 15305        0 2182 2183    NA
2 15305        1 2452   NA    NA
3 15305        0   NA 2857    NA
4  6224        1 1474   NA    NA
5  6224        0   NA 1486    NA
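Note that lag() here looks at the previous row regardless of "id". If the previous row should only count when it belongs to the same id (an assumption, since the question does not say so), a grouped variant could look like the sketch below. The df is rebuilt from the question's data so the snippet runs on its own, and `res` is just an illustrative name.

```r
library(dplyr)

# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  id       = c(15305, 15305, 15305, 6224, 6224),
  transfer = c(0, 1, 0, 1, 0),
  hosp     = c(2182, 2452, 2846, 1474, 1476),
  out      = c(2183, NA, 2857, NA, 1486),
  Insti    = NA
)

res <- df %>%
  group_by(id) %>%                      # assumption: lag only within the same id
  mutate(hosp = case_when(
    # default = 0 makes the first row of each group compare as FALSE, not NA
    is.na(lag(out)) & is.na(lag(Insti)) & lag(transfer, default = 0) == 1 ~ NA_real_,
    TRUE ~ hosp
  )) %>%
  ungroup()

res$hosp
```

On the sample data this gives the same "hosp" values as the ungrouped version, because no id starts right after a row with transfer == 1 that would otherwise match.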
CodePudding user response:
To get the "lag" you can drop the last value and prepend NA as the first value. Here is a base R solution using ifelse.
# %in% treats the NA in the first lagged position as FALSE, so row 1 keeps its value
transform(df,
          hosp=ifelse(is.na(c(NA, out[-nrow(df)])) & is.na(c(NA, Insti[-nrow(df)])) &
                        c(NA, transfer[-nrow(df)]) %in% 1, NA, hosp))
#      id transfer hosp  out Insti
# 1 15305        0 2182 2183    NA
# 2 15305        1 2452   NA    NA
# 3 15305        0   NA 2857    NA
# 4  6224        1 1474   NA    NA
# 5  6224        0   NA 1486    NA
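As a minimal sketch of the "drop the last value, prepend NA" idea, the shift can be wrapped in a small helper. The name `lag1` is hypothetical (not a base R function), and the df is rebuilt from the question's data. The sketch also shows why the lagged "transfer" is tested with %in% rather than ==: the prepended NA makes == return NA for the first row, and ifelse() would pass that NA through to "hosp".

```r
# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(x) c(NA, x[-length(x)])

# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  id       = c(15305, 15305, 15305, 6224, 6224),
  transfer = c(0, 1, 0, 1, 0),
  hosp     = c(2182, 2452, 2846, 1474, 1476),
  out      = c(2183, NA, 2857, NA, 1486),
  Insti    = NA
)

prev_transfer <- lag1(df$transfer)
eq_test <- prev_transfer == 1     # first element is NA
in_test <- prev_transfer %in% 1   # first element is FALSE

# Condition on the previous row: out and Insti both NA, transfer equal to 1
cond <- is.na(lag1(df$out)) & is.na(lag1(df$Insti)) & in_test

# Logical indexing instead of ifelse(); cond contains no NA, so this is safe
df$hosp[cond] <- NA
df
```

Assigning through a logical index is a design choice here, not a requirement; ifelse() works equally well once the condition is free of NA.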