Make the leading column value NA if condition is met using R

I have a data frame like this:

structure(list(id = c(15305, 15305, 15305, 6224, 6224), transfer = c(0, 
1, 0, 1, 0), hosp = c(2182, 2452, 2846, 1474, 1476), out = c(2183, 
NA, 2857, NA, 1486), Insti = c(NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-5L))

I want to insert NA in the "hosp" column of a row if, in the preceding row, the "out" and "Insti" columns are both NA and the "transfer" column == 1. I want the df to look like this:

structure(list(id2 = c(15305, 15305, 15305, 6224, 6224), transfer2 = c(0, 
1, 0, 1, 0), hosp2 = c(2182, 2452, NA, 1474, NA), out2 = c(2183, 
NA, 2857, NA, 1486), Insti2 = c(NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-5L))

CodePudding user response：

You can use the following solution:

library(dplyr)

df %>%
  mutate(hosp = case_when(
    is.na(lag(out)) & is.na(lag(Insti)) & lag(transfer) == 1 ~ NA_real_,
    TRUE ~ hosp
  ))

     id transfer hosp  out Insti
1 15305        0 2182 2183    NA
2 15305        1 2452   NA    NA
3 15305        0   NA 2857    NA
4  6224        1 1474   NA    NA
5  6224        0   NA 1486    NA
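Note that lag() above looks at the previous row regardless of id. If a row should only look back within its own id (an assumption, the question does not say so), a grouped variant could be sketched like this. The data frame is rebuilt here so the snippet runs on its own, and `res` is just an illustrative name:

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  id       = c(15305, 15305, 15305, 6224, 6224),
  transfer = c(0, 1, 0, 1, 0),
  hosp     = c(2182, 2452, 2846, 1474, 1476),
  out      = c(2183, NA, 2857, NA, 1486),
  Insti    = c(NA, NA, NA, NA, NA)
)

res <- df %>%
  group_by(id) %>%  # assumption: lag must not cross from one id to the next
  mutate(hosp = case_when(
    # previous row (within the same id) has no out, no Insti, and transfer == 1
    is.na(lag(out)) & is.na(lag(Insti)) & lag(transfer) == 1 ~ NA_real_,
    # the first row of each id has an NA condition and falls through to here
    TRUE ~ hosp
  )) %>%
  ungroup()

res$hosp
```

For this sample the result is the same as the ungrouped version, because no transfer row happens to be the last row of its id.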

CodePudding user response：

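To get the "lag" in base R, you can remove the last value and add NA as the first value. Here is a base R solution using ifelse. The third condition has to test the lagged "transfer" column; comparing with %in% 1 instead of == 1 makes it FALSE (not NA) for the first row, which has no previous row, so its hosp value is kept.

transform(df,
          hosp=ifelse(is.na(c(NA, out[-nrow(df)])) & is.na(c(NA, Insti[-nrow(df)])) & 
                        c(NA, transfer[-nrow(df)]) %in% 1, NA,  hosp))
#      id transfer hosp  out Insti
# 1 15305        0 2182 2183    NA
# 2 15305        1 2452   NA    NA
# 3 15305        0   NA 2857    NA
# 4  6224        1 1474   NA    NA
# 5  6224        0   NA 1486    NA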
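The "drop the last value, prepend NA" idiom is repeated three times above, so it may be easier to read when wrapped in a small helper. This is only a sketch: `lag1` is a hypothetical name and not part of base R, and the data frame is rebuilt so the snippet runs on its own.

```r
# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(x) c(NA, head(x, -1))

# Same data as in the question
df <- data.frame(
  id       = c(15305, 15305, 15305, 6224, 6224),
  transfer = c(0, 1, 0, 1, 0),
  hosp     = c(2182, 2452, 2846, 1474, 1476),
  out      = c(2183, NA, 2857, NA, 1486),
  Insti    = c(NA, NA, NA, NA, NA)
)

# TRUE where the previous row has NA out, NA Insti, and transfer == 1;
# %in% returns FALSE (never NA) for the first row, which has no previous row
cond <- is.na(lag1(df$out)) & is.na(lag1(df$Insti)) & lag1(df$transfer) %in% 1

# Blank out hosp only on the rows that satisfy the condition
df$hosp[cond] <- NA

df$hosp
# [1] 2182 2452   NA 1474   NA
```

Assigning through a logical index avoids ifelse altogether and leaves every other row untouched.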