Here is some example data.
df <- structure(list(v1 = c(1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0,
1, 1, 1, 1, 0, 1, 0, 0, 1), v2 = c(1, 0, 1, 1, 0, 1, 0, 1, 0,
1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1), flag = c(NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA)), class = "data.frame", row.names = c(NA, -22L))
I would like to code the variable "flag" so that when v1 = 0 in one row and v2 = 0 in the next row, both rows get a 'flag' in the flag column. A flag that has already been placed cannot be changed (e.g., row 5 would not be flagged on its own, but it is already flagged as the second row of the pair that starts at row 4).
Here is the desired dataframe.
df2 <- structure(list(v1 = c(1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0,
1, 1, 1, 1, 0, 1, 0, 0, 1), v2 = c(1, 0, 1, 1, 0, 1, 0, 1, 0,
1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1), flag = structure(c(NA,
NA, NA, 1L, 1L, 1L, 1L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 1L, 1L, NA), .Label = "flag", class = "factor")), class = "data.frame", row.names = c(NA,
-22L))
I have started with the code below, which matches the conditions I would like, but only changes the row matching the v1 condition, not both.
library(dplyr)
df2 <- df %>%
  mutate(flag = case_when(v1 == 0 & lead(v2) == 0 ~ 'flag'))
This is a very simplified version of my true data. I know there are options other than case_when, but I would really like to use the case_when function for this (I would also be open to using ifelse).
CodePudding user response:
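library(tidyverse)
df %>%
  # f marks the first row of each pair: v1 is 0 here and v2 is 0 in the next row
  mutate(f = v1 == 0 & lead(v2) == 0,
         # flag a row if it starts a pair (f) or directly follows one that does (lag(f))
         flag = ifelse(f | lag(f), 'flag', NA),
         f = NULL)  # drop the helper column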
   v1 v2 flag
1   1  1 <NA>
2   1  0 <NA>
3   0  1 <NA>
4   0  1 flag
5   0  0 flag
6   0  1 flag
7   1  0 flag
8   1  1 <NA>
9   0  0 <NA>
10  1  1 <NA>
11  0  0 <NA>
12  1  1 <NA>
13  0  0 <NA>
14  1  1 <NA>
15  1  0 <NA>
16  1  0 <NA>
17  1  0 <NA>
18  0  1 <NA>
19  1  1 <NA>
20  0  1 flag
21  0  0 flag
22  1  1 <NA>
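Since the question asks for case_when specifically, here is a minimal sketch of the same idea written with two case_when branches instead of a helper column. It assumes the pairing rule is exactly "v1 is 0 in one row and v2 is 0 in the next row"; the name `flagged` is only illustrative, and the data frame is rebuilt here so the snippet runs on its own.

```r
library(dplyr)

# Same v1/v2 values as the example data in the question
df <- data.frame(
  v1 = c(1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1),
  v2 = c(1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1)
)

flagged <- df %>%
  mutate(
    flag = case_when(
      # first row of a pair: v1 is 0 here and v2 is 0 in the next row
      v1 == 0 & lead(v2) == 0 ~ "flag",
      # second row of a pair: v1 was 0 in the previous row and v2 is 0 here
      lag(v1) == 0 & v2 == 0 ~ "flag"
      # rows matching neither branch are left as NA by case_when
    )
  )

# Row numbers that received a flag
which(!is.na(flagged$flag))
```

The second branch is the first condition shifted down by one row, so it should flag the same rows as the `f | lag(f)` version. A row that matches both branches (the second row of one pair and the first row of the next) simply gets the single value 'flag', which is consistent with the requirement that an existing flag is not changed.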