I would like to create a binary variable that equals 1 for every observation from a row where start = "S" through the next row where end = "E", then 0 until the next start = "S" appears, then 1 again until the next end = "E", and so on (see attached). Is there a function in R that can help me?
CodePudding user response:
How about something like this:
library(tidyverse)
df <- tribble(
  ~x, ~start, ~end,
  1,  NA,  NA,
  2,  NA,  NA,
  3,  "S", NA,
  4,  NA,  NA,
  5,  NA,  NA,
  6,  NA,  "E",
  7,  NA,  NA,
  8,  NA,  NA,
  9,  NA,  NA,
  10, "S", NA,
  11, NA,  NA,
  12, NA,  "E")
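df %>%
  # Mark rows where an interval opens (NA where start is missing)
  mutate(start1 = ifelse(start == "S", 1, 0),
         # Mark the row after each "E", so the "E" row itself stays inside the interval
         end1 = ifelse(lag(end) == "E", 1, 0)) %>%
  # Rows without a marker become 0 instead of NA
  replace_na(list(start1 = 0, end1 = 0)) %>%
  # +1 at each start, -1 after each end; the running sum is the 0/1 indicator
  mutate(dif = start1 - end1,
         indicator = cumsum(dif)) %>%
  select(x, start, end, indicator)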
Which gives you:
# A tibble: 12 x 4
       x start end   indicator
   <dbl> <chr> <chr>     <dbl>
 1     1 NA    NA            0
 2     2 NA    NA            0
 3     3 S     NA            1
 4     4 NA    NA            1
 5     5 NA    NA            1
 6     6 NA    E             1
 7     7 NA    NA            0
 8     8 NA    NA            0
 9     9 NA    NA            0
10    10 S     NA            1
11    11 NA    NA            1
12    12 NA    E             1
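If you would rather avoid the tidyverse, the same running-sum idea can be sketched in base R. This is only a minimal sketch: the vector names opened and closed are made up for illustration, and it assumes the markers strictly alternate (every "S" is closed by an "E" before the next "S"), as in the example data.

```r
# Same example data as plain vectors (assumption: S and E strictly alternate)
start <- c(NA, NA, "S", NA, NA, NA, NA, NA, NA, "S", NA, NA)
end   <- c(NA, NA, NA, NA, NA, "E", NA, NA, NA, NA, NA, "E")

# %in% returns FALSE for NA, so no separate NA handling is needed
opened <- cumsum(start %in% "S")                     # intervals opened up to and including this row
closed <- cumsum(c(FALSE, head(end %in% "E", -1)))   # intervals closed strictly before this row

# Inside an interval exactly when one more has been opened than closed
indicator <- opened - closed
print(indicator)
```

Shifting the "E" flags down by one row plays the same role as lag(end) above: the row carrying the "E" still counts as inside the interval.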