I have the following data
library(data.table)
x <- data.table(id = c(rep("123", 6), rep("234", 4)),
value = c(1,6,5,10,3,2, 4,6,10,5))
x[, break_flag := ifelse(value == 10, 1, 0)]
I want to count the rows above and below the break flag by id to get something like this:
I'm using x[, line_count := .N, by = .(id, break_flag)]
but its counting all rows by id that are not equal to one.
How can I just count the rows either side of the break (with the break bring one row)?
Thanks
CodePudding user response:
This is how I'd do it in dplyr
:
x %>%
group_by(id) %>%
mutate(diff = row_number() - which(break_flag == 1),
gp = ifelse(diff < 0, "bef", ifelse(diff == 0, "in", "aft"))) %>%
group_by(id, gp) %>%
mutate(line_count = ifelse(gp == "in", 1, max(abs(diff)))) %>%
ungroup() %>%
select(-c(diff, gp))
output
id value break_flag line_count
1 123 1 0 3
2 123 6 0 3
3 123 5 0 3
4 123 10 1 1
5 123 3 0 2
6 123 2 0 2
7 234 4 0 2
8 234 6 0 2
9 234 10 1 1
10 234 5 0 1