Home > Blockchain >  Count rows above/below a flag/break
Count rows above/below a flag/break

Time:09-29

I have the following data

library(data.table)
x <- data.table(id = c(rep("123", 6), rep("234", 4)),
                value = c(1,6,5,10,3,2, 4,6,10,5))
x[, break_flag := ifelse(value == 10, 1, 0)]

I want to count the rows above and below the break flag by id to get something like this: enter image description here

I'm using x[, line_count := .N, by = .(id, break_flag)] but its counting all rows by id that are not equal to one.

How can I just count the rows either side of the break (with the break bring one row)?

Thanks

CodePudding user response:

This is how I'd do it in dplyr:

x %>% 
  group_by(id) %>%
  mutate(diff = row_number() - which(break_flag == 1),
         gp = ifelse(diff < 0, "bef", ifelse(diff == 0, "in", "aft"))) %>% 
  group_by(id, gp) %>% 
  mutate(line_count = ifelse(gp == "in", 1, max(abs(diff)))) %>% 
  ungroup() %>% 
  select(-c(diff, gp))

output

   id    value break_flag line_count
 1 123       1          0          3
 2 123       6          0          3
 3 123       5          0          3
 4 123      10          1          1
 5 123       3          0          2
 6 123       2          0          2
 7 234       4          0          2
 8 234       6          0          2
 9 234      10          1          1
10 234       5          0          1
  •  Tags:  
  • r
  • Related