Count events based on transitions between two variables in R

Time:03-09

I am trying to create an event counter that counts events based on transitions in two variables, var1 and var2. In the hypothetical data frame below, the expected behaviour is shown in the event column.

dt <- data.frame(var1 = c(-1, 1),
                 var2 = c('A', 'B', 'A', 'B'),
                 event = c(1, 1, 2, 2))

With the above data frame, using match, unique and group_by does not give the desired behaviour.

library(dplyr)
dt %>%
  group_by(var2) %>%
  mutate(event2 = match(var1, unique(var1)))

However, when I create a data frame with three repetitions of each var1 and var2 value:

dt2 <- data.frame(var1 = c(rep(-1, 3), rep(1, 3), rep(1, 3), rep(-1, 3)),
                  var2 = c(rep('A', 3), rep('B', 3), rep('A', 3), rep('B', 3)),
                  event = c(rep(1, 6), rep(2, 6)))

using match, unique and group_by reproduces the desired behaviour. What is the reason for this? And is there a way to create an event counter (or id) that identifies unique instances of 1 and -1 in var1 along with unique instances of A and B in var2, and increments irrespective of how many times the values 1, -1 and A, B are repeated in var1 and var2 respectively?

CodePudding user response:

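You can use cumsum and lag. The counter increases by one each time var2 changes to "A" from a different value, so the length of each run does not matter.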

library(dplyr)
dt2 %>% 
  mutate(event = cumsum(var2 == "A" & lag(var2, default = "B") != "A"))

   var1 var2 event
1    -1    A     1
2    -1    A     1
3    -1    A     1
4     1    B     1
5     1    B     1
6     1    B     1
7     1    A     2
8     1    A     2
9     1    A     2
10   -1    B     2
11   -1    B     2
12   -1    B     2
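
On the "what is the reason" part, here is a rough check, assuming dt was run exactly as posted. data.frame() recycles the two-element var1 to -1, 1, -1, 1, so within each var2 group var1 never changes and match() has nothing to increment on. In dt2, var1 does change within each group. The sketch below uses base R ave() as a stand-in for group_by() + mutate() so it runs without dplyr. The helper name first_seen and the stringsAsFactors = FALSE argument are my own additions, not from the question.

```r
# dt as posted: var1 has two values, so data.frame() recycles it to length 4
dt <- data.frame(var1 = c(-1, 1),
                 var2 = c("A", "B", "A", "B"),
                 stringsAsFactors = FALSE)
dt$var1
# -1 1 -1 1: every "A" row has -1 and every "B" row has 1

# Per-group match(), written with ave() instead of group_by() + mutate()
first_seen <- function(x) match(x, unique(x))  # hypothetical helper name
dt$event2 <- ave(dt$var1, dt$var2, FUN = first_seen)
dt$event2
# 1 1 1 1: var1 is constant within each var2 group

# dt2: var1 runs -1, 1, 1, -1 in blocks of three, so it changes within each group
dt2 <- data.frame(var1 = rep(c(-1, 1, 1, -1), each = 3),
                  var2 = rep(c("A", "B", "A", "B"), each = 3),
                  stringsAsFactors = FALSE)
dt2$event2 <- ave(dt2$var1, dt2$var2, FUN = first_seen)
dt2$event2
# 1 1 1 1 1 1 2 2 2 2 2 2

# The cumsum/lag expression from the answer, in base R, on the 4-row dt
prev <- c("B", head(dt$var2, -1))  # same as lag(var2, default = "B")
dt$event <- cumsum(dt$var2 == "A" & prev != "A")
dt$event
# 1 1 2 2
```

So the match approach depends on var1 changing inside each var2 group, while the cumsum/lag counter only looks at where var2 switches back to "A" and gives 1, 1, 2, 2 on the short data frame as well.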