How to accumulate streaks in vector

Time:10-20

I have a sequence of 0s and 1s in this manner:

xx <- c(1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1)

I want to make a vector that accumulates the streak of the zeroes and adds the accumulated streak to the next possible value 1. The result in this specific vector should be:

yy <- c(1, 1, 1, 0, 0, 3, 0, 2, 0, 0, 0, 4)

What is the fastest and most efficient way to do this in R?

CodePudding user response:

This base R implementation may not be the most efficient one, so it would be interesting to compare performance if others post answers.

code

idx_add <- which(xx == 1 & c(NA, xx[-length(xx)]) == 0)
xx_rle <- rle(xx)
n_add <- xx_rle$lengths[xx_rle$values == 0]
yy <- xx
yy[idx_add] <- yy[idx_add] + n_add

explanation

idx_add <- which(xx == 1 & c(NA, xx[-length(xx)]) == 0)

This line finds the indexes in xx that we will add to. Those are the places where we have a 1 preceded by at least one 0. So we get c(6, 8, 12).
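To see why the comparison works, it can help to look at the shifted vector on its own. A small sketch using the example data; the names `prev` and `is_start` are introduced here for illustration only:

```r
xx <- c(1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1)

# prev[i] holds xx[i - 1]; the first element has no predecessor, hence NA
prev <- c(NA, xx[-length(xx)])

# TRUE where a 1 directly follows a 0; position 1 evaluates to NA here
is_start <- xx == 1 & prev == 0

# which() keeps only the TRUE positions and silently drops the NA
idx_add <- which(is_start)
print(idx_add)
```

Because `which()` ignores `NA`, the leading 1 (which has nothing before it) never ends up in `idx_add`.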

xx_rle <- rle(xx)

Here we use the rle() (run-length encoding) function to get the length of all the streaks of consecutive values in the vector xx. xx_rle has two elements, lengths, the lengths of the streaks; and values, their values (1s and 0s).

n_add <- xx_rle$lengths[xx_rle$values == 0]

Here we extract the streak lengths for only the streaks of zeroes.
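For the example vector, the encoding and the extracted zero-streak lengths can be inspected like this (a minimal sketch; the values in the comments follow from the example data above):

```r
xx <- c(1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1)

xx_rle <- rle(xx)
print(xx_rle$lengths)  # run lengths: 3 2 1 1 1 3 1
print(xx_rle$values)   # run values:  1 0 1 0 1 0 1

# keep only the lengths of the runs whose value is 0
n_add <- xx_rle$lengths[xx_rle$values == 0]
print(n_add)           # 2 1 3

# inverse.rle() rebuilds the original vector from the encoding
stopifnot(identical(inverse.rle(xx_rle), xx))
```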

yy <- xx
yy[idx_add] <- yy[idx_add] + n_add

Now create a copy of xx and add the zero streak lengths to the first 1 following the streak. This gives your desired result!
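One edge case worth noting: if the vector ends in a streak of zeroes, n_add has one more element than idx_add, and R warns about the length mismatch. Below is a sketch of one way to wrap the same idea in a function that skips a trailing zero streak. The function name `add_zero_streaks` is hypothetical, and the code assumes the input contains only 0s and 1s:

```r
# Hypothetical helper, not part of the original answer.
# Assumes x contains only 0s and 1s.
add_zero_streaks <- function(x) {
  r <- rle(x)
  n <- length(r$values)

  # zero runs that are followed by another run (a trailing zero run is skipped)
  is_zero_run <- r$values == 0 & seq_len(n) < n

  # index of the first element of every run
  starts <- cumsum(c(1, head(r$lengths, -1)))

  # index of the element right after each qualifying zero run
  idx <- starts[is_zero_run] + r$lengths[is_zero_run]

  x[idx] <- x[idx] + r$lengths[is_zero_run]
  x
}

xx <- c(1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1)
print(add_zero_streaks(xx))

# trailing zeroes are left alone and no warning is raised
print(add_zero_streaks(c(0, 0, 1, 0)))
```

Deriving both the positions and the increments from the same rle() result keeps the two vectors the same length by construction.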

CodePudding user response:

One base R solution could be:

with(rle(xx), rep(values + c(0, head(lengths * (values == 0), -1)), lengths))

 [1] 1 1 1 0 0 3 0 2 0 0 0 4
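The one-liner can be unpacked into steps to make it easier to follow. This is only a sketch of the same expression; the intermediate names `r`, `zero_len`, `carry` and `yy2` are introduced here for illustration:

```r
xx <- c(1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1)
r <- rle(xx)

# length of each zero run, and 0 for every run of ones
zero_len <- r$lengths * (r$values == 0)

# shift right by one run, so each run sees the length of the zero run before it
carry <- c(0, head(zero_len, -1))

# add the carried length to each run's value, then expand the runs again
yy <- rep(r$values + carry, r$lengths)
print(yy)

# Note: the sum is repeated over the whole run, so a run of several 1s
# after a zero streak is incremented in every position, not just the first.
yy2 <- with(rle(c(0, 0, 1, 1)),
            rep(values + c(0, head(lengths * (values == 0), -1)), lengths))
print(yy2)
```

For the example vector every 1 that follows a zero streak is a run of length one, so this difference does not show up there. If only the first 1 of a longer run should be incremented, the index-based approach from the first answer may be the better fit.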

CodePudding user response:

Using dplyr:

Data:

xx <- c(1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1)

Code:

library(dplyr)

yy <- as.data.frame(xx) %>% 
  mutate(group = ifelse(xx != 0, 1, 0),
         group = cumsum(group) + 1,
         group = ifelse(xx != 0, 0, group)) %>% 
  group_by(group) %>% 
  # store streak length + 1 in a new column; dplyr does not let mutate() overwrite a grouping variable
  mutate(streak = n() + 1) %>% 
  ungroup() %>% 
  mutate(yy = ifelse(xx != 0 & lag(xx) == 0, lag(streak), xx),
         yy = ifelse(is.na(yy), xx, yy)) %>% 
  select(yy) %>% 
  pull()

Output:

[1] 1 1 1 0 0 3 0 2 0 0 0 4