How to find the streaks of a particular value in R?

Time:08-22

The rle() function returns a list with values and lengths. I have not found a way to subset the output to isolate the streaks of a particular value that does not involve calling rle() twice, or saving the output into an object to later subset (an added step).

For instance, for runs of heads (1's) in a series of fair coin tosses:

s <- sample(c(0, 1), 100, replace = TRUE)
rle(s)
Run Length Encoding
  lengths: int [1:55] 1 2 1 2 1 2 1 2 2 1 ...
  values : num [1:55] 0 1 0 1 0 1 0 1 0 1 ...

# Double-call:

rle(s)[[1]][rle(s)[[2]]==1]
 [1] 2 2 2 2 1 1 1 1 6 1 1 1 2 2 1 1 2 2 2 2 2 3 1 1 4 1 2

# Adding an intermediate step:

r <- rle(s)
r$lengths[r$values==1]
 [1] 2 2 2 2 1 1 1 1 6 1 1 1 2 2 1 1 2 2 2 2 2 3 1 1 4 1 2

I see that one very easy way of getting the streak lengths just for 1 is to tweak the rle() code itself (see the answer below), but there may be an even simpler way.

CodePudding user response:

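In base R, with() lets you refer to the components of the rle() result directly, so rle() is called only once and nothing is saved: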

with(rle(s), lengths[values==1])

 [1] 1 3 2 2 1 1 1 3 2 1 1 3 1 1 1 1 1 2 3 1 2 1 3 3 1 2 1 1 2
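Since the sample above is random, the output will differ from run to run. As a quick sanity check, here is a minimal sketch on a fixed vector whose runs can be counted by hand. The vector `x` and the name `ones` are made up for illustration and are not from the question.

```r
# Hypothetical fixed input (not the random sample from the question),
# chosen so the runs are easy to read off: 0 | 1 1 | 0 | 1 | 0 0 | 1 1 1
x <- c(0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

# with() evaluates the expression using the list returned by rle(),
# so `lengths` and `values` refer to its components; rle() runs once.
ones <- with(rle(x), lengths[values == 1])
ones
# [1] 2 1 3
```

The three runs of 1 have lengths 2, 1 and 3, which matches what you get by counting in `x`.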

CodePudding user response:

For a sequence of outcomes s, when you are interested solely in the lengths of the streaks of outcome oc:

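sk <- function(s, oc) {
  n <- length(s)
  # TRUE wherever an element differs from the next one, i.e. where a run ends
  y <- s[-1L] != s[-n]
  # positions of the last element of each run (the final run ends at n)
  i <- c(which(y), n)
  # run lengths are the gaps between run ends; keep only runs of value oc
  diff(c(0L, i))[s[i] == oc]
}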

So to get the lengths for 1:

sk(s,1)
 [1] 2 2 2 2 1 1 1 1 6 1 1 1 2 2 1 1 2 2 2 2 2 3 1 1 4 1 2

and likewise for 0:

sk(s,0)
 [1] 1 1 1 1 2 2 2 2 4 1 1 2 1 1 1 1 1 1 3 1 1 2 6 2 1 1 4 4
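To see that sk() agrees with the rle() approach, here is a small cross-check on a fixed vector. This is only a sketch: the input `x` is hypothetical (not the sample from the question), and sk() is repeated so the block runs on its own.

```r
# Same function as in the answer above, repeated so this block is self-contained
sk <- function(s, oc) {
  n <- length(s)
  y <- s[-1L] != s[-n]          # TRUE where a run ends
  i <- c(which(y), n)           # index of the last element of each run
  diff(c(0L, i))[s[i] == oc]    # lengths of the runs whose value is oc
}

# Hypothetical input with runs: 0 | 1 1 | 0 | 1 | 0 0 | 1 1 1
x <- c(0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

sk(x, 1)
# [1] 2 1 3
sk(x, 0)
# [1] 1 1 2

# Cross-check against subsetting the rle() output
r <- rle(x)
identical(sk(x, 1), r$lengths[r$values == 1])
# [1] TRUE
```

One caveat, as far as I can tell: sk() assumes s is non-empty and contains no NA values, since it does not treat missing values specially when comparing neighbours.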