How to find where each interval of consecutive numbers starts and ends?

Time:03-24

I have a vector

vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

The runs of consecutive numbers are:

c(2, 3)
c(5, 6, 7, 8)
c(22, 23, 24)

So the first run starts at 2 and ends at 3, the second starts at 5 and ends at 8, and the third starts at 22 and ends at 24.

Is there a function that identifies where each run of consecutive numbers starts and ends?

CodePudding user response:

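Use diff to get the difference between each pair of consecutive values, then find where that difference is not 1. Prepending 1 keeps the result the same length as vec and means the first element is never flagged as a break.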

diff(vec)
## [1] 1 2 1 1 1 8 3 3 1 1
c(1, diff(vec)) != 1
## [1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE

Then use cumsum to make a group identifier. Each TRUE increments the running total, so every break starts a new group:

cumsum(c(1, diff(vec))!=1)
## [1] 0 0 1 1 1 1 2 3 4 4 4

And use this to split your data up:

split(vec, cumsum(c(1, diff(vec))!=1))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`2`
##[1] 16
##
##$`3`
##[1] 19
##
##$`4`
##[1] 22 23 24

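The result can then be passed through Filter to keep only the runs with more than one value (the \(x) shorthand needs R 4.1 or later; use function(x) on older versions):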

Filter(\(x) length(x) > 1, split(vec, cumsum(c(1, diff(vec))!=1)))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`4`
##[1] 22 23 24
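
The question asks for the start and end of each run rather than the runs themselves. As a minimal sketch building on the same diff idea, the break positions can be used to index vec directly. The names brk, starts, ends and runs are illustrative only, and the sketch assumes vec is sorted in ascending order with no duplicates:

```r
vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

# TRUE wherever the step to the next value is not exactly 1
brk <- diff(vec) != 1

# A run starts at the first value and right after every break
starts <- vec[c(TRUE, brk)]
# A run ends right before every break and at the last value
ends <- vec[c(brk, TRUE)]

runs <- data.frame(start = starts, end = ends)

# Drop the single-value runs (16 and 19 here)
runs[runs$end > runs$start, ]
##   start end
## 1     2   3
## 2     5   8
## 5    22  24
```

The row names 1, 2 and 5 are the positions of the kept runs among all five groups, including the single-value ones that were dropped.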

CodePudding user response:

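Another approach: put each value at its own index in an otherwise NA vector, then split that vector on its runs of NA and non-NA entries and keep the pieces that contain no NA. Because the values are used as indices, this assumes they are positive integers.

vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)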

x <- replace(NA, vec, vec)
# [1] NA  2  3 NA  5  6  7  8 NA NA NA NA NA NA NA 16 NA NA 19 NA NA 22 23 24

l <- split(x, with(rle(is.na(x)), rep(seq.int(length(lengths)), lengths)))
# l <- split(x, data.table::rleid(is.na(x))) ## same as above
l <- Filter(Negate(anyNA), l)
l
# $`2`
# [1] 2 3
# 
# $`4`
# [1] 5 6 7 8
# 
# $`6`
# [1] 16
# 
# $`8`
# [1] 19
# 
# $`10`
# [1] 22 23 24

If you only want runs with more than one value:

l[lengths(l) > 1]
# $`2`
# [1] 2 3
# 
# $`4`
# [1] 5 6 7 8
# 
# $`10`
# [1] 22 23 24
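
To turn that list into the start and end positions the question asks for, one option is to take the range of each element. This is only a sketch: bounds is an illustrative name, and it relies on each run being in ascending order, so that range() returns the first and last value:

```r
vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

# Rebuild the list of runs as in the answer above
x <- replace(NA, vec, vec)
l <- split(x, with(rle(is.na(x)), rep(seq_along(lengths), lengths)))
l <- Filter(Negate(anyNA), l)
l <- l[lengths(l) > 1]

# range() returns c(min, max); for an ascending run that is c(start, end)
bounds <- t(vapply(l, range, numeric(2)))
colnames(bounds) <- c("start", "end")
bounds
```

This gives a matrix with one row per run, with row names taken from the list names ("2", "4" and "10" here).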