How to find where each interval of consecutive numbers starts and ends?

I have a vector

vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

The runs of consecutive numbers are:

c(2, 3)
c(5, 6, 7, 8)
c(22, 23, 24)

So the first run starts at 2 and ends at 3,

the second starts at 5 and ends at 8,

and the third starts at 22 and ends at 24.

Is there a function that identifies where each run of consecutive numbers starts and ends?

CodePudding user response：

Use diff to compute the differences between consecutive values, then find where the difference is not 1. Prepending 1 keeps the result the same length as vec and ensures the first element never counts as a break:

diff(vec)
## [1] 1 2 1 1 1 8 3 3 1 1
c(1, diff(vec)) != 1
## [1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE

Then use cumsum to make a group identifier. Each TRUE increments the counter, so all values in the same run share one id:

cumsum(c(1, diff(vec))!=1)
## [1] 0 0 1 1 1 1 2 3 4 4 4

And use this to split your data up:

split(vec, cumsum(c(1, diff(vec))!=1))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`2`
##[1] 16
##
##$`3`
##[1] 19
##
##$`4`
##[1] 22 23 24

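The result can then be passed through Filter to keep only the runs with more than one value (the \(x) shorthand requires R 4.1 or later; use function(x) on older versions):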

Filter(\(x) length(x) > 1, split(vec, cumsum(c(1, diff(vec))!=1)))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`4`
##[1] 22 23 24
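
Since the question asks where each run starts and ends, the same diff idea can also return the endpoints directly instead of the groups. The following is only a minimal sketch: the helper name run_bounds and its min_len argument are made up for illustration, and it assumes v is a non-empty numeric vector in which a run means successive steps of exactly 1.

```r
vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

# Hypothetical helper: one row per run of consecutive values.
run_bounds <- function(v, min_len = 2) {
  # A run begins at position 1 and wherever the step from the previous value is not 1
  first <- which(c(TRUE, diff(v) != 1))
  # Each run ends just before the next one begins; the last run ends at the final element
  last <- c(first[-1] - 1, length(v))
  out <- data.frame(start = v[first], end = v[last], length = last - first + 1)
  # Keep only runs with at least min_len values
  out[out$length >= min_len, , drop = FALSE]
}

run_bounds(vec)
```

For the vec above this returns three rows: 2 to 3, 5 to 8, and 22 to 24. Calling run_bounds(vec, min_len = 1) also keeps the isolated values 16 and 19 as runs of length 1.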

CodePudding user response：

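Another approach: place each value at its own index in a vector of NAs, then split on the runs of non-NA values.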

vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

x <- replace(NA, vec, vec)
# [1] NA  2  3 NA  5  6  7  8 NA NA NA NA NA NA NA 16 NA NA 19 NA NA 22 23 24

l <- split(x, with(rle(is.na(x)), rep(seq.int(length(lengths)), lengths)))
# l <- split(x, data.table::rleid(is.na(x))) ## same as above
l <- Filter(Negate(anyNA), l)
l
# $`2`
# [1] 2 3
# 
# $`4`
# [1] 5 6 7 8
# 
# $`6`
# [1] 16
# 
# $`8`
# [1] 19
# 
# $`10`
# [1] 22 23 24

If you only want runs of more than one value:

l[lengths(l) > 1]
# $`2`
# [1] 2 3
# 
# $`4`
# [1] 5 6 7 8
# 
# $`10`
# [1] 22 23 24
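
Note that replace(NA, vec, vec) uses the values themselves as indices, so it only works when vec holds positive whole numbers. As a hedged alternative (a different technique from the one above, not part of the original answer), the sketch below groups on value minus position, which stays constant within a run. It assumes vec is sorted in increasing order with no duplicates and that at least one run has more than one value; the names key, runs and bounds are illustrative only.

```r
vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

# Within a run of consecutive values, value minus position is constant,
# so it can serve directly as the grouping key.
key <- vec - seq_along(vec)
runs <- unname(split(vec, key))

# Keep runs with more than one value and report their endpoints
runs <- runs[lengths(runs) > 1]
bounds <- t(sapply(runs, range))
colnames(bounds) <- c("start", "end")
bounds
```

Here bounds is a matrix with one row per run and the columns start and end, so the rows for this vec are (2, 3), (5, 8) and (22, 24).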