cumsum only for certain group ID

Time:06-22

I want to give each run of negative results in a series an ID. The series is represented by a vector, call it a, in which a negative result is coded as a == 0. Every value in the first run of zeros should receive the same ID, i.e. 1. The next time we get a run of zeros, a new ID should be given, i.e. ID == 2, and so on. I need to preserve all rows and assign 0 to runs of positive results (a == 1). The example below shows a sample series and the desired outcome.

data.frame(a=rep(c(1, 0, 1, 0), each=4), ID=rep(c(0, 1, 0, 2), each=4))
       a ID
    1  1  0
    2  1  0
    3  1  0
    4  1  0
    5  0  1
    6  0  1
    7  0  1
    8  0  1
    9  1  0
    10 1  0
    11 1  0
    12 1  0
    13 0  2
    14 0  2
    15 0  2
    16 0  2

CodePudding user response:

Try this, which uses rle() to find the runs and numbers each run of zeros in turn:

a <- rep(c(1,0,1,0,1 ,0), each=4)
#================================

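ID <- c()
r <- rle(a) ; j <- 1L   # r holds the runs of equal values; j is the next ID to hand out
for(i in seq_along(r$lengths)){
  # runs of 1s (positive results) get ID 0
  if(r$values[i] == 1) ID <- c(ID , rep(0 , r$lengths[i]))
  else{
    # runs of 0s (negative results) get the next ID
    ID <- c(ID , rep(j , r$lengths[i]))
    j <- j + 1L
  }
}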
#================================
df <- data.frame(a = a , ID = ID)
df
#>    a ID
#> 1  1  0
#> 2  1  0
#> 3  1  0
#> 4  1  0
#> 5  0  1
#> 6  0  1
#> 7  0  1
#> 8  0  1
#> 9  1  0
#> 10 1  0
#> 11 1  0
#> 12 1  0
#> 13 0  2
#> 14 0  2
#> 15 0  2
#> 16 0  2
#> 17 1  0
#> 18 1  0
#> 19 1  0
#> 20 1  0
#> 21 0  3
#> 22 0  3
#> 23 0  3
#> 24 0  3

Created on 2022-06-22 by the reprex package (v2.0.1)
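
The loop above grows ID one run at a time. As a rough sketch of the same idea without the loop, the run values returned by rle() can be overwritten and then expanded with inverse.rle(). The name is_zero is only illustrative, and the sketch assumes, like the loop, that a contains only 0s and 1s.

```r
a <- rep(c(1, 0, 1, 0, 1, 0), each = 4)

r <- rle(a)                 # one entry per run of equal values
is_zero <- r$values == 0    # TRUE for runs of zeros (negative results)

# Number the zero-runs 1, 2, 3, ... and give every other run the value 0
r$values <- ifelse(is_zero, cumsum(is_zero), 0)

ID <- inverse.rle(r)        # expand the runs back to full length
df <- data.frame(a = a, ID = ID)
```

Because only the run values are modified, the run lengths are untouched and ID always has the same length as a.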

CodePudding user response:

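Take the cumsum of the positions where diff is less than zero (the points where a drops from 1 to 0, i.e. where a run of zeros starts), then multiply by negated a so that the rows where a is 1 are reset to 0.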

cumsum(c(x[1] == 0, diff(x) < 0))*!x
# [1] 0 0 0 0 1 1 1 1 0 0 0 0 2 2 2 2
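
To see why the one-liner works, it may help to split it into its intermediate steps. This is only an unpacking of the expression above on the same 0/1 series; the names starts, run_no and ID are illustrative, and x == 0 is used in place of !x purely for readability.

```r
x <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0)

# TRUE at each position where a run of zeros begins:
# either the series starts with 0, or the value drops from 1 to 0
starts <- c(x[1] == 0, diff(x) < 0)

# Running count of zero-runs seen so far (it carries over into the following 1s)
run_no <- cumsum(starts)

# Keep the count only where x is 0; positions where x is 1 become 0
ID <- run_no * (x == 0)
```

Note that diff(x) < 0 detects a run start only because the series is coded as 0/1; a series with other values would need a different test for "start of a negative run".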

In action:

transform(dat, IDnew=cumsum(c(a[1] == 0, diff(a) < 0))*!a)
#    a ID IDnew
# 1  1  0     0
# 2  1  0     0
# 3  1  0     0
# 4  1  0     0
# 5  0  1     1
# 6  0  1     1
# 7  0  1     1
# 8  0  1     1
# 9  1  0     0
# 10 1  0     0
# 11 1  0     0
# 12 1  0     0
# 13 0  2     2
# 14 0  2     2
# 15 0  2     2
# 16 0  2     2

This also works if the series starts with a zero, as x2 does, because the x2[1] == 0 term counts a leading run of zeros:

cumsum(c(x2[1] == 0, diff(x2) < 0))*!x2
# [1] 1 0 2 2 2 0 0 0 3 3 3 0 0 0
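
If the negative results are not coded as exactly 0 and 1, the same cumsum idea can be applied to a logical flag instead of the raw values. The helper below is a hypothetical sketch (run_id is not part of base R or of the answer above); it assumes the flag contains no NA values.

```r
# Hypothetical helper: number each run of TRUE values, give 0 elsewhere
run_id <- function(flag) {
  flag <- as.logical(flag)
  prev <- c(FALSE, head(flag, -1))  # the flag shifted one position to the right
  starts <- flag & !prev            # TRUE where a run of TRUE values begins
  cumsum(starts) * flag             # run counter, zeroed outside the runs
}

x2 <- c(0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1)
res <- run_id(x2 == 0)
```

Called as run_id(x2 == 0), it reproduces the result of the one-liner on x2; any other condition, such as a hypothetical score < 0, can be passed in the same way.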

Data:

dat <- structure(list(a = c(1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 
0, 0), ID = c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2)), class = "data.frame", row.names = c(NA, 
-16L))
x <- dat$a
x2 <- c(0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1)