Within-group index (irregular groups)

Time:07-30

I have a vector of group labels with irregular group sizes, like g below, and I want to obtain k, the within-group index: a running count that resets to 1 whenever the label changes.

g = c(1,1,1, 2, 3,3, 4, 5, 6,6,6,6,6, 7, 8, 9,9,9,9, 10, 11, 12, 13,13)
k = c(1,2,3, 1, 1,2, 1, 1, 1,2,3,4,5, 1, 1, 1,2,3,4,  1,  1,  1,  1, 2)

I have a working solution:

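g.index = function(g){
  # TRUE where an element repeats the previous label (same run)
  rep.i = c(FALSE, diff(g) == 0)
  k = numeric(length(g))
  cs = 0
  for (i in seq_along(g)){
    # extend the count within a run, reset to 1 at a new label
    if (rep.i[i]){ cs = cs + 1 } else { cs = 1 }
    k[i] = cs
  }
  return(k)
}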

But I'm worried it will be slow due to loops versus vectorization. Is there a more efficient way?

CodePudding user response:

As @akrun commented, use data.table::rowid, which numbers the occurrences of each value in order:

g = c(1,1,1, 2, 3,3, 4, 5, 6,6,6,6,6, 7, 8, 9,9,9,9, 10, 11, 12, 13,13)
k = c(1,2,3, 1, 1,2, 1, 1, 1,2,3,4,5, 1, 1, 1,2,3,4,  1,  1,  1,  1, 2)

library(data.table)

all(rowid(g) == k)
#> [1] TRUE

Created on 2022-07-29 by the reprex package
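
If adding a package is not an option, the same result can probably be had in base R. The following is a minimal sketch rather than part of the answer above; the names k_rle, k_ave and g2 are made up for illustration. It also shows one point worth checking against your data: counting per run (as the loop in the question does) and counting per value only agree when each label occupies a single contiguous block.

```r
g <- c(1,1,1, 2, 3,3, 4, 5, 6,6,6,6,6, 7, 8, 9,9,9,9, 10, 11, 12, 13,13)
k <- c(1,2,3, 1, 1,2, 1, 1, 1,2,3,4,5, 1, 1, 1,2,3,4,  1,  1,  1,  1, 2)

# Run-based: rle() gives the length of each run of equal labels,
# and sequence() expands each length n into 1..n.
k_rle <- sequence(rle(g)$lengths)

# Value-based: number the elements within each distinct label.
k_ave <- ave(g, g, FUN = seq_along)

all(k_rle == k)
all(k_ave == k)

# The two differ once a label reappears after a gap (hypothetical input):
g2 <- c(1, 1, 2, 1)
sequence(rle(g2)$lengths)      # counter restarts at the second run of 1
ave(g2, g2, FUN = seq_along)   # counter continues across the gap
```

For contiguous labels like g, both lines give k. If labels can recur after a gap, pick the variant that matches the intended meaning of "group".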
