This is somewhat similar to "Like rleid but ignoring NAs", but I want the NAs kept in the counter (i.e., where the input is NA, the counter should be NA too). I need a counter that starts at 1, increases by 1 each time the number changes from the row above, keeps its previous value when the number is the same as the row above, and restarts at 1 after any NA.
I have this:
# have
months <- c(1, 8, 1, 1, 1, NA, NA, 2, 6, NA)
# want
months_counter <- c(1, 2, 3, 3, 3, NA, NA, 1, 2, NA)
I have tried different ways using rleid, but none of them seem to handle NAs as described above. Something that can be applied in a data.table would be even more appreciated!
CodePudding user response:
We can add a column counting NAs to use as a grouping column for rleid, but only set the values on rows where months is not NA:
library(data.table)
dt = data.table(months = c(1, 8, 1, 1, 1, NA, NA, 2, 6, NA))
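# grouper increases by 1 at every NA, so each stretch between NAs gets its own id
dt[, grouper := cumsum(is.na(months))][
  # only fill result on non-NA rows; NA rows keep the default NA
  !is.na(months),
  result := rleid(months),
  by = grouper
]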
dt
#     months grouper result
#  1:      1       0      1
#  2:      8       0      2
#  3:      1       0      3
#  4:      1       0      3
#  5:      1       0      3
#  6:     NA       1     NA
#  7:     NA       2     NA
#  8:      2       2      1
#  9:      6       2      2
# 10:     NA       3     NA
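If you would rather not keep the helper column, here is a minimal sketch of the same idea split into separate steps, with the helper dropped at the end. The column name months_counter and the want vector are taken from the question; the final stopifnot() is only a sanity check I added, not part of the approach itself.

```r
library(data.table)

dt <- data.table(months = c(1, 8, 1, 1, 1, NA, NA, 2, 6, NA))

# Temporary grouping key: goes up by 1 at every NA,
# so each stretch of rows between NAs shares one id
dt[, grouper := cumsum(is.na(months))]

# Number the runs within each stretch; NA rows are skipped and stay NA
dt[!is.na(months), months_counter := rleid(months), by = grouper]

# Remove the helper column once the counter is filled in
dt[, grouper := NULL]

# Sanity check against the expected vector from the question
want <- c(1, 2, 3, 3, 3, NA, NA, 1, 2, NA)
stopifnot(identical(as.numeric(dt$months_counter), want))
```

rleid() returns integers, hence the as.numeric() before comparing with the double vector from the question.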