So I have a vector of 1's and 0's and I want to identify different "groups" of repeating 1's.
What I want:
v1 group
1 a
1 a
1 a
0 NA
0 NA
1 b
1 b
0 NA
1 c
What's the best way to do this, preferably in base R?
CodePudding user response:
Using diff and cumsum.
letters[cumsum((c(0, diff(x)) > 0)) + 1] |>
  replace(x == 0, NA)
# [1] "a" "a" "a" NA NA "b" "b" NA "c"
Data:
x <- c(1, 1, 1, 0, 0, 1, 1, 0, 1)
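To see why this works, it may help to split the one-liner into steps. The sketch below is a variant, not the exact code above: it prepends the 0 before calling diff, so a vector that begins with 0 still starts its labels at "a" (the one-liner above would start at "b" in that case). The function name label_runs is made up for illustration.

```r
# Hypothetical helper: label each run of 1's with a letter, 0's with NA.
label_runs <- function(x) {
  # A run starts wherever the value rises from 0 to 1.
  # Prepending 0 makes a leading 1 count as a start too.
  starts <- diff(c(0, x)) == 1

  # Running count of starts = index of the run each position belongs to.
  id <- cumsum(starts)

  # Blank out the 0 positions, then look the ids up in letters.
  # An NA index returns NA, so no separate replace on the result is needed.
  letters[replace(id, x == 0, NA)]
}

x <- c(1, 1, 1, 0, 0, 1, 1, 0, 1)
label_runs(x)
# [1] "a" "a" "a" NA  NA  "b" "b" NA  "c"

label_runs(c(0, 1, 1, 0, 1))
# [1] NA  "a" "a" NA  "b"
```

Like the original, this only covers up to 26 runs, because it indexes into letters.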
CodePudding user response:
There is a convenient function in data.table for this, i.e.
replace(data.table::rleid(df$v1), df$v1 == 0, NA)
#[1] 1 1 1 NA NA 3 3 NA 5
If you want the letters, the suggestion from @sindri_baldur works until the alphabet runs out (letters has only 26 entries), i.e.
letters[data.table::frank(replace(data.table::rleid(df$v1), df$v1 == 0, NA_real_), ties.method = 'dense', na.last = 'keep')]
#[1] "a" "a" "a" NA NA "b" "b" NA "c"
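If you expect more than 26 runs, letters[27] and beyond are NA, so the later groups would silently lose their labels. Below is a minimal base R sketch of one possible workaround, assuming numbered labels such as "g27" are acceptable. The "g" prefix and the variable names are arbitrary choices, not part of any answer here.

```r
# 30 separate runs, each a single 1 followed by a 0
x <- rep(c(1, 0), 30)

# Run index per position (NA where x is 0)
id <- cumsum(diff(c(0, x)) == 1)
id[x == 0] <- NA

# Indexing letters past 26 yields NA: position 53 belongs to run 27
letters[id][53]
# [1] NA

# Numbered labels do not run out
labels <- paste0("g", id)
labels[is.na(id)] <- NA
labels[53]
# [1] "g27"
```

Keeping the integer ids themselves, as rleid does, is another option if the labels do not have to be characters.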
CodePudding user response:
Using rle:
v1 <- c(1,1,1,0,0,1,1,0,1,0,1,1,0,1,1,1,1,1)
d <- data.frame(v1)
d[d$v1 == 1, "group"] <- letters[with(rle(d$v1), rep(cumsum(values[values == 1]), lengths[values == 1]))]
v1 group
1 1 a
2 1 a
3 1 a
4 0 <NA>
5 0 <NA>
6 1 b
7 1 b
8 0 <NA>
9 1 c
10 0 <NA>
11 1 d
12 1 d
13 0 <NA>
14 1 e
15 1 e
16 1 e
17 1 e
18 1 e
CodePudding user response:
We can try
with(
  rle(v1),
  letters[replace(
    rep(values * cumsum(values), lengths),
    v1 == 0,
    NA
  )]
)
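The replace step matters because R drops zero indices when subsetting, so without it the result would be shorter than v1. A small sketch on a made-up four-element vector (idx is just an illustrative name):

```r
v1 <- c(1, 1, 0, 1)

# values * cumsum(values) numbers the runs of 1's and leaves 0 for runs of 0's
idx <- with(rle(v1), rep(values * cumsum(values), lengths))
idx
# [1] 1 1 0 2

# A zero index is silently dropped, so the lengths no longer match
without_replace <- letters[idx]
length(without_replace)
# [1] 3

# Turning the zeros into NA keeps one element per position
with_replace <- letters[replace(idx, v1 == 0, NA)]
with_replace
# [1] "a" "a" NA  "b"
```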
CodePudding user response:
One way might be to use rle. Note that this labels runs by their length, so runs of 1's with the same length share a letter.
. <- rle(v1)
i <- .$values == 1
.$values[i] <- letters[factor(.$lengths[i])]
.$values[!i] <- NA
data.frame(v1, group=inverse.rle(.))
# v1 group
#1 1 c
#2 1 c
#3 1 c
#4 0 <NA>
#5 0 <NA>
#6 1 b
#7 1 b
#8 0 <NA>
#9 1 a
#10 0 <NA>
#11 1 b
#12 1 b
#13 0 <NA>
#14 1 d
#15 1 d
#16 1 d
#17 1 d
#18 1 d
Data:
v1 <- c(1,1,1,0,0,1,1,0,1,0,1,1,0,1,1,1,1,1)
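The output above assigns letters by run length (via factor(.$lengths[i])), which is why the two runs of length 2 both get "b". If labels in order of appearance are wanted instead, the same rle / inverse.rle idea can probably be adapted as follows. This is a sketch, using r rather than . as the variable name:

```r
v1 <- c(1,1,1,0,0,1,1,0,1,0,1,1,0,1,1,1,1,1)

r <- rle(v1)
i <- r$values == 1

# One letter per run of 1's, in order of appearance
r$values[i] <- letters[seq_len(sum(i))]
# Runs of 0's get NA
r$values[!i] <- NA

# Expand the per-run labels back to one label per element
group <- inverse.rle(r)
data.frame(v1, group)
```

As with the other letter-based answers, this assumes there are at most 26 runs of 1's.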