I have a data frame and would like to add a column with the number of successive TRUE
values using tidyverse.
For example, the following data frame:
dftmp <- data.frame(order = 1:10,
                    true = c(FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE))
which looks like
order true
1 1 FALSE
2 2 TRUE
3 3 TRUE
4 4 TRUE
5 5 FALSE
6 6 FALSE
7 7 TRUE
8 8 FALSE
9 9 TRUE
10 10 TRUE
and would like it to look like
order true count
1 1 FALSE 0
2 2 TRUE 1
3 3 TRUE 2
4 4 TRUE 3
5 5 FALSE 0
6 6 FALSE 0
7 7 TRUE 1
8 8 FALSE 0
9 9 TRUE 1
10 10 TRUE 2
I can do it with a loop (below), but I'm not sure whether there is a tidyverse equivalent (I suspect it would use dplyr).
dftmp$count <- NA
for (lpVar in seq_len(nrow(dftmp))) {
  # Previous run length; treated as 0 for the first row
  prev <- if (lpVar == 1) 0 else dftmp$count[lpVar - 1]
  dftmp$count[lpVar] <- ifelse(test = dftmp$true[lpVar],
                               yes = prev + 1,
                               no = 0)
}
Does anyone have any ideas?
CodePudding user response:
The package hutilscpp has a convenient cumsum_reset function:
library(dplyr)
library(hutilscpp)
dftmp %>%
  mutate(count = cumsum_reset(true))
order true count
1 1 FALSE 0
2 2 TRUE 1
3 3 TRUE 2
4 4 TRUE 3
5 5 FALSE 0
6 6 FALSE 0
7 7 TRUE 1
8 8 FALSE 0
9 9 TRUE 1
10 10 TRUE 2
Or with dplyr:
dftmp %>%
  group_by(grp = cumsum(!true)) %>%   # each FALSE starts a new group
  mutate(count = cumsum(true)) %>%    # running count of TRUE within the group
  ungroup() %>%
  select(-grp)
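To see why the grouping trick works, it may help to look at the intermediate group ids. The following is a minimal base R sketch of the same idea, not part of the original answer: it uses ave() in place of group_by()/mutate(), and the names grp and count are only illustrative.

```r
dftmp <- data.frame(order = 1:10,
                    true = c(FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE))

# cumsum(!true) increases by one at every FALSE, so each FALSE row opens
# a new group that also holds the TRUE rows immediately following it.
grp <- cumsum(!dftmp$true)
# grp is 1 1 1 1 2 3 3 4 4 4

# Within each group, the running sum of TRUE values is the length of the
# current run; the FALSE row that opens the group contributes 0.
count <- ave(as.integer(dftmp$true), grp, FUN = cumsum)
dftmp$count <- count
# count is 0 1 2 3 0 0 1 0 1 2
```

Because a new group starts only on a FALSE row, the count resets to 0 exactly where a run of TRUE values is interrupted.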
Credits for the second solution: