My objective is to compute a cumulative sum over the elements of a vector and assign the running total to each element, but reset the cumulative sum whenever a certain condition is met.
For example:
vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)
Now, suppose that the condition to reset the cumulative sum is that the next element has a different sign.
Then the desired output is:
vector_B <- c(1, 2, -1, -2, -3, 1, -1, -2, 1, -1)
How can I achieve this?
CodePudding user response:
A base R option with Reduce
> Reduce(function(x, y) ifelse(x * y > 0, x + y, y), vector_A, accumulate = TRUE)
[1] 1 2 -1 -2 -3 1 -1 -2 1 -1
or using ave + cumsum
> ave(vector_A, cumsum(c(1, diff(sign(vector_A)) != 0)), FUN = cumsum)
[1] 1 2 -1 -2 -3 1 -1 -2 1 -1
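To see why the ave + cumsum one-liner works, it may help to unpack the grouping step by step. The sketch below only splits that call into named intermediates; the names s, changed and group_id are made up for illustration.

```r
vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)

# Sign of each element: 1 1 -1 -1 -1 1 -1 -1 1 -1
s <- sign(vector_A)

# TRUE wherever the sign differs from the previous element.
# The leading 1 marks the first element as the start of a run.
changed <- c(1, diff(s)) != 0

# Counting the run starts gives one id per run: 1 1 2 2 2 3 4 4 5 6
group_id <- cumsum(changed)

# cumsum() is then applied within each run, keeping the original order
result <- ave(vector_A, group_id, FUN = cumsum)
result
```

Note that sign(0) is 0, so under this grouping any zeros in the input would form runs of their own.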
CodePudding user response:
Using ave:
ave(vector_A, data.table::rleid(sign(vector_A)), FUN = cumsum)
# [1] 1 2 -1 -2 -3 1 -1 -2 1 -1
A formula version of accumulate:
purrr::accumulate(vector_A, ~ ifelse(sign(.x) == sign(.y), .x + .y, .y))
# [1] 1 2 -1 -2 -3 1 -1 -2 1 -1
CodePudding user response:
You can use a custom function instead of cumsum and accumulate results using e.g. purrr::accumulate:
library(purrr)
vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)
purrr::accumulate(vector_A, function(a, b) {
  # a is the running total so far, b is the next element
  if (sign(a) == sign(b))
    a + b
  else
    b
})
[1] 1 2 -1 -2 -3 1 -1 -2 1 -1
or if you want to avoid any branch:
purrr::accumulate(vector_A, function(a, b) { b + a * (sign(a) == sign(b)) })
[1] 1 2 -1 -2 -3 1 -1 -2 1 -1
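The branch-free version relies on R coercing a logical to 0 or 1 in arithmetic, so the running total a is either kept (times 1) or dropped (times 0). A minimal sketch of the same idea with base R's Reduce, for anyone who prefers to avoid the purrr dependency; the helper name add_or_reset is made up for illustration.

```r
vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)

# a: running total so far, b: next element.
# (sign(a) == sign(b)) is TRUE/FALSE, which acts as 1/0 when multiplied.
add_or_reset <- function(a, b) b + a * (sign(a) == sign(b))

add_or_reset(2, 1)    # same sign: total carried forward, gives 3
add_or_reset(2, -1)   # sign flips: total discarded, gives -1

# accumulate = TRUE keeps every intermediate total, not just the last one
out <- Reduce(add_or_reset, vector_A, accumulate = TRUE)
out
```

This works because, for non-zero input, the running total always has the same sign as the current run, so comparing sign(a) with sign(b) is equivalent to comparing the previous element with the next one.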
CodePudding user response:
The approach that comes to mind is to find the runs (rle()) defined by the
condition (sign()) in the data, apply cumsum() to each run separately
(tapply()), and then concatenate the results back into a vector (unlist()).
Something like this:
vector_A <- c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1)
run_length <- rle(sign(vector_A))$lengths
run_id <- rep(seq_along(run_length), run_length)
unlist(tapply(vector_A, run_id, cumsum), use.names = FALSE)
#> [1] 1 2 -1 -2 -3 1 -1 -2 1 -1
Wrapping the process up a bit, I’d maybe put finding the grouping factor (run
index) in a function? And then the grouped summary will need to be done using
existing tools, like tapply() above, or a creative ave(), or in the context of
data frames, a group_by() and summarise() with dplyr.
run_index <- function(x) {
  # Expand each run's position (1, 2, 3, ...) to the length of that run
  with(rle(x), rep(seq_along(lengths), lengths))
}
ave(vector_A, run_index(sign(vector_A)), FUN = cumsum)
#> [1] 1 2 -1 -2 -3 1 -1 -2 1 -1
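For the data-frame case mentioned above, here is a minimal base R sketch that reuses the same run_index() helper and avoids the dplyr dependency. The column names value, run and running are made up for illustration.

```r
# Same helper as above, repeated so the example is self-contained
run_index <- function(x) {
  with(rle(x), rep(seq_along(lengths), lengths))
}

df <- data.frame(value = c(1, 1, -1, -1, -1, 1, -1, -1, 1, -1))

# Grouping column: one id per run of equal sign
df$run <- run_index(sign(df$value))

# Running total within each run, one row per original element
df$running <- ave(df$value, df$run, FUN = cumsum)
df
```

Since a running sum keeps one row per element, a dplyr version of this would probably use group_by() followed by mutate() rather than summarise(), but the base R form above is enough to show the idea.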