I'm trying to walk down a column and create a second column called status, based on the values in times. If times is >250, status should be "good". If not, the current times value should be added to a running sum (similar to cumsum) over the rows below until running_sum is >250. At that point the status of the current row should be changed to "good" and the running sum starts afresh.
I've tried the for loop below but I can't get it to work (for instance, the status of the 3rd row should be "good" in the example). Can someone provide a working example and explain how it works, please? Thank you.
set.seed(1234)
test = data.frame(times = round(abs(rnorm(20,100,100)),0))
test
#> times
#> 1 21
#> 2 128
#> 3 208
#> 4 135
#> 5 143
#> 6 151
#> 7 43
#> 8 45
#> 9 44
#> 10 11
#> 11 52
#> 12 0
#> 13 22
#> 14 106
#> 15 196
#> 16 89
#> 17 49
#> 18 9
#> 19 16
#> 20 342
test$status <- 'bad'
running_sum <- 0
for (i in 1:length(test$times)) {
  if (test$times[i] >= 250 | running_sum > 250) {
    test$status[i] <- "good"
    running_sum <- 0
  } else {
    running_sum <- running_sum + test$times[i]
  }
  print(running_sum)
}
#> [1] 21
#> [1] 149
#> [1] 357
#> [1] 0
#> [1] 143
#> [1] 294
#> [1] 0
#> [1] 45
#> [1] 89
#> [1] 100
#> [1] 152
#> [1] 152
#> [1] 174
#> [1] 280
#> [1] 0
#> [1] 89
#> [1] 138
#> [1] 147
#> [1] 163
#> [1] 0
test
#> times status
#> 1 21 bad
#> 2 128 bad
#> 3 208 bad
#> 4 135 good
#> 5 143 bad
#> 6 151 bad
#> 7 43 good
#> 8 45 bad
#> 9 44 bad
#> 10 11 bad
#> 11 52 bad
#> 12 0 bad
#> 13 22 bad
#> 14 106 bad
#> 15 196 good
#> 16 89 bad
#> 17 49 bad
#> 18 9 bad
#> 19 16 bad
#> 20 342 good
CodePudding user response:
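Using this nice answer from @MrFlick, you can build a running sum that restarts once it reaches a threshold with accumulate() (from purrr, loaded by the tidyverse):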
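library(tidyverse)  # provides accumulate(), if_else(), mutate() and %>%
set.seed(1234)
test = data.frame(times = round(abs(rnorm(20,100,100)),0))
# Returns a function that computes a running sum which restarts
# on the row after the sum has reached `thresh`
sum_reset_at <- function(thresh) {
  function(x) {
    accumulate(x, ~if_else(.x>=thresh, .y, .x + .y))
  }
}
test %>% mutate(temp = ifelse(sum_reset_at(250)(times) < 250, "bad", "good"))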
# times temp
# 1 21 bad
# 2 128 bad
# 3 208 good
# 4 135 bad
# 5 143 good
# 6 151 bad
# 7 43 bad
# 8 45 bad
# 9 44 good
# 10 11 bad
# 11 52 bad
# 12 0 bad
# 13 22 bad
# 14 106 bad
# 15 196 good
# 16 89 bad
# 17 49 bad
# 18 9 bad
# 19 16 bad
# 20 342 good
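For readers who would rather avoid the tidyverse dependency, the same reset-at-threshold idea can be sketched in base R with Reduce() and accumulate = TRUE. This is only a sketch: the helper name running_sum_reset is made up for illustration, and the first ten times values from the question are hard-coded so the example does not depend on the random seed.

```r
# Base R sketch of a running sum that restarts after reaching a threshold.
# `running_sum_reset` is a hypothetical helper name, not part of any package.
running_sum_reset <- function(x, thresh) {
  Reduce(
    # acc is the running sum so far; once it has reached thresh,
    # start again from the current value instead of adding to it
    function(acc, value) if (acc >= thresh) value else acc + value,
    x,
    accumulate = TRUE
  )
}

# First ten `times` values from the question
times <- c(21, 128, 208, 135, 143, 151, 43, 45, 44, 11)
sums <- running_sum_reset(times, 250)
status <- ifelse(sums >= 250, "good", "bad")
data.frame(times, sums, status)
```

Keeping the intermediate sums column next to status makes it easy to see on which row the running sum crossed the threshold and where it restarted.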
CodePudding user response:
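You just need to change the order of your loop operations: increment first, then test. In your version the row that pushes running_sum past 250 is only detected on the next iteration, so the following row gets marked "good" and its own times value is never added to the sum.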
set.seed(1234)
test = data.frame(times = round(abs(rnorm(20,100,100)),0))
test$status <- 'bad'
running_sum <- 0
for (i in 1:length(test$times)) {
  running_sum <- running_sum + test$times[i]
  print(running_sum)
  if (test$times[i] >= 250 | running_sum > 250) {
    test$status[i] <- "good"
    running_sum <- 0
  }
}
Result:
times status
1 21 bad
2 128 bad
3 208 good
4 135 bad
5 143 good
6 151 bad
7 43 bad
8 45 bad
9 44 good
10 11 bad
11 52 bad
12 0 bad
13 22 bad
14 106 bad
15 196 good
16 89 bad
17 49 bad
18 9 bad
19 16 bad
20 342 good
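If the check is needed in more than one place, the increment-then-test loop can be wrapped in a small function. The sketch below is one possible way to do that, not the only one: flag_status and its thresh argument are hypothetical names, the condition is kept the same as in the loop above, and the times values are copied from the question rather than regenerated.

```r
# `flag_status` is a hypothetical helper name used only for illustration.
flag_status <- function(times, thresh = 250) {
  status <- rep("bad", length(times))
  running_sum <- 0
  for (i in seq_along(times)) {
    # Increment first, so the current row counts towards the sum ...
    running_sum <- running_sum + times[i]
    # ... then test, so the row that crosses the threshold is the one flagged
    if (times[i] >= thresh || running_sum > thresh) {
      status[i] <- "good"
      running_sum <- 0
    }
  }
  status
}

# The 20 `times` values shown in the question
times <- c(21, 128, 208, 135, 143, 151, 43, 45, 44, 11,
           52, 0, 22, 106, 196, 89, 49, 9, 16, 342)
status <- flag_status(times)
data.frame(times, status)
```

Using seq_along(times) instead of 1:length(times) means the loop body is simply skipped for an empty vector.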