I have a dataframe looking like this:
date | P | >60? |
---|---|---|
03-31-2020 | 6.8 | 0 |
03-30-2020 | 5.0 | 0 |
03-29-2020 | 0.0 | 0 |
03-28-2020 | 0.0 | 0 |
03-27-2020 | 2.0 | 0 |
03-26-2020 | 0.0 | 0 |
03-25-2020 | 71.0 | 1 |
03-24-2020 | 2.0 | 0 |
03-23-2020 | 0.0 | 0 |
03-22-2020 | 23.8 | 0 |
03-21-2020 | 0.0 | 0 |
03-20-2020 | 23.8 | 0 |
- Code to reproduce the
dataframe
:
df1 <- data.frame(date = c("03-31-2020", "03-30-2020", "03-29-2020", "03-28-2020", "03-27-2020", "03-26-2020",
"03-25-2020", "03-24-2020", "03-23-2020", "03-22-2020", "03-21-2020", "03-20-2020"),
P = c(6.8, 5.0, 0.0, 0.0, 2.0, 0.0, 71.0, 2.0, 0.0, 23.8, 0.0, 23.8),
Sup60 = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0))
I want to sum the P values N days befores the P > 60. For example, the first barrier (number bigger than 60) is the P = 71 on the day 25-03-2020, from that i want to sum the 5 P values before that day, like:
2.0 0.0 23.8 0.0 23.8 = 49,6
It is a kind of moving sum because the concept is similar to a moving average. Instead of the average of the last 5 values, for example, I want the sum of the last 5 values from a value greater than 60. How can I do this?
CodePudding user response:
Hi firstly we can solve how to calculate a running sum then we do an if_else on this column, as a general rule you always split complex problems into minor solvable problems
library(tidyverse)
df_example <- tibble::tribble(
~date, ~P, ~`>60?`,
"03-31-2020", 6.8, 0L,
"03-30-2020", 5, 0L,
"03-29-2020", 0, 0L,
"03-28-2020", 0, 0L,
"03-27-2020", 2, 0L,
"03-26-2020", 0, 0L,
"03-25-2020", 71, 1L,
"03-24-2020", 2, 0L,
"03-23-2020", 0, 0L,
"03-22-2020", 23.8, 0L,
"03-21-2020", 0, 0L,
"03-20-2020", 23.8, 0L
)
# lets start by doing a simple running sum
jjj <- df_example |>
arrange(date)
jjj |>
mutate(running_sum = slider::slide_dbl(.x = P,.f = ~ sum(.x),.before = 5,.after = -1)) |>
mutate(chosen_sum = if_else(P > 60,running_sum,NA_real_))
#> # A tibble: 12 x 5
#> date P `>60?` running_sum chosen_sum
#> <chr> <dbl> <int> <dbl> <dbl>
#> 1 03-20-2020 23.8 0 0 NA
#> 2 03-21-2020 0 0 23.8 NA
#> 3 03-22-2020 23.8 0 23.8 NA
#> 4 03-23-2020 0 0 47.6 NA
#> 5 03-24-2020 2 0 47.6 NA
#> 6 03-25-2020 71 1 49.6 49.6
#> 7 03-26-2020 0 0 96.8 NA
#> 8 03-27-2020 2 0 96.8 NA
#> 9 03-28-2020 0 0 75 NA
#> 10 03-29-2020 0 0 75 NA
#> 11 03-30-2020 5 0 73 NA
#> 12 03-31-2020 6.8 0 7 NA
Created on 2021-10-20 by the reprex package (v2.0.1)