i have a data frame that looks like this :
date | y |
---|---|
2021-12-01 | 1 |
2021-12-02 | 2 |
2021-12-03 | 3 |
2021-12-04 | 4 |
2021-12-05 | 5 |
2021-12-06 | 6 |
2022-01-01 | 10 |
2022-01-02 | 20 |
2022-01-03 | 30 |
2022-01-04 | 40 |
2022-01-05 | 50 |
2022-01-06 | 60 |
2022-02-01 | 0.1 |
2022-02-02 | 0.2 |
2022-02-03 | 0.3 |
2022-02-04 | 0.4 |
2022-02-05 | 0.5 |
2022-02-06 | 0.6 |
2022-03-01 | 0.01 |
2022-03-02 | 0.02 |
2022-03-03 | 0.03 |
2022-03-04 | 0.04 |
2022-03-05 | 0.05 |
2022-03-06 | 0.06 |
i want to group them by month but i want to calculate the minimum of each month but as month forwards to contain the information from all the previous history . Like an expanding by month. For example the minimum of month 12 of year 2021 is 1 and has 6 obs.The next month 1 of year 2022 has minimum 1 because contains the information from december 2021 and has 12 obs.And so on and so forth .
Ideally i want the resulted data frame to be like this :
year | month | obs | minimum |
---|---|---|---|
2021 | 12 | 6 | 1 |
2022 | 1 | 12 | 1 |
2022 | 2 | 18 | 0.1 |
2022 | 3 | 24 | 0.01 |
How can i do this in R ?
date = c(seq(as.Date("2021/12/1"), by = "day", length.out = 6),
seq(as.Date("2022/1/1"), by = "day", length.out = 6),
seq(as.Date("2022/2/1"), by = "day", length.out = 6),
seq(as.Date("2022/3/1"), by = "day", length.out = 6))
y = c(seq(1:6),seq(10,60,by=10),seq(0.1,0.6,by=0.1),seq(0.01,0.06,by=0.01))
df = tibble(date,y);df
CodePudding user response:
Does this achieve what you're looking for?
library(tidyverse)
library(lubridate)
df |>
mutate(year = year(date), month = month(date)) |>
group_by(year, month) |>
summarise(obs = n(), minimum = min(y),
.groups = "drop") |>
mutate(obs = cumsum(obs),
minimum = cummin(minimum))
#> # A tibble: 4 × 4
#> year month obs minimum
#> <dbl> <dbl> <int> <dbl>
#> 1 2021 12 6 1
#> 2 2022 1 12 1
#> 3 2022 2 18 0.1
#> 4 2022 3 24 0.01