Home > Back-end >  How to group non-ecluding year ranges using a loop with dplyr
How to group non-ecluding year ranges using a loop with dplyr

Time:10-28

I'm new here, so maybe my question could be difficult to understand. So, I have some data and it's date information and I need to group the mean of the data in year ranges. But this year ranges are non-ecluding, I mean that, for example, my first range is: 2013-2015 then 2014-2016 then 2015-2017, etc. So I think that it could be done by using a loop function and dplyr, but I dont know how to do it. I´ll be very thankfull if someone can help me.

Thank you,

Alejandro

What I tried was like:

for (i in Year){
Year_3=c(i, i 1, i 2)
db>%> group_by(Year_3)
#....etc
}

CodePudding user response:

As you note, each observation would be used in multiple groups, so one approach could be to make copies of your data accordingly:

df <- data.frame(year = 2013:2020, value = 1:8)

library(dplyr)
df %>% 
  tidyr::uncount(3, .id = "grp") %>%
  mutate(group_start = year - grp   1,
         group_name = paste0(group_start, "-", group_start   2)) %>%
  group_by(group_name) %>%
  summarise(value = mean(value),
            n = n())


# A tibble: 10 × 3
   group_name value     n
   <chr>      <dbl> <int>
 1 2011-2013    1       1
 2 2012-2014    1.5     2
 3 2013-2015    2       3
 4 2014-2016    3       3
 5 2015-2017    4       3
 6 2016-2018    5       3
 7 2017-2019    6       3
 8 2018-2020    7       3
 9 2019-2021    7.5     2
10 2020-2022    8       1

Or we might take a more algebraic approach, noting that the sum of a three year period will be the difference between the cumulative amount two years in the future minus the cumulative amount the prior year. This approach excludes the partial ranges.

df %>%
  mutate(cuml = cumsum(value),
         value_3yr = (lead(cuml, n = 2) - lag(cuml, default = 0)) / 3)

  year value cuml value_3yr
1 2013     1    1         2
2 2014     2    3         3
3 2015     3    6         4
4 2016     4   10         5
5 2017     5   15         6
6 2018     6   21         7
7 2019     7   28        NA
8 2020     8   36        NA
  • Related