Iterative partial sum on rows with the same dates in R

Time:11-24

I would like to do some computation on several rows in a table. I created an example below:

  library(dplyr)

  year_week <- c(200045:200053, 200145:200152, 200245:200252)
  input <- as.vector(sample(1:10,25,TRUE))
  partial_sum <- c(13, 11, 8, 15, 14, 9, 11, 3, 3, 9, 12, 16, 17, 13, 16, 11, 9, 16, 19, 10, 16, 15, 11, 6, 8)
  df <- data.frame(year_week, input, partial_sum)

The columns input and year_week are given. The latter represents dates, but in my case the values are numeric, with the first four digits as the year and the last two as the working week of that year. What I need is to iterate over each week in each year, sum the values from the same week in the other years, and save the result in a column, called partial_sum here. The current row's value is excluded from the sum. Week 53 in the leap year 2000 gets the same treatment, but since I have only one leap year, its value 3 doesn't change.

Any idea how to do this? Thank you.

CodePudding user response:

I would expect something like this to work, though as pointed out in the comments, your example isn't exactly reproducible.

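library(dplyr)
df %>%
  # week = last two digits of year_week
  mutate(week = substr(year_week, 5, 6)) %>%
  group_by(week) %>%
  # per-week total across years; this still includes the current row,
  # so subtract input if the row's own value must be excluded
  mutate(result = sum(input))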

CodePudding user response:

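Perhaps this helps: group by 'week', taken as the substring of 'year_week' from the fifth character on, then take the difference between the group's sum of 'input' and each row's own 'input', so the current value is excluded.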

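library(dplyr)
df %>%
  # week = characters 5 onward of year_week
  group_by(week = substring(year_week, 5)) %>%
  # group total minus the row's own value = sum over the other years
  mutate(partial_sum2 = sum(input) - input)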
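
One detail worth checking: for a week that occurs in only one year (week 53 here), sum(input) - input gives 0, whereas the question expects the value to stay unchanged. Below is a minimal base R sketch of the same grouped computation with that case handled. It rests on two assumptions: the `input` vector is back-solved from the `partial_sum` column in the question (the original came from an unseeded `sample()` call), and week 53's input is taken to be 3. `sum_of_others` and `expected` are hypothetical names used only for illustration.

```r
# Assumption: these input values are reconstructed from the question's
# partial_sum column; the original input was random and unseeded.
year_week <- c(200045:200053, 200145:200152, 200245:200252)
input <- c(6, 10, 9, 9, 7, 9, 3, 7, 3,   # year 2000, weeks 45-53
           10, 9, 1, 7, 8, 2, 3, 1,      # year 2001, weeks 45-52
           3, 2, 7, 8, 6, 7, 8, 2)       # year 2002, weeks 45-52
# The partial_sum column as given in the question
expected <- c(13, 11, 8, 15, 14, 9, 11, 3, 3, 9, 12, 16, 17,
              13, 16, 11, 9, 16, 19, 10, 16, 15, 11, 6, 8)
df <- data.frame(year_week, input)

# The last two digits of year_week are the week number
week <- df$year_week %% 100

# Hypothetical helper: for each element, the sum of the other elements in
# the group. A group with a single row keeps its own value (assumed reading
# of "its value 3 doesn't change").
sum_of_others <- function(x) {
  if (length(x) == 1) x else sum(x) - x
}

# ave() applies the helper within each week and returns the results
# in the original row order
df$partial_sum2 <- ave(df$input, week, FUN = sum_of_others)

print(all(df$partial_sum2 == expected))
```

In the dplyr pipeline, the same rule could be written inside the grouped mutate as `partial_sum2 = if (n() == 1) input else sum(input) - input`.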