For a project I am working on, I need to create new variables holding the cumulative sum of all the values in a list. My list has about 2000 entries (apps), each with the total downloads per day for a year. To analyse this further, I need the cumulative sums stored in new variables.
I know how to do this for a single app:
apps$App1_cum <- cumsum(apps$App1)
but doing this manually for all 2000 apps would be too much of a hassle.
Here is a small sample as an example:
apps <- list(App1 = c(23000, 15488, 45228, 48599, 46524),
             App2 = c(65465, 1435, 6848, 68466),
             App3 = c(45648, 564, 65848, 6546),
             App4 = c(654, 64689, 65433))
Generally I would use the following:
library(dplyr)
apps <- as.data.frame(apps)
apps <- apps %>%
  mutate_all(list(c = ~ cumsum(.)))
apps <- as.list(apps)
But the variables have different numbers of rows, so converting to a data frame is not possible.
I need the output to stay in the list format as it's necessary for further analysis.
I was thinking of writing a for loop, but I am not sure exactly how to do this. I would like the new variables to be named like App1_cum and to contain the cumulative sums. Can anyone help, please?
CodePudding user response:
Here is a base R way.
First compute the cumulative sums in a lapply loop. Then set the new list's names and append the result to the original list. In the end, tidy up by removing the temporary list.
tmp <- lapply(apps, cumsum)
names(tmp) <- paste("cum", names(apps), sep = "_")
apps <- append(apps, tmp)
rm(tmp)
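The question asks for names of the form App1_cum and mentions a for loop, so here is a minimal sketch of that variant using the sample data from the question. The name apps_cum is only an illustrative choice, used so that the original list is not overwritten.

```r
apps <- list(App1 = c(23000, 15488, 45228, 48599, 46524),
             App2 = c(65465, 1435, 6848, 68466),
             App3 = c(45648, 564, 65848, 6546),
             App4 = c(654, 64689, 65433))

# Start from a copy so the original list stays untouched
# (apps_cum is a hypothetical name chosen for this sketch)
apps_cum <- apps

# Add one "<name>_cum" element per app, holding its cumulative sum
for (nm in names(apps)) {
  apps_cum[[paste0(nm, "_cum")]] <- cumsum(apps[[nm]])
}

names(apps_cum)
apps_cum$App4_cum
```

The result is still a plain list: the four original vectors followed by App1_cum to App4_cum, each the same length as its source vector.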
CodePudding user response:
Using dplyr and purrr verbs, you could do:
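library(dplyr)
library(purrr)
apps %>%
  # one tibble per app: the original values plus their cumulative sum
  map(., ~ as_tibble(.) %>%
        mutate(cumsum = cumsum(.x))) %>%
  # append the app name to every column name
  imap(., function(x, y) x %>% rename_with(~ paste0(., "_", y)))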
This returns a list of tibbles, one per app, each holding the original values and their cumulative sums:
$App1
# A tibble: 5 x 2
  value_App1 cumsum_App1
       <dbl>       <dbl>
1      23000       23000
2      15488       38488
3      45228       83716
4      48599      132315
5      46524      178839

$App2
# A tibble: 4 x 2
  value_App2 cumsum_App2
       <dbl>       <dbl>
1      65465       65465
2       1435       66900
3       6848       73748
4      68466      142214

$App3
# A tibble: 4 x 2
  value_App3 cumsum_App3
       <dbl>       <dbl>
1      45648       45648
2        564       46212
3      65848      112060
4       6546      118606

$App4
# A tibble: 3 x 2
  value_App4 cumsum_App4
       <dbl>       <dbl>
1        654         654
2      64689       65343
3      65433      130776
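If the result has to stay a flat list of numeric vectors rather than a list of tibbles, a purrr-only variant may be closer to what the question describes. This is a sketch, assuming the sample data from the question; apps_cum is a hypothetical name introduced here.

```r
library(purrr)

apps <- list(App1 = c(23000, 15488, 45228, 48599, 46524),
             App2 = c(65465, 1435, 6848, 68466),
             App3 = c(45648, 564, 65848, 6546),
             App4 = c(654, 64689, 65433))

# Cumulative sum per app, renamed to App1_cum, App2_cum, ...
cums <- set_names(map(apps, cumsum), paste0(names(apps), "_cum"))

# Flat list: the original vectors followed by the cumulative ones
# (apps_cum is a hypothetical name chosen for this sketch)
apps_cum <- c(apps, cums)

names(apps_cum)
```

Because map() works element by element, the differing vector lengths are not a problem, and no conversion to a data frame is needed.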