How to easily create multiple new variables with cumulative sum from a list in R

Time:12-06

For a project I am working on, I need to create new variables holding the cumulative sum of all the values in each element of a list. My list has about 2000 entries (apps), each containing the total downloads per day for a year. To analyse this further, I need the cumulative sums stored in new variables. I know how to do this for a single app, namely apps$cum_app1 <- cumsum(apps$app1), but doing this manually for all 2000 apps would be too much of a hassle.

Here is a small sample:

apps <- list(App1 = c(23000, 15488, 45228, 48599, 46524),
             App2 = c(65465, 1435, 6848, 68466),
             App3 = c(45648, 564, 65848, 6546),
             App4 = c(654, 64689, 65433))

Generally I would use the following:

apps <- as.data.frame(apps)
apps <- apps %>% 
  mutate_all(list(c = ~ cumsum(.)))
apps <- as.list(apps)

But the list elements have different lengths, so the conversion to a data frame fails and this approach is not possible.

I need the output to stay in the list format as it's necessary for further analysis.

I was thinking of writing a for loop, but I am not sure exactly how to do it. I would like the new variables to be named like App1_cum and to contain the cumulative sums. Can anyone help me please?

CodePudding user response:

Here is a base R way.
First, compute the cumulative sums with lapply. Then set the names of the new list and append it to the original list. Finally, tidy up by removing the temporary list.

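tmp <- lapply(apps, cumsum)                          # cumulative sum of each element
names(tmp) <- paste("cum", names(apps), sep = "_")   # cum_App1, cum_App2, ...
apps <- append(apps, tmp)                            # add the new elements to the list
rm(tmp)                                              # drop the temporary list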
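
Since the question mentions a for loop and asks for names of the form App1_cum, the same idea could be sketched as an explicit loop. This is only a minimal sketch using the sample data from the question; the suffix "_cum" follows the naming requested there, and the loop variable nm is an illustrative name.

```r
# Sample data from the question
apps <- list(App1 = c(23000, 15488, 45228, 48599, 46524),
             App2 = c(65465, 1435, 6848, 68466),
             App3 = c(45648, 564, 65848, 6546),
             App4 = c(654, 64689, 65433))

# names(apps) is evaluated once, before the loop starts, so the elements
# added inside the loop are not visited again
for (nm in names(apps)) {
  apps[[paste0(nm, "_cum")]] <- cumsum(apps[[nm]])
}

names(apps)
```

The result is still a plain list: the original vectors come first, followed by one cumulative-sum vector per app. For 2000 apps the lapply version is usually more idiomatic, but the loop does the same work.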

CodePudding user response:

Using dplyr and purrr verbs, you could do:

library(dplyr)
library(purrr)

apps %>%
  map(., ~ as_tibble(.) %>% 
        mutate(cumsum = cumsum(.x))) %>% 
  imap(., function(x, y) x %>% rename_with(~ paste0(., "_", y)))

This returns a list of tibbles, one per app, each holding the original values and their cumulative sums:

$App1
# A tibble: 5 x 2
  value_App1 cumsum_App1
       <dbl>       <dbl>
1      23000       23000
2      15488       38488
3      45228       83716
4      48599      132315
5      46524      178839

$App2
# A tibble: 4 x 2
  value_App2 cumsum_App2
       <dbl>       <dbl>
1      65465       65465
2       1435       66900
3       6848       73748
4      68466      142214

$App3
# A tibble: 4 x 2
  value_App3 cumsum_App3
       <dbl>       <dbl>
1      45648       45648
2        564       46212
3      65848      112060
4       6546      118606

$App4
# A tibble: 3 x 2
  value_App4 cumsum_App4
       <dbl>       <dbl>
1        654         654
2      64689       65343
3      65433      130776
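
If the result has to stay a flat list of vectors, as the question requires, rather than a list of tibbles, a purrr variant might look like the sketch below. It assumes the sample data from the question; apps_out and the "_cum" suffix are illustrative names, not part of the answer above.

```r
library(purrr)

# Sample data from the question
apps <- list(App1 = c(23000, 15488, 45228, 48599, 46524),
             App2 = c(65465, 1435, 6848, 68466),
             App3 = c(45648, 564, 65848, 6546),
             App4 = c(654, 64689, 65433))

# Cumulative sums per app, renamed from "App1" to "App1_cum" and so on
cums <- set_names(map(apps, cumsum), function(nm) paste0(nm, "_cum"))

# Combine the original vectors and the cumulative sums into one flat list
apps_out <- c(apps, cums)

str(apps_out)
```

Here map() plays the role of lapply(), and set_names() is given a function that rewrites the existing names, so no temporary names vector is needed.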