Home > Back-end >  Using cumsum for panel data in r
Using cumsum for panel data in r

Time:07-04

I have a unbalanced panel data from a dataframe of transactions. I want to create a variable balance that sums the previously made transactions and thus reflects the balance of the wallet in a given period.

Head 40 looks like this:

period uniqueid            ethamount       date           dollvalue ispurchase isunrealgain Freq day holdingtime
1       1        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
2       2        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
3       3        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
4       4        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
5       5        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
6       6        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
7       7        1     0.29486577590755 2017-12-10 -126.56818565055676          1            0    3 641         134
8       8        1  0.12565359813650237 2018-03-29 -48.266060116193295          1            0    3 750         134
9       8        1 -0.13903051539510708 2018-04-23   89.55372588145032          0            0    3 775         134
10      9        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
11     10        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
12      1        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
13      2        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
14      3        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
15      4        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
16      5        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
17      6        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
18      7        2                0.151 2018-01-24           -160.5432          1            0    4 686         196
19      7        2                0.093 2018-02-07           -70.04574          1            0    4 700         196
20      8        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
21      9        2          -0.22047692 2018-08-08   78.36631644479999          0            0    4 882         196
22     10        2                0.087 2018-09-25 -19.056479999999997          1            0    4 930         196
23      1        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
24      2        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
25      3        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
26      4        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
27      5        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
28      6        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
29      7        3                  2.4 2017-11-30 -1037.3039999999999          1            0    3 631          35
30      7        3                   -1 2018-01-04                 940          0            0    3 666          35
31      8        3                 0.92 2018-03-04           -796.2876          1            0    3 725          35
32      9        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
33     10        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
34      1        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
35      2        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
36      3        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
37      4        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
38      5        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
39      6        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
40      7        4                 0.43 2017-11-05 -127.12089999999999          1            1    4 606         210

Simply using group_by and summarize does not work because it only outputs one value, I need however to create a variable balance that adds all the ethamount values from previous rows period by uniqueid in order to have the wallet balance for the respective period

What I am trying to do would look somethig like this, however that code does not work

paneldata2$balance <- paneldata2 %>% 
  group_by(uniqueid) %>%
  mutate(cumsum(paneldata2$ethamount)) %>%
  ungroup()

Error in `mutate_cols()`:
! Problem with `mutate()` input `..1`.
i `..1 = cumsum(paneldata2$ethamount)`.
i `..1` must be size 11 or 1, not 529476.
i The error occurred in group 1: uniqueid = 1.

Is there an alternative way of doing what I want or can this code easily be fixed?

Thanks in advance

CodePudding user response:

You should write some functions differently, maybe you want this:

library(dplyr)
paneldata2 %>% 
  group_by(uniqueid) %>%
  mutate(balance = cumsum(ethamount)) %>%
  ungroup()

Output:

# A tibble: 40 × 11
   period uniqueid ethamount date       dollvalue ispurchase isunrealgain Freq    day holdingtime balance
    <int>    <int>     <dbl> <chr>          <dbl> <chr>      <chr>        <chr> <int> <chr>         <dbl>
 1      1        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 2      2        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 3      3        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 4      4        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 5      5        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 6      6        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 7      7        1     0.295 2017-12-10    -127.  1          0            3       641 134           0.295
 8      8        1     0.126 2018-03-29     -48.3 1          0            3       750 134           0.421
 9      8        1    -0.139 2018-04-23      89.6 0          0            3       775 134           0.281
10      9        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0.281
# … with 30 more rows
  • Related