I have a unbalanced panel data from a dataframe of transactions. I want to create a variable balance that sums the previously made transactions and thus reflects the balance of the wallet in a given period.
Head 40 looks like this:
period uniqueid ethamount date dollvalue ispurchase isunrealgain Freq day holdingtime
1 1 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
2 2 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
3 3 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
4 4 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
5 5 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
6 6 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
7 7 1 0.29486577590755 2017-12-10 -126.56818565055676 1 0 3 641 134
8 8 1 0.12565359813650237 2018-03-29 -48.266060116193295 1 0 3 750 134
9 8 1 -0.13903051539510708 2018-04-23 89.55372588145032 0 0 3 775 134
10 9 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
11 10 1 0 <NA> 0 <NA> <NA> <NA> NA <NA>
12 1 2 0 <NA> 0 <NA> <NA> <NA> NA <NA>
13 2 2 0 <NA> 0 <NA> <NA> <NA> NA <NA>
14 3 2 0 <NA> 0 <NA> <NA> <NA> NA <NA>
15 4 2 0 <NA> 0 <NA> <NA> <NA> NA <NA>
16 5 2 0 <NA> 0 <NA> <NA> <NA> NA <NA>
17 6 2 0 <NA> 0 <NA> <NA> <NA> NA <NA>
18 7 2 0.151 2018-01-24 -160.5432 1 0 4 686 196
19 7 2 0.093 2018-02-07 -70.04574 1 0 4 700 196
20 8 2 0 <NA> 0 <NA> <NA> <NA> NA <NA>
21 9 2 -0.22047692 2018-08-08 78.36631644479999 0 0 4 882 196
22 10 2 0.087 2018-09-25 -19.056479999999997 1 0 4 930 196
23 1 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
24 2 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
25 3 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
26 4 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
27 5 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
28 6 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
29 7 3 2.4 2017-11-30 -1037.3039999999999 1 0 3 631 35
30 7 3 -1 2018-01-04 940 0 0 3 666 35
31 8 3 0.92 2018-03-04 -796.2876 1 0 3 725 35
32 9 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
33 10 3 0 <NA> 0 <NA> <NA> <NA> NA <NA>
34 1 4 0 <NA> 0 <NA> <NA> <NA> NA <NA>
35 2 4 0 <NA> 0 <NA> <NA> <NA> NA <NA>
36 3 4 0 <NA> 0 <NA> <NA> <NA> NA <NA>
37 4 4 0 <NA> 0 <NA> <NA> <NA> NA <NA>
38 5 4 0 <NA> 0 <NA> <NA> <NA> NA <NA>
39 6 4 0 <NA> 0 <NA> <NA> <NA> NA <NA>
40 7 4 0.43 2017-11-05 -127.12089999999999 1 1 4 606 210
Simply using group_by and summarize does not work because it only outputs one value, I need however to create a variable balance
that adds all the ethamount values from previous rows period
by uniqueid
in order to have the wallet balance for the respective period
What I am trying to do would look somethig like this, however that code does not work
paneldata2$balance <- paneldata2 %>%
group_by(uniqueid) %>%
mutate(cumsum(paneldata2$ethamount)) %>%
ungroup()
Error in `mutate_cols()`:
! Problem with `mutate()` input `..1`.
i `..1 = cumsum(paneldata2$ethamount)`.
i `..1` must be size 11 or 1, not 529476.
i The error occurred in group 1: uniqueid = 1.
Is there an alternative way of doing what I want or can this code easily be fixed?
Thanks in advance
CodePudding user response:
You should write some functions differently, maybe you want this:
library(dplyr)
paneldata2 %>%
group_by(uniqueid) %>%
mutate(balance = cumsum(ethamount)) %>%
ungroup()
Output:
# A tibble: 40 × 11
period uniqueid ethamount date dollvalue ispurchase isunrealgain Freq day holdingtime balance
<int> <int> <dbl> <chr> <dbl> <chr> <chr> <chr> <int> <chr> <dbl>
1 1 1 0 <NA> 0 <NA> <NA> <NA> NA <NA> 0
2 2 1 0 <NA> 0 <NA> <NA> <NA> NA <NA> 0
3 3 1 0 <NA> 0 <NA> <NA> <NA> NA <NA> 0
4 4 1 0 <NA> 0 <NA> <NA> <NA> NA <NA> 0
5 5 1 0 <NA> 0 <NA> <NA> <NA> NA <NA> 0
6 6 1 0 <NA> 0 <NA> <NA> <NA> NA <NA> 0
7 7 1 0.295 2017-12-10 -127. 1 0 3 641 134 0.295
8 8 1 0.126 2018-03-29 -48.3 1 0 3 750 134 0.421
9 8 1 -0.139 2018-04-23 89.6 0 0 3 775 134 0.281
10 9 1 0 <NA> 0 <NA> <NA> <NA> NA <NA> 0.281
# … with 30 more rows