I would like to sum the values of "dollvalue" per id, but only for rows where ispurchase == 1, and I could not find an efficient solution. The solutions I tried from other posts all seemed too complex and ended up not working. I also tried a plyr-style approach that combines group_by and aggregate, but I get the error "argument FUN is missing".
library(plyr)
returntrip <- roundtrips %>%
group_by(id) %>%
aggregate(purchcost = sum(dollvalue[ispurchase==1],
FUN = sum)) %>%
ungroup
I also tried to simply aggregate it, which I think almost works, but I get the following error: Error in aggregate.data.frame(as.data.frame(x), ...) : arguments must have same length
I assume this is because the list and the data frame do not have the same length. Is there any way to fix this?
returntrip <- aggregate(x = roundtrips$dollvalue[roundtrips$ispurchase==1],
by = list(roundtrips$id),
FUN = sum)
This is what the head of the data frame looks like:
ethamount dollvalue id ispurchase dollarcum
1: 0.0000877963125548729991613761125535 -0.0010491659350307322180057001403952 883 1 0.000000000000000000
2: 0.0010000000000000000208166817117217 -0.0107400000000000012817524819297432 36927 1 0.000000000000000000
3: 75.4154000000000053205440053716301918 -804.6823180000000093059497885406017303 2637 1 0.000000000000000000
4: 0.1066286798619889564232465772875003 -1.0662867986198896197436170041328296 72274 1 0.000000000000000000
5: 0.0100000000000000002081668171172169 -0.1000000000000000055511151231257827 94359 1 0.010899999999999993
6: 0.1000000000000000055511151231257827 -0.9460000000000001740829702612245455 3083 1 0.000000000000000000
7: 1.0000000000000000000000000000000000 -9.3499999999999996447286321199499071 102645 1 0.000000000000000000
8: 0.0000000000000000010000000000000001 -0.0000000000000000098900000000000005 117464 1 0.000000000000000000
9: 0.0100000000000000002081668171172169 -0.1108999999999999985789145284797996 91239 1 -0.010899999999999993
10: 12.0000000000000000000000000000000000 -144.9600000000000079580786405131220818 52894 1 0.000000000000000000
11: 14.7899999999999991473487170878797770 -207.0600000000000022737367544323205948 80993 1 0.000000000000000000
12: 55.2299999999999968736119626555591822 -689.2703999999999950887286104261875153 74580 1 0.000000000000000000
13: 0.1000000000000000055511151231257827 -1.2480000000000002202682480856310576 116147 1 0.000000000000000000
14: 1.9995590000000000863167315401369706 -37.4517400699999996049882611259818077 36943 1 0.000000000000000000
15: 0.3914821535012809605724726225162158 -5.5786206873932533412130396754946560 86862 1 0.000000000000000000
16: 0.4893235858000000160217268785345368 -6.3122742568200003177025791956111789 88279 1 0.000000000000000000
17: 0.0001392130443151549901940194908789 -0.0016510667055777380248654528926977 72433 1 0.000000000000000000
18: 0.1000000000000000055511151231257827 -1.0160000000000000142108547152020037 68487 1 0.000000000000000000
19: 0.7211898100000000422227230956195854 -8.3946493884000012997148587601259351 28354 1 0.000000000000000000
20: 0.6650000000000000355271367880050093 -8.0265500000000002955857780762016773 80397 1 0.000000000000000000
Many thanks for any type of hint or solution.
Answer:
Try the following code where you subset your data with a condition:
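library(dplyr)

# df is the data frame from the question (called roundtrips there)
df %>%
  group_by(id) %>%
  # within each id, sum dollvalue over the purchase rows only
  summarise(purchcost = sum(dollvalue[ispurchase == 1]), .groups = "drop")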
Output:
# A tibble: 20 × 2
       id purchcost
    <int>     <dbl>
 1    883 -1.05e- 3
 2   2637 -8.05e+ 2
 3   3083 -9.46e- 1
 4  28354 -8.39e+ 0
 5  36927 -1.07e- 2
 6  36943 -3.75e+ 1
 7  52894 -1.45e+ 2
 8  68487 -1.02e+ 0
 9  72274 -1.07e+ 0
10  72433 -1.65e- 3
11  74580 -6.89e+ 2
12  80397 -8.03e+ 0
13  80993 -2.07e+ 2
14  86862 -5.58e+ 0
15  88279 -6.31e+ 0
16  91239 -1.11e- 1
17  94359 -1   e- 1
18 102645 -9.35e+ 0
19 116147 -1.25e+ 0
20 117464 -9.89e-18
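As for the "arguments must have same length" error from the base R attempt in the question: the most likely cause is that x is subset to the purchase rows while by still uses every row of roundtrips$id, so the two vectors differ in length whenever some rows have ispurchase != 1. Below is a minimal sketch of one way around it, using a small made-up data frame (the values and the helper name purchases are only for illustration, not taken from the real data). It filters the rows once and takes both x and by from the filtered data; the formula interface with subset is an equivalent alternative.

```r
# Toy data standing in for roundtrips (values are made up for illustration)
roundtrips <- data.frame(
  id         = c(1, 1, 2, 2, 3),
  dollvalue  = c(-10, 4, -3, -2, 5),
  ispurchase = c(1, 0, 1, 1, 0)
)

# Filter the rows once, then take both x and by from the filtered data
# so that they are guaranteed to have the same length.
purchases <- roundtrips[roundtrips$ispurchase == 1, ]
returntrip <- aggregate(x   = purchases$dollvalue,
                        by  = list(id = purchases$id),
                        FUN = sum)

# Alternative: formula interface, where subset does the filtering
returntrip2 <- aggregate(dollvalue ~ id, data = roundtrips,
                         subset = ispurchase == 1, FUN = sum)

# id 3 has no purchase rows, so it does not appear in either result
print(returntrip)
print(returntrip2)
```

One difference to keep in mind: with aggregate, an id that has no purchase rows is dropped from the result, whereas the summarise version above keeps it with a sum of 0, because sum() of an empty numeric vector is 0.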