I am trying to create a group variable based on the cumulative sum of another variable. I want to constrain the cumulative sum: whenever it would go beyond a limit (15000000), the group variable should change and the sum should start over. Here is the code I am working on:
myDat = data.frame(Seg = c("A","B","C","D","F","G","H"),
Freq =c(4558848, 10926592, 15783936,8266496,7729349,13234562,9873456))
myDat$csum <- ceiling(ave(myDat$Freq,FUN=cumsum)/15000000)
# Seg Freq csum
# A 4558848 1
# B 10926592 2
# C 15783936 3
# D 8266496 3
# F 7729349 4
# G 13234562 5
# H 9873456 5
myDat1 <- aggregate(Freq~csum, data=myDat, FUN = sum)
# csum Freq
# 1 4558848
# 2 10926592
# 3 24050432
# 4 7729349
# 5 23108018
Some of the groups have gone beyond the 15000000 limit. Can anyone help me with this code?
# Desired Results:-
# Seg Freq csum Desired csum
# A 4558848 1 1
# B 10926592 2 2
# C 15783936 3 3
# D 8266496 3 4
# F 6229349 4 4
# G 13234562 4 5
# H 9873456 5 6
CodePudding user response:
I believe you want cumsum(Freq > 1e7).
with(myDat, aggregate(list(Freq=Freq), list(csum=cumsum(Freq > 1e7) + 1), sum))
# csum Freq
# 1 1 4558848
# 2 2 10926592
# 3 3 31779781
# 4 4 23108018
transform(myDat, csum=cumsum(Freq > 1e7) + 1)
# Seg Freq csum
# 1 A 4558848 1
# 2 B 10926592 2
# 3 C 15783936 3
# 4 D 8266496 3
# 5 F 7729349 3
# 6 G 13234562 4
# 7 H 9873456 4
Data:
myDat <- structure(list(Seg = c("A", "B", "C", "D", "F", "G", "H"), Freq = c(4558848,
10926592, 15783936, 8266496, 7729349, 13234562, 9873456)), class = "data.frame", row.names = c(NA,
-7L))
CodePudding user response:
I was able to find an answer to it, credit to the link.
library(dplyr)  # mutate(), %>%
library(purrr)  # accumulate()
myDat %>% mutate(cumsum_15 = accumulate(Freq, ~ifelse(.x + .y <= 15000000, .x + .y, .y)),
group_15 = cumsum(Freq == cumsum_15))
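For anyone who prefers base R, the same resetting-cumulative-sum idea can be sketched with a plain loop. This is only a sketch under the assumption that a new group should start whenever adding the next value would push the running total above the limit; the function name capped_groups is hypothetical and not part of any package. Note that a single value larger than the limit (such as 15783936 here) still ends up in a group of its own that exceeds the limit.

```r
# Hypothetical helper (not from any package): assign group ids so that the
# running total within a group restarts whenever it would exceed `limit`.
capped_groups <- function(x, limit) {
  group <- integer(length(x))
  running <- 0   # running total of the current group
  current <- 0L  # id of the current group
  for (i in seq_along(x)) {
    if (i == 1L || running + x[i] > limit) {
      # Adding x[i] would exceed the limit (or this is the first row):
      # open a new group and restart the running total.
      current <- current + 1L
      running <- x[i]
    } else {
      running <- running + x[i]
    }
    group[i] <- current
  }
  group
}

# Small illustrative input with a limit of 10
small_groups <- capped_groups(c(4, 5, 7, 3, 9), 10)

# The data from the question
myDat <- data.frame(Seg = c("A", "B", "C", "D", "F", "G", "H"),
                    Freq = c(4558848, 10926592, 15783936, 8266496,
                             7729349, 13234562, 9873456))
myDat$group_15 <- capped_groups(myDat$Freq, 15000000)
```

With the Freq values as posted, no two adjacent rows fit under 15000000 together, so every row gets its own group; the small input shows the grouping behaviour more clearly.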