I have the following data where e_in is exogenous giving. Ann then is an equal distribution of e_in, however, e_in can only be distributed downwards, i.e. a string (this is why 7 and 8 has ann=9 while 1 to 6 have ann=8.5)
e_in<-c(13,10,4,9,14,1,11,7)
ann<-c(8.5,8.5,8.5,8.5,8.5,8.5,9,9)
Dat_1<-data.frame(e_in,ann)
>Dat_1
e_in ann
1 13 8.5
2 10 8.5
3 4 8.5
4 9 8.5
5 14 8.5
6 1 8.5
7 11 9.0
8 7 9.0
I would now like to calculate how much e_in is available at each point down the string (shown as smn). So for 1 there is 13 e_in avabile, where 1 will take 8.5. Number 2 will then have own e_in whatever is send downwards form 1 (here 10 (13-8.5) = 14.5) and so on.
As the following:
smn<-c(13,14.5,10,10.5,16,8.5,11,9)
Dat_2<-data.frame(e_in,ann,smn)
>Dat_2
e_in ann smn
1 13 8.5 13.0
2 10 8.5 14.5
3 4 8.5 10.0
4 9 8.5 10.5
5 14 8.5 16.0
6 1 8.5 8.5
7 11 9.0 11.0
8 7 9.0 9.0
Is there any easy way/package for this sort of calculation (I have done it ‘by hand’ for this example but it becomes significantly more time consuming with bigger strings.)
CodePudding user response:
I think you just need the cumulative sum ofe_in
minus the lagged cumulative sum of ann
Dat_1$smn <- cumsum(Dat_1$e_in) - cumsum(c(0, head(Dat_1$ann, -1)))
Dat_1
# e_in ann smn
# 1 13 8.5 13.0
# 2 10 8.5 14.5
# 3 4 8.5 10.0
# 4 9 8.5 10.5
# 5 14 8.5 16.0
# 6 1 8.5 8.5
# 7 11 9.0 11.0
# 8 7 9.0 9.0
CodePudding user response:
I tried to think of a lag
& window
solution but couldn't so curious to see if anyone else managed it.
In place of that here's a loop that can do it:
Dat_1['smn'] = c(Dat_1[1, 'e_in'])
for (i in 2:nrow(Dat_1)){
Dat_1[i, 'smn'] <- Dat_1[i, 'e_in'] Dat_1[i-1, 'smn'] - Dat_1[i-1, 'ann']
}
e_in ann smn
1 13 8.5 13.0
2 10 8.5 14.5
3 4 8.5 10.0
4 9 8.5 10.5
5 14 8.5 16.0
6 1 8.5 8.5
7 11 9.0 11.0
8 7 9.0 9.0
EDIT
Just seen Allan Cameron's answer which inspired me to correct it using dplyr
Dat_1 %>%
mutate(
smn = cumsum(e_in) - cumsum(lag(ann, n = 1L, default = 0))
)
Same result