Adding numbers with condition R-CodePudding

I have the following data where e_in is exogenous giving. Ann then is an equal distribution of e_in, however, e_in can only be distributed downwards, i.e. a string (this is why 7 and 8 has ann=9 while 1 to 6 have ann=8.5)

e_in<-c(13,10,4,9,14,1,11,7)
ann<-c(8.5,8.5,8.5,8.5,8.5,8.5,9,9)
Dat_1<-data.frame(e_in,ann)
>Dat_1
  e_in ann
1   13 8.5
2   10 8.5
3    4 8.5
4    9 8.5
5   14 8.5
6    1 8.5
7   11 9.0
8    7 9.0

I would now like to calculate how much e_in is available at each point down the string (shown as smn). So for 1 there is 13 e_in avabile, where 1 will take 8.5. Number 2 will then have own e_in whatever is send downwards form 1 (here 10 (13-8.5) = 14.5) and so on.

As the following:

smn<-c(13,14.5,10,10.5,16,8.5,11,9)
Dat_2<-data.frame(e_in,ann,smn)

>Dat_2
  e_in ann  smn
1   13 8.5 13.0
2   10 8.5 14.5
3    4 8.5 10.0
4    9 8.5 10.5
5   14 8.5 16.0
6    1 8.5  8.5
7   11 9.0 11.0
8    7 9.0  9.0

Is there any easy way/package for this sort of calculation (I have done it ‘by hand’ for this example but it becomes significantly more time consuming with bigger strings.)

CodePudding user response：

I think you just need the cumulative sum ofe_in minus the lagged cumulative sum of ann

Dat_1$smn <- cumsum(Dat_1$e_in) - cumsum(c(0, head(Dat_1$ann, -1)))

Dat_1
#   e_in ann  smn
# 1   13 8.5 13.0
# 2   10 8.5 14.5
# 3    4 8.5 10.0
# 4    9 8.5 10.5
# 5   14 8.5 16.0
# 6    1 8.5  8.5
# 7   11 9.0 11.0
# 8    7 9.0  9.0

CodePudding user response：

I tried to think of a lag & window solution but couldn't so curious to see if anyone else managed it.

In place of that here's a loop that can do it:

Dat_1['smn'] = c(Dat_1[1, 'e_in'])
for (i in  2:nrow(Dat_1)){
  Dat_1[i, 'smn'] <- Dat_1[i, 'e_in']   Dat_1[i-1, 'smn'] - Dat_1[i-1, 'ann']
}

  e_in ann  smn
1   13 8.5 13.0
2   10 8.5 14.5
3    4 8.5 10.0
4    9 8.5 10.5
5   14 8.5 16.0
6    1 8.5  8.5
7   11 9.0 11.0
8    7 9.0  9.0

EDIT

Just seen Allan Cameron's answer which inspired me to correct it using dplyr

Dat_1 %>%
  mutate(
    smn = cumsum(e_in) - cumsum(lag(ann, n = 1L, default = 0))
  )

Same result