I want to populate an existing column with values that continually add onto the row above. This is easy in Excel, but I haven't figured out a good way to automate it in R.
If we had 2 columns in Excel, A and B, we want cell B2 to be = B1 + A2, and cell B3 to be = B2 + A3. How can I do this in R?
#example dataframe
df <- data.frame(A = 0:9, B = c(50,0,0,0,0,0,0,0,0,0))
#desired output
desired <- data.frame(A = 0:9, B = c(NA,51,53,56,60,65,71,78,86,95))
I tried using the lag() function, but it didn't give the correct output.
library(dplyr)
df <- df %>%
  mutate(B = B + lag(A))
So I made a for loop that works, but I feel like there's a better solution.
for(i in 2:nrow(df)){
  df$B[i] <- df$B[i-1] + df$A[i]
}
Eventually, I want to iterate this function over every n rows of the whole dataframe, essentially so the summation resets every n rows. (any tips on how to do that would be greatly appreciated!)
CodePudding user response:
cumsum() can be used to get the result you need.
df$B <- cumsum(df$B + df$A)
df
   A  B
1  0 50
2  1 51
3  2 53
4  3 56
5  4 60
6  5 65
7  6 71
8  7 78
9  8 86
10 9 95
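For the reset-every-n-rows part of the question, the same cumsum() idea can be applied per block. This is only a minimal base R sketch: it assumes n = 5 and that each block's running total starts from its own first B value plus its first A value. The block index grp is a name introduced here for illustration.

```r
df <- data.frame(A = 0:9, B = c(50, 0, 0, 0, 0, 0, 0, 0, 0, 0))
n <- 5  # assumed block size

# Block index: rows 1..n get 1, rows n+1..2n get 2, and so on
grp <- ceiling(seq_len(nrow(df)) / n)

# ave() applies cumsum() within each block, so the running total restarts
df$B <- ave(df$B + df$A, grp, FUN = cumsum)
df
```

With the example data, the first block starts from 50 because B[1] is 50, while later blocks start from their first A value because their B entries are 0.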
CodePudding user response:
This might be close to what you need, and uses tidyverse. Specifically, it uses accumulate from purrr.
If you want to reset to zero every n rows, you can also use group_by ahead of time.
It was not entirely clear how you'd like to handle the first row; here, it will just use the first B value and ignore the first A value, which looked similar to what you had in the post.
n <- 5
library(tidyverse)
df %>%
  group_by(grp = ceiling(row_number() / n)) %>%
  mutate(B = accumulate(A[-1], sum, .init = B[1]))
Output
       A     B   grp
   <int> <dbl> <dbl>
 1     0    50     1
 2     1    51     1
 3     2    53     1
 4     3    56     1
 5     4    60     1
 6     5     0     2
 7     6     6     2
 8     7    13     2
 9     8    21     2
10     9    30     2
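Since the function being accumulated is sum, the same grouped result should also be reachable with cumsum() inside mutate(), which avoids the purrr dependency. This is a sketch under the same assumptions as above (n = 5, the first B of each group is the starting value, and the first A of each group is ignored); res is just an illustrative name.

```r
library(dplyr)

df <- data.frame(A = 0:9, B = c(50, 0, 0, 0, 0, 0, 0, 0, 0, 0))
n <- 5  # assumed block size

res <- df %>%
  group_by(grp = ceiling(row_number() / n)) %>%
  # Swap each group's first A for its first B, then take a running sum
  mutate(B = cumsum(c(B[1], A[-1]))) %>%
  ungroup()
res
```

If the first A of each group should count as well, cumsum(B + A) within the grouped mutate() would be the variant to try instead.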