Let's say I have the following dataframe:
   A  B  C
1 15 14 12
2  7  1  6
3  8 22  5
4 11  5  1
5  4 12  4
For each column, I want to calculate the difference between consecutive rows and then divide that difference by the value in the previous row.
The result would be something like this:
   A  B  C   A_r   B_r   C_r
1 15 14 12    NA    NA    NA
2  7  1  6 -0.53 -0.93 -0.50
3  8 22  5  0.14    21 -0.17
4 11  5  1   ...   ...   ...
5  4 12  4   ...   ...   ...
The general formula would be:
R(n) = [S(n) - S(n-1)] / S(n-1)
Where R is the newly calculated variable, S is the existing variable it is calculated from (A, B, or C in this example), and n is the row number.
I know I can use the diff function to calculate the difference, but I don't know how to divide that difference by the values of the previous rows.
CodePudding user response:
We can use across with lag: loop across all the columns (everything()), apply the formula, and create new columns by modifying .names, i.e. adding the suffix _r to the corresponding column names ({.col}).
library(dplyr)
df1 <- df1 %>%
  mutate(across(everything(), ~ (. - lag(.)) / lag(.),
                .names = "{.col}_r"))
-output
df1
   A  B  C        A_r        B_r        C_r
1 15 14 12         NA         NA         NA
2  7  1  6 -0.5333333 -0.9285714 -0.5000000
3  8 22  5  0.1428571 21.0000000 -0.1666667
4 11  5  1  0.3750000 -0.7727273 -0.8000000
5  4 12  4 -0.6363636  1.4000000  3.0000000
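To see what lag contributes to the formula, here is a minimal single-column sketch in base R. It assumes column A from the question and mimics the one-row shift of dplyr's lag (default n = 1) with c(NA, head(A, -1)); the names prev and A_r are only illustrative.

```r
# Column A from the question's data frame
A <- c(15, 7, 8, 11, 4)

# Shift the values down by one row, padding the first row with NA.
# This mirrors what dplyr::lag(A) returns with its default n = 1.
prev <- c(NA, head(A, -1))

# R(n) = [S(n) - S(n-1)] / S(n-1)
A_r <- (A - prev) / prev

round(A_r, 7)
# NA, -0.5333333, 0.1428571, 0.3750000, -0.6363636
```

The first element is NA because the first row has no previous row to compare against.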
Or use base R with diff:
df1[paste0(names(df1), "_r")] <- rbind(NA, diff(as.matrix(df1))) /
  rbind(NA, df1[-nrow(df1), ])
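A per-column variant of the diff idea can be sketched as follows. This is only one possible way to write it: rel_change is a hypothetical helper name, and df1 is rebuilt here from the values in the question so the snippet runs on its own.

```r
# Rebuild the example data from the question
df1 <- data.frame(A = c(15, 7, 8, 11, 4),
                  B = c(14, 1, 22, 5, 12),
                  C = c(12, 6, 5, 1, 4))

# Hypothetical helper: diff(x) gives S(n) - S(n-1) for rows 2..N,
# head(x, -1) gives the matching S(n-1); NA pads the first row.
rel_change <- function(x) c(NA, diff(x) / head(x, -1))

# Apply the helper to every column and store the results as A_r, B_r, C_r
df1[paste0(names(df1), "_r")] <- lapply(df1, rel_change)
df1
```

Dividing by head(x, -1) answers the original sticking point directly: it drops the last value, so each difference lines up with the value of the row before it.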