I am trying to find Percent change to 2 successive values.
df = data.frame(k = c(2,3,4,5,6,7,8,9,10,11,12,13,14,15),
rmse = c(61.41,58.7,56.84,55.6,54.61,53.98,53.5,53.04,52.66,52.39,52.19,52.04,51.9,51.8))
I want another column with percent change calculated for two succesive values. For example: The percent change for K= 4 should be calculated with K =2, similarly K = 5 with K= 3 and so on.
I hope i am clear. Anybody got an idea on how to solve this?
CodePudding user response:
You can use lag
from dplyr
. Obviously your first two values will be NA
since you don't have a k = 0 or k = -1 to assess the percentage change for these.
df %>% mutate(new_col = scales::percent(lag(rmse, 2)/rmse))
#> k rmse new_col
#> 1 2 61.41 <NA>
#> 2 3 58.70 <NA>
#> 3 4 56.84 108.040%
#> 4 5 55.60 105.576%
#> 5 6 54.61 104.084%
#> 6 7 53.98 103.001%
#> 7 8 53.50 102.075%
#> 8 9 53.04 101.772%
#> 9 10 52.66 101.595%
#> 10 11 52.39 101.241%
#> 11 12 52.19 100.901%
#> 12 13 52.04 100.673%
#> 13 14 51.90 100.559%
#> 14 15 51.80 100.463%
Created on 2022-04-18 by the reprex package (v2.0.1)
CodePudding user response:
In case percent change for k4 is calculated with (k4 - k2) / k4 diff
with a lag of 2 could be used:
transform(df, change = c(NA, NA, diff(df$rmse, 2)) / df$rmse * 100)
# k rmse change
#1 2 61.41 NA
#2 3 58.70 NA
#3 4 56.84 -8.0401126
#4 5 55.60 -5.5755396
#5 6 54.61 -4.0835012
#6 7 53.98 -3.0011115
#7 8 53.50 -2.0747664
#8 9 53.04 -1.7722474
#9 10 52.66 -1.5951386
#10 11 52.39 -1.2406948
#11 12 52.19 -0.9005557
#12 13 52.04 -0.6725596
#13 14 51.90 -0.5587669
#14 15 51.80 -0.4633205
In case with (k4 - k2) / k2:
transform(df, change = c(NA, NA, diff(df$rmse, 2) / head(df$rmse, -2) * 100))
# k rmse change
#1 2 61.41 NA
#2 3 58.70 NA
#3 4 56.84 -7.4417847
#4 5 55.60 -5.2810903
#5 6 54.61 -3.9232935
#6 7 53.98 -2.9136691
#7 8 53.50 -2.0325948
#8 9 53.04 -1.7413857
#9 10 52.66 -1.5700935
#10 11 52.39 -1.2254902
#11 12 52.19 -0.8925180
#12 13 52.04 -0.6680664
#13 14 51.90 -0.5556620
#14 15 51.80 -0.4611837