How to do rolling percent change on non-time series in R?-CodePudding

I have a dataframe with the following sample:

df = data.frame(x1 = c(2000a,2010a,2000b,2010b,2000c,2010c), 
x2 = c(1,2,3,4,5,6))

I am trying to find a way to calculate the percent change for each "group" (a,b,c) using the change() function. Below is my attempt:

percent_change = change(df,x2, NewVar = "percent_change", slideBy = 1,type = 'percent')

where slideBy is the lag variable that restarts the percent change calculation every other observation. This does not work, and I get the following error:

" Remember to put data in time order before running.

Leading total_units by 1 time units."

Would it be possible to adapt my x1 column to a time series or is there an easier way around this I am missing?

Thank you!

CodePudding user response：

This uses the data.table structure from the data.table package. First it sorts on x1, then does a row by row calculation of the percent change, grouping by the letter in x1.

library(data.table)
setDT(df)

df[order(x1), 
   100*x2/shift(x2,1L), 
   keyby=gsub("[0-9]","",x1)]

CodePudding user response：

Here is a tidyverse way to do this. First, use extract to separate x1 into year and group, then pivot_wider on the table. Now you can use mutate to create the percent change row.

library(dplyr)
library(tidyr)

df = data.frame(x1 = c("2000a","2010a","2000b","2010b","2000c","2010c"),x2 = c(1,2,3,4,5,6))

df_new = df %>%
  extract(x1, c("year", "group"),regex="(\\d{4})(\\D{1})") %>%
  pivot_wider(names_from = year, values_from=x2) %>%
  mutate(percent_change=(`2010`-`2000`)/`2000`)