Avoid for loop using data.table

Time:10-13

I have a simulation over time (dev_quarter), stored as a data.table, that looks like this:

library(data.table)
simulation <- data.table(`Scenario ID` = 1, dev_quarter = 1:80, brand = 1, proportion = runif(80))

There are n_scenario scenarios and n_brand brands, and each row holds a proportion.

I am trying to code the following: for each scenario and each brand, compute the difference in proportion between the beginning and the end of the year, for each year.

I built the following lookup table to recover the corresponding dev_quarter values for each year:

x <- 2002:2021
lookup_T <- as.integer(format(Sys.Date(), "%Y"))
lookup_period <- data.table(years = lookup_T-x+1, quarters_t = (lookup_T-x+1)*4, quarters_t1 = (lookup_T-x+2)*4)

With a small example:

n_scenario <- 1
n_brand <- 10

Here is some ugly code that uses for loops:

result <- data.table(`Scenario ID` = numeric(), years = numeric(), brand = numeric(), proportion = numeric())

for(i in 1:n_scenario){
  for(j in 1:n_brand){
    
    prop_per_year <- c()
    # for each year
    for(k in 1:length(x)){
      
      year <- lookup_period[k, ]
      quarter_start_year <- year[["quarters_t"]]
      quarter_end_year <- year[["quarters_t1"]]
      
      end_year_prop <- simulation[`Scenario ID`==i & brand==j & dev_quarter==quarter_end_year]
      start_year_prop <- simulation[`Scenario ID`==i & brand==j & dev_quarter==quarter_start_year]
      
      prop_this_year <- max(end_year_prop[["proportion"]] - start_year_prop[["proportion"]], 0)
      
      prop_per_year <- append(prop_per_year, prop_this_year)
    }
    
    result_temp <- data.table(`Scenario ID` = i, years = x, brand = j, proportion = prop_per_year)
    
    result <- rbind(result, result_temp)

  }
}

I considered filtering my data.table to keep only the rows where dev_quarter is a multiple of 4, but that does not remove the need for the for loops. How can I avoid them using data.table?
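For illustration, here is a rough sketch of what that filtering idea could look like without loops. It assumes the intent is to compare each multiple-of-4 quarter with the next one, as the loop does with quarters_t and quarters_t1. The names year_ends and change are made up for this example, and the data is random.

```r
library(data.table)

set.seed(1)  # made-up data, for reproducibility only
simulation <- data.table(`Scenario ID` = 1, dev_quarter = 1:80,
                         brand = 1, proportion = runif(80))

# Keep only the quarters that are multiples of 4
year_ends <- simulation[dev_quarter %% 4 == 0]
setorder(year_ends, `Scenario ID`, brand, dev_quarter)

# Within each scenario and brand, compare each kept quarter with the next one,
# clamping negative differences to 0 as max(..., 0) does in the loop.
# The last kept quarter of each group has no successor and gets NA.
year_ends[, change := pmax(shift(proportion, type = "lead") - proportion, 0),
          by = .(`Scenario ID`, brand)]
```

The rows of year_ends could then be matched to calendar years through a lookup such as lookup_period.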

Thanks.

CodePudding user response:

The absolute change in proportion between the 4th and 1st quarter can be calculated much more easily.

simulation[, year := 2002 + (dev_quarter-1) %/% 4]  # Easier way to calculate the year
simulation[, .(change = last(proportion) - first(proportion)), by = c("Scenario ID", "brand", "year")]
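For anyone who wants to try this end to end, here is a minimal self-contained sketch of the grouped approach. It assumes, as above, that dev_quarter 1 is the first quarter of 2002, and it uses made-up data with two brands. The pmax(..., 0) clamp and the name changes are additions for illustration, mirroring the max(..., 0) in the question's loop; drop the clamp if negative changes should be kept.

```r
library(data.table)

set.seed(42)  # made-up data, for reproducibility only
simulation <- data.table(`Scenario ID` = 1,
                         brand = rep(1:2, each = 80),
                         dev_quarter = rep(1:80, times = 2),
                         proportion = runif(160))

# Quarters 1-4 map to 2002, quarters 5-8 to 2003, and so on
simulation[, year := 2002 + (dev_quarter - 1) %/% 4]

# One row per scenario, brand and year: last quarter minus first quarter,
# with rows ordered by quarter so first() and last() are well defined
changes <- simulation[order(dev_quarter),
                      .(change = pmax(last(proportion) - first(proportion), 0)),
                      by = c("Scenario ID", "brand", "year")]
```

Note that this compares the first and last quarter inside each year, which is not quite the same pair of quarters the loop in the question looks up.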