Dynamically creating nested FOR loops based on the list length in R

Time:11-08

Suppose I have a dataset dt like this:

meta_cat cat sku price sales
bakery bread 796590 22.6 24
bakery bread 796595 19.8 20
bakery doughnut 796588 30.6 36
bakery sandwich 796640 45.9 42
bakery sandwich 796643 43.3 45
fruits feijoa 645342 97.2 5
fruits orange 645675 35.7 78
fruits orange 645677 43.9 65
fruits feijoa 645342 92.9 11

Also, I have a list which looks like this, for example:

lvl_list <- list(c("meta_cat"),   
                 c("cat"))

I don’t know in advance how many levels the list will have: its length can be 0 (an empty list), 1, 2, 3, and so on (in this example there are two levels). The list values correspond to column names in the dataset.

My task is to run nested for loops, with the nesting depth given by the length of the list.

If the list is empty, the loop does not start and the main code is executed.
If the list length = 1, there should be 1 for loop like this:

for(i in unique(dt[[lvl_list[[1]]]])){
     dt_i <- dt[get(lvl_list[[1]]) == i, ] # make subset; dt itself stays intact for the next iteration
     # run main code on dt_i
     # .
     # .
     # main code
   }

So, at the first iteration, we filter the dt by the first unique value of the meta_cat column (for example, choose only records where meta_cat = "bakery") and run main code on this dt.

If the length of the list = 2, we should get 2 for loops:

for(i in unique(dt[[lvl_list[[1]]]])){
     dt_i <- dt[get(lvl_list[[1]]) == i, ] # filter dt by the first level

     for(j in unique(dt_i[[lvl_list[[2]]]])){
       dt_ij <- dt_i[get(lvl_list[[2]]) == j, ] # filter again by the second level
       # run main code on dt_ij
       #  .
       #  .
       # main code
     }
   }

So, here we filter dt by the values of two columns. There are two unique values for the meta_cat variable and 5 unique values for the cat variable.
The logic of code execution should be as follows: at the first iteration of the outer loop, we filter dt by the first value of meta_cat (keeping the observations where meta_cat = "bakery"); at the first iteration of the inner loop, we filter by the first value of cat (keeping the observations where cat = "bread"). So we obtain dt where meta_cat = "bakery" and cat = "bread", and this filtered dt is used as input for the modelling code. On the second iteration, the original dt is filtered by meta_cat = "bakery" and cat = "doughnut". Then the main code is executed for this dt, and so on.

If there are 3 levels in the list, we should have 3 for loops, etc.
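The nesting described above can be sketched with recursion, where each recursive call plays the role of one for loop. This is only a minimal illustration under some assumptions: it uses a small base R data.frame standing in for dt, the helper name run_nested is hypothetical, and the "main code" is replaced by a placeholder that sums sales.

```r
# Small data.frame standing in for dt (base R only, so the sketch is self-contained)
dt <- data.frame(
  meta_cat = c("bakery", "bakery", "bakery", "fruits", "fruits"),
  cat      = c("bread", "bread", "doughnut", "feijoa", "orange"),
  sales    = c(24, 20, 36, 5, 78),
  stringsAsFactors = FALSE
)
lvl_list <- list(c("meta_cat"), c("cat"))

# Hypothetical helper: one level of recursion corresponds to one for loop
run_nested <- function(data, cols, main_code) {
  if (length(cols) == 0) {
    # No levels left: this is the body of the innermost loop
    return(list(main_code(data)))
  }
  out <- list()
  for (value in unique(data[[cols[1]]])) {
    # Subset into a new object so the caller's data is not overwritten
    subset_i <- data[data[[cols[1]]] == value, , drop = FALSE]
    out <- c(out, run_nested(subset_i, cols[-1], main_code))
  }
  out
}

# Placeholder for the main code: total sales in each subset
res <- run_nested(dt, unlist(lvl_list), function(d) sum(d$sales))
```

With an empty list, unlist(lvl_list) has length 0, so the placeholder runs once on the full data, which matches the "loop does not start" case.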

My question: is it possible to create nested for loops dynamically, based on the list length?
I would be grateful for any help on how this can be implemented.

CodePudding user response:

It may be easier with split (shown here for the two-level example):

lst1 <- lapply(split(dt, dt[[lvl_list[[1]]]]), function(x) 
         split(x, x[[lvl_list[[2]]]]))
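The two-level split above is hard-coded to lvl_list[[1]] and lvl_list[[2]]. One way to make it independent of the list length is to split on all grouping columns at once, which gives a flat list with one element per existing combination. The following is a sketch under assumptions: it uses a base R data.frame standing in for dt, and the summing of sales is only a placeholder for the main code. For a data.table, its own split method with a by argument (split(dt, by = cols)) should give a similar flat list, but that is not shown here.

```r
# Small data.frame standing in for dt (base R only)
dt <- data.frame(
  meta_cat = c("bakery", "bakery", "bakery", "fruits", "fruits"),
  cat      = c("bread", "bread", "doughnut", "feijoa", "orange"),
  sales    = c(24, 20, 36, 5, 78),
  stringsAsFactors = FALSE
)
lvl_list <- list(c("meta_cat"), c("cat"))

cols <- unlist(lvl_list)  # NULL when lvl_list is empty

groups <- if (length(cols) == 0) {
  list(all = dt)                     # no levels: one group holding everything
} else {
  split(dt, dt[cols], drop = TRUE)   # drop = TRUE skips combinations with no rows
}

# A single loop over the groups replaces the nested loops
totals <- list()
for (nm in names(groups)) {
  sub_dt <- groups[[nm]]
  totals[[nm]] <- sum(sub_dt$sales)  # placeholder for the main code
}
```

The group names are built from the column values joined by a dot (for example "bakery.bread"), so they can be used to tell which subset a result belongs to.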

Also, as this is a recursive split, you can use rsplit from collapse, which by default splits recursively and returns a nested list:

library(collapse)
lst2 <- rsplit(dt, by = dt[, unlist(lvl_list), with = FALSE])

data

dt <- structure(list(meta_cat = c("bakery", "bakery", "bakery", "bakery", 
"bakery", "fruits", "fruits", "fruits", "fruits"), cat = c("bread", 
"bread", "doughnut", "sandwich", "sandwich", "feijoa", "orange", 
"orange", "feijoa"), sku = c(796590L, 796595L, 796588L, 796640L, 
796643L, 645342L, 645675L, 645677L, 645342L), price = c(22.6, 
19.8, 30.6, 45.9, 43.3, 97.2, 35.7, 43.9, 92.9), sales = c(24L, 
20L, 36L, 42L, 45L, 5L, 78L, 65L, 11L)), row.names = c(NA, -9L
), class = c("data.table", "data.frame"))