Whenever I have several datasets that I want to load and combine into a single dataset, so that I can analyse all of them at the same time, I do something like:
for(i in 1:length(dataset_paths))
{
  data_path <- list.files(path = paste0(here(dataset_paths[i]), "results/analysis/"), pattern = ".*\\.degradation_dynamics\\.csv", full.names = TRUE)
  t_dt <- fread(data_path)
  if(i == 1)
  {
    # first dataset: start the combined table
    dt <- t_dt
  } else {
    # later datasets: append (avoids binding the first dataset twice)
    dt <- rbind(dt, t_dt)
  }
}
rm(t_dt)
dt %>% setkey(frame)
This code is quite ugly.
Is there a way to make this code shorter and easier to understand?
For instance by
- Getting rid of the if condition inside the loop
- Getting rid of the 1:length(dataset_paths) expression
- Getting rid of the temporary variable t_dt
CodePudding user response:
Be aware that the following code is not tested, but using lapply you could do something like:
dt <- lapply(dataset_paths, function(x) {
  data_path <- list.files(path = here(x, "results", "analysis"), pattern = ".*\\.degradation_dynamics\\.csv", full.names = TRUE)
  fread(data_path)
})
dt <- do.call(rbind, dt)
dt %>% setkey(frame)
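Since the files are read with fread, data.table's own rbindlist can do the stacking instead of do.call(rbind, ...). The following is only a minimal sketch under some assumptions: the two temporary folders and the file name run.degradation_dynamics.csv are made up so that the example runs on its own, and the CSVs sit directly in each dataset folder rather than under results/analysis as in the question.

```r
library(data.table)

# Hypothetical stand-in data: two temporary folders, each holding one CSV
# named like the files in the question.
dataset_paths <- file.path(tempdir(), c("dataset_a", "dataset_b"))
for (p in dataset_paths) {
  dir.create(p, recursive = TRUE, showWarnings = FALSE)
  fwrite(data.table(frame = 1:3, value = c(0.1, 0.2, 0.3)),
         file.path(p, "run.degradation_dynamics.csv"))
}

# list.files accepts a vector of directories, so no loop is needed here.
csv_files <- list.files(path = dataset_paths,
                        pattern = "\\.degradation_dynamics\\.csv$",
                        full.names = TRUE)

# Read every file and stack the resulting tables in one step.
dt <- rbindlist(lapply(csv_files, fread))
setkey(dt, frame)
```

This removes the index, the if condition and the temporary variable. If the files do not all share the same columns, rbindlist also has use.names and fill arguments that may help.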
CodePudding user response:
I like foreach for this purpose:
require(data.table)
require(foreach)
# Assumes each element of dataset_paths points directly at a CSV file
dt <- foreach(path = dataset_paths, .combine = 'rbind') %do% {
  fread(path)
}
setkey(dt, frame)