Loop among various dataframes in R

Time:12-02

I have several dataframes (df1, df2, ..., df1000). All of them are in a nested list format and have exactly the same structure. Accessing the data for each dataframe is straightforward; I just need to run the following line of code:

df1_data <- df1$CompactData$DataSet$Series

Since I have nearly a thousand dataframes, I have created a list with the names of these dfs to use in a loop that would let me obtain the desired data as in the example above. However, I have not been able to find a solution.

list_of_names <- list(df1, df2, ..., df1000)

for (df in length(list_of_names)) {
  list_of_names[[df]] = list_of_names[[df]]$CompactData$DataSet$Series
}

Essentially, what I would like to achieve is to create new dataframes with the data that I really need.

Any help will be much appreciated.

Thanks in advance!

CodePudding user response:

This is a bit messy, but if list_of_names holds the object names as character strings (for example "df1"), you could use eval(parse()).

for(df in 1:length(list_of_names)) {
  list_of_names[[df]] = eval(parse(text = paste0(list_of_names[[df]], '$CompactData$DataSet$Series')))
}
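A possible alternative to building and parsing code strings is get(), which looks an object up by its name. The sketch below assumes two toy objects with made-up values and a hypothetical character vector df_names; it is an illustration, not a drop-in replacement for the real data.

```r
# Toy objects standing in for df1 and df2 (made-up values)
df1 <- list(CompactData = list(DataSet = list(Series = data.frame(value = 1))))
df2 <- list(CompactData = list(DataSet = list(Series = data.frame(value = 2))))

# Hypothetical character vector of object names
df_names <- c("df1", "df2")

series_list <- list()
for (nm in df_names) {
  # get() looks the object up by name; no code string is built or parsed
  series_list[[nm]] <- get(nm)$CompactData$DataSet$Series
}
```

Indexing series_list by name keeps each extracted piece tied to the object it came from.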

Alternatively, if you put your dataframes themselves in a list, instead of just their names, you could use lapply() or purrr::map().

# given list of data.frames, called df_list
df_sub <- lapply(df_list, function(x) x$CompactData$DataSet$Series)

df_sub will then contain your desired subset of all data.frames in df_list. Advantages to this approach are that it is easier to manage a list of data.frames, and the code is easier to read.
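As a minimal sketch of the lapply() approach, assume toy nested lists that mimic the CompactData$DataSet$Series structure. The make_df helper and the values are made up for illustration only.

```r
# Hypothetical helper: builds a nested list shaped like the objects in the question
make_df <- function(values) {
  list(CompactData = list(DataSet = list(Series = data.frame(value = values))))
}

# Named list of toy objects standing in for df1, df2, ...
df_list <- list(df1 = make_df(1:3), df2 = make_df(4:6))

# Pull the Series element out of every list entry
df_sub <- lapply(df_list, function(x) x$CompactData$DataSet$Series)

names(df_sub)      # names carry over from df_list
df_sub$df2$value   # the extracted data for df2
```

Because lapply() keeps the names of its input, a named df_list makes it easy to tell which result belongs to which original object.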

CodePudding user response:

If I'm understanding your data correctly, I would first make a list of all your sequentially named objects df1, df2, df3, ...

start_list = mget(paste0("df", 1:1000))

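Then something like your loop should work. I've changed two things: (a) added 1: so that the loop set-up reads 1:length(...), which iterates over every index instead of only the last one, and (b) read from the input list while writing to a separate result list: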

result = list()
for (df in 1:length(start_list)) {
  result[[df]] = start_list[[df]]$CompactData$DataSet$Series
}
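To see the whole pattern run end to end, here is a small sketch with three toy objects (made-up values) in place of the real df1 to df1000. It swaps in seq_along() for 1:length(), which behaves the same here and also handles an empty list; the "_data" suffix on the result names is only an assumption about how you might want to label the output.

```r
# Toy objects standing in for df1, df2, df3 (made-up values)
df1 <- list(CompactData = list(DataSet = list(Series = data.frame(value = 1))))
df2 <- list(CompactData = list(DataSet = list(Series = data.frame(value = 2))))
df3 <- list(CompactData = list(DataSet = list(Series = data.frame(value = 3))))

# Collect the objects by name; mget() returns a named list
start_list <- mget(paste0("df", 1:3))

# length() alone is a single number, so for (df in length(x)) runs only once
length(start_list)
# seq_along() gives every index of the list
seq_along(start_list)

result <- list()
for (i in seq_along(start_list)) {
  result[[i]] <- start_list[[i]]$CompactData$DataSet$Series
}

# Label each extracted piece after its source object
names(result) <- paste0(names(start_list), "_data")
```

After this, result[["df1_data"]] holds the same thing as df1$CompactData$DataSet$Series.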

If you need more help than this, please share a reproducible sample of your data. That can admittedly be difficult with nested list structures, but the output of str(df1) would be a good start.
