Looping over various lists of data frames to create a variable in each data frame

Time:12-23

I have four lists, each of which contains 12 data frames. Something like this:

for (i in 1:10) {
  assign(paste0("df", i), data.frame(c=c(1,2,3), d=c(1,2,3),
                                     e=c(1,2,3), f=c(1,2,3),
                                     g=c(1,2,3), h=c(1,2,3),
                                     i=c(1,2,3), j=c(1,2,3),
                                     k=c(1,2,3), l=c(1,2,3)))
}

for (i in 1:4) {
  assign(paste0("list_", i), lapply(ls(pattern="df"), get))
}

rm(list=ls(pattern="df"))

Each list corresponds to a year, and each of its elements (the data frames) corresponds to a month. Conveniently, the position of each data frame in its list equals its month number. So, in the first list, the first data frame corresponds to January 2020, the second to February 2020, and so on. In the second list, the first data frame corresponds to January 2021, the second to February 2021, and so on.

What I need to do is to create a new variable that indicates the month of each data frame.

I have been trying different things, including this:

for(j in 20:21) {
  for(i in 1:12) {
    assign(get(paste0("df_20", j))[[i]],
           get(paste0("df_20", j))[[i]] %>%
           mutate(month=i)) ## 1 is added to the month because it starts from February
  }
}

But nothing works. The problem seems to be the left-hand side of the assignment. When I use the get() function, R returns an error ("Error in assign(get(paste0("mies_20", j))[[1]], get(paste0("mies_20", : invalid first argument"). If I leave get() out, paste0("df_20", j)[[i]] does not recognize the "[[i]]".

Any ideas?

CodePudding user response:

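assign() expects a variable name (a character string) as its first argument, so passing it a data frame fails with "invalid first argument". Instead, get() each list, iterate over its data frames using lapply(), then assign() the whole list back to the environment. It's also best to use something other than i for your iteration variable, since there's also a column i in your data frames.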

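library(dplyr)

for (j in 1:4) {
  # Fetch the list by name, e.g. "list_1"
  list_j <- get(paste0("list_", j))
  # The position of each data frame in the list is its month number
  list_j <- lapply(
    seq_along(list_j),
    \(mnth) mutate(list_j[[mnth]], month = mnth)  # \(x) needs R >= 4.1
  )
  # Write the modified list back under the same name
  assign(paste0("list_", j), list_j)
}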

list_1
[[1]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     1
2 2 2 2 2 2 2 2 2 2 2     1
3 3 3 3 3 3 3 3 3 3 3     1

[[2]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     2
2 2 2 2 2 2 2 2 2 2 2     2
3 3 3 3 3 3 3 3 3 3 3     2

[[3]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     3
2 2 2 2 2 2 2 2 2 2 2     3
3 3 3 3 3 3 3 3 3 3 3     3

[[4]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     4
2 2 2 2 2 2 2 2 2 2 2     4
3 3 3 3 3 3 3 3 3 3 3     4

[[5]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     5
2 2 2 2 2 2 2 2 2 2 2     5
3 3 3 3 3 3 3 3 3 3 3     5

[[6]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     6
2 2 2 2 2 2 2 2 2 2 2     6
3 3 3 3 3 3 3 3 3 3 3     6

[[7]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     7
2 2 2 2 2 2 2 2 2 2 2     7
3 3 3 3 3 3 3 3 3 3 3     7

[[8]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     8
2 2 2 2 2 2 2 2 2 2 2     8
3 3 3 3 3 3 3 3 3 3 3     8

[[9]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1     9
2 2 2 2 2 2 2 2 2 2 2     9
3 3 3 3 3 3 3 3 3 3 3     9

[[10]]
  c d e f g h i j k l month
1 1 1 1 1 1 1 1 1 1 1    10
2 2 2 2 2 2 2 2 2 2 2    10
3 3 3 3 3 3 3 3 3 3 3    10
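
If you have control over how the lists are created, it may be simpler to avoid get()/assign() altogether and keep the yearly lists inside one outer list named by year. What follows is only a minimal base-R sketch under that assumption: make_year() and years are hypothetical names, the data is a small stand-in (two years, three months each), and the year column is an extra that the question did not ask for.

```r
# Hypothetical stand-in data: each "year" is a list of monthly data frames.
make_year <- function(n_months) {
  lapply(seq_len(n_months), function(m) data.frame(c = 1:3, d = 1:3))
}

# Outer list named by year (assumed layout, not the one in the question).
years <- list(`2020` = make_year(3), `2021` = make_year(3))

# For every year, walk over its data frames together with their positions
# and store the position as the month and the list name as the year.
years <- Map(
  function(dfs, yr) {
    Map(
      function(df, mnth) {
        df$month <- mnth
        df$year <- as.integer(yr)
        df
      },
      dfs, seq_along(dfs)
    )
  },
  years, names(years)
)

# Second data frame of the 2021 list, i.e. February 2021
years[["2021"]][[2]]
```

With the lists from the question, something like mget(paste0("list_", 1:4)) could probably be used to build the outer list, after which its names would need to be set to the actual years.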