Add ID column to a list of data frames

Time:02-18

I have a list of 142 data frames, file_content, and a list of IDs created with id_list <- list(as.character(1:length(file_content))).

I am trying to add a new column period to each data frame in file_content.

All data frames are similar to 2021-03-16 below.

`2021-03-16` <- file_content[[1]] # take a look at 1/142 dataframes in file_content

head(`2021-03-16`)
     author_id                created_at           id                                                                                           tweet
1 3.304380e+09 2018-12-01 22:58:55+00:00 1.069003e+18                                                     @Acosta I hope he didn’t really say “muck”.
2 5.291559e+08 2018-12-01 22:57:31+00:00 1.069003e+18      @Acosta I like Mattis, but why does he only speak this way when Individual-1 isn't around?
3 2.195313e+09 2018-12-01 22:56:41+00:00 1.069002e+18 @Acosta What did Mattis say about the informal conversation between Trump and Putin at the G20?
4 3.704188e+07 2018-12-01 22:56:41+00:00 1.069002e+18                                                           @Acosta Good! Tree huggers be damned!
5 1.068995e+18 2018-12-01 22:56:11+00:00 1.069002e+18                                                                            @Acosta @NinerMBA_01
6 9.983321e+17 2018-12-01 22:55:13+00:00 1.069002e+18                                                                                 @Acosta Really?

I have tried to add the period column using the following code but it adds all 142 values from the id_list to every row in every data frame in file_content.

for (id in length(id_list)) {
  file_content <- lapply(file_content, function(x) { x$period <- paste(id_list[id], sep = "_"); x }) 
}

CodePudding user response:

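We may use imap from purrr, which passes each list element as .x and its name (or its position, if the list is unnamed) as .y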

library(purrr)
library(dplyr)
imap(file_content, ~ .x %>% 
      mutate(period = .y))

Or with Map from base R (this assumes file_content is a named list; names() returns NULL otherwise)

Map(cbind, file_content, period = names(file_content))
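If file_content has no names, the Map call above has nothing to recycle as period. A minimal sketch that falls back to positional indices could look like this; dfs and ids are hypothetical names, and the toy data frames only stand in for the real file_content.

```r
# Toy stand-in for file_content: three small, unnamed data frames (hypothetical data).
dfs <- replicate(3, data.frame(x = 1:2, y = c("a", "b")), simplify = FALSE)

# names() is NULL for an unnamed list, so fall back to positional indices.
ids <- if (is.null(names(dfs))) seq_along(dfs) else names(dfs)

# Map() pairs dfs[[i]] with ids[i]; cbind() recycles the single id down the rows.
with_period <- Map(cbind, dfs, period = ids)

with_period[[2]]
```

Each data frame ends up with a constant period column holding its own index (or its name, when the list is named).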

In the OP's code, id_list is a list with a single element, because wrapping a vector in list() stores the whole vector as one element, i.e. compare

list(1:5)

vs

as.list(1:5)
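The difference is easiest to see by checking lengths. This small sketch uses the hypothetical names wrapped and per_value:

```r
wrapped   <- list(1:5)     # one element that holds the whole vector
per_value <- as.list(1:5)  # five elements, one per value

length(wrapped)    # 1
length(per_value)  # 5

# Single brackets keep the list wrapper; double brackets extract the contents.
class(wrapped[1])    # "list"
class(wrapped[[1]])  # "integer"
```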

Here, there is no need to convert to a list; a plain vector is enough

id_list <- seq_along(file_content)

Also, the for loop iterates over a single value, namely length(id_list), rather than over every index

for (id in length(id_list)) {
            ^^

Instead, it should loop over 1:length(id_list) or, more safely, seq_along(id_list). In addition, the assignment should be made to the single list element file_content[[id]], not to the entire list

for (id in seq_along(id_list)) {
  file_content[[id]]$period <- id_list[id]
}
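To see the iteration difference in isolation, the following sketch records which indices each loop visits. The names ids, visited_wrong, visited_right and empty are made up for illustration.

```r
ids <- c("a", "b", "c")

# length(ids) is the single number 3, so this loop body runs exactly once.
visited_wrong <- c()
for (i in length(ids)) visited_wrong <- c(visited_wrong, i)

# seq_along(ids) is 1, 2, 3, so this loop body runs once per element.
visited_right <- c()
for (i in seq_along(ids)) visited_right <- c(visited_right, i)

visited_wrong  # 3
visited_right  # 1 2 3

# Why seq_along() is safer than 1:length(): an empty input.
empty <- character(0)
1:length(empty)   # 1 0, so a loop over it would run twice
seq_along(empty)  # integer(0), so a loop over it is skipped
```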

CodePudding user response:

You were close; the mistake is that you need double brackets, id_list[[id]], so that the character vector itself is extracted rather than a one-element list.

for (id in length(id_list)) {
  file_content <- lapply(file_content, function(x) {
    x$period <- paste(id_list[[id]], sep = "_")
    x
  }) 
}
# $`1`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`2`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`3`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3

You could also try Map() and save a few lines.

Map(`[<-`, file_content, 'period', value=id_list)
# $`1`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`2`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`3`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3

Data:

file_content <- replicate(3, data.frame(matrix(1:12, 3, 4)), simplify=F) |> setNames(1:3)
id_list <- list(as.character(1:length(file_content)))
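In this example data every data frame has three rows and id_list holds three values, so the whole ID vector fits down the rows of each data frame. If the goal is instead one ID per data frame, as the question describes, one possible sketch is to flatten id_list first. The names ids and res are hypothetical, and setNames() is called directly rather than through the pipe.

```r
file_content <- setNames(
  replicate(3, data.frame(matrix(1:12, 3, 4)), simplify = FALSE),
  1:3
)
id_list <- list(as.character(1:length(file_content)))

# Flatten the one-element list into a plain character vector: "1" "2" "3".
ids <- unlist(id_list)

# Map() now pairs file_content[[i]] with ids[i]; the single id is recycled over all rows.
res <- Map(`[<-`, file_content, "period", value = ids)

res[["2"]]$period
```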