I need to make a new column in each of 96 different dataframes that is the name of the dataframe (the name is informative). It's easiest to just show you what I mean.
> wolf <- data.frame(test1 = c(3,2,4,3),
test2 = c(4,5,2,4))
> bear <- data.frame(test1 = c(3,5,6,1),
test2 = c(4,6,2,4))
> wolf
test1 test2
1 3 4
2 2 5
3 4 2
4 3 4
> bear
test1 test2
1 3 4
2 5 6
3 6 2
4 1 4
I would like the output to be:
> wolf
test1 test2 animal
1 3 4 wolf
2 2 5 wolf
3 4 2 wolf
4 3 4 wolf
> bear
test1 test2 animal
1 3 4 bear
2 5 6 bear
3 6 2 bear
4 1 4 bear
Obviously, running a dplyr::mutate command for each dataframe would take ages. I'm sure there's a way to do this with for loops and/or lapply, but I don't have a good handle on how to use those functions. I also know that it's bad practice to have so many dataframes in my global environment; I'm all ears if you have suggestions for a more organized way of reading in this data to begin with (the data is coming from Excel spreadsheets).
The reason I'm doing this is I want to combine all these DFs into one DF. But if I just rbind immediately, I'll lose the important information that is in each DF's name. Thanks so much for your help.
CodePudding user response:
A possible solution, based on tibble::lst (to create a named list of the dataframes) and purrr::imap (to iterate over the list of dataframes, passing each element as .x and its name as .y):
library(tidyverse)
imap(lst(bear, wolf), ~ mutate(.x, animal = .y))
#> $bear
#> test1 test2 animal
#> 1 3 4 bear
#> 2 5 6 bear
#> 3 6 2 bear
#> 4 1 4 bear
#>
#> $wolf
#> test1 test2 animal
#> 1 3 4 wolf
#> 2 2 5 wolf
#> 3 4 2 wolf
#> 4 3 4 wolf
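Since the stated goal in the question is one combined dataframe, it may be possible to skip the separate mutate step: dplyr::bind_rows accepts a named list and, with its .id argument, writes each element's name into a new column. A minimal sketch, assuming the wolf and bear dataframes from the question (the names dfs and combined are only illustrative):

```r
library(dplyr)

# Example data copied from the question
wolf <- data.frame(test1 = c(3, 2, 4, 3), test2 = c(4, 5, 2, 4))
bear <- data.frame(test1 = c(3, 5, 6, 1), test2 = c(4, 6, 2, 4))

# tibble::lst() names each list element after the object passed in
dfs <- tibble::lst(bear, wolf)

# .id = "animal" stores each element's name in a new "animal" column
combined <- bind_rows(dfs, .id = "animal")
combined
```

Note that bind_rows places the .id column first, so the column order differs from the desired output in the question; dplyr::relocate(combined, animal, .after = test2) should move it to the end if that matters.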
When there are many dataframes to process, we can do the following after loading all of them (make sure the global environment contains no dataframes other than the ones needed):
# collect every dataframe in the global environment into a named list
l <- Filter(is.data.frame, as.list(.GlobalEnv))
imap(l, ~ mutate(.x, animal = .y))
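As for the more organized way of reading the data that the question asks about, one option might be to read the files straight into a named list so that the 96 dataframes never land in the global environment at all. The sketch below is only an illustration under stated assumptions: it writes two small CSV files to a temporary folder as a stand-in for the spreadsheets, and the names data_dir, files, dfs and combined are hypothetical.

```r
library(dplyr)

# Stand-in data: two small CSV files in a temporary folder
# (hypothetical setup; the real data would already exist as spreadsheets)
data_dir <- file.path(tempdir(), "animals")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(test1 = c(3, 2, 4, 3), test2 = c(4, 5, 2, 4)),
          file.path(data_dir, "wolf.csv"), row.names = FALSE)
write.csv(data.frame(test1 = c(3, 5, 6, 1), test2 = c(4, 6, 2, 4)),
          file.path(data_dir, "bear.csv"), row.names = FALSE)

# Collect the file paths and name each one after its file, minus the extension
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
names(files) <- tools::file_path_sans_ext(basename(files))

# lapply() keeps the names, so this is a named list of dataframes
dfs <- lapply(files, read.csv)

# Stack everything; .id turns the list names into the "animal" column
combined <- bind_rows(dfs, .id = "animal")
```

For real Excel files, swapping read.csv for readxl::read_excel and the pattern for "\\.xlsx$" should give the same result, assuming each workbook holds one sheet with identical columns and the file name is the informative label.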