I have a list of dataframes that look like this:
crops_1990.tempor <- data.frame(study_unit=c("unit1", "unit2", "unit3"),
cropp=c("crop1", "crop2", "crop3"),
area=c(1,2,3),
year=NA)
crops_1991.tempor <- data.frame(study_unit=c("unit1", "unit2", "unit3"),
cropp=c("crop1", "crop2", "crop3"),
area=c(4,5,6),
year=NA)
crops_1992.tempor <- data.frame(study_unit=c("unit1", "unit2", "unit3"),
cropp=c("crop1", "crop2", "crop3"),
area=c(7,8,9),
year=NA)
df_list <- list(crops_1990.tempor, crops_1991.tempor, crops_1992.tempor)
I would like to fill the column 'year' with the year information that is in the name of each df within the list (1990, 1991 and 1992, respectively in this example).
I thought it would be very easy but I'm struggling a lot!
I've tried stuff like:
df_list <- lapply(df_list, function(x) {x$year <- as.character(x$year); x})
df_list <- lapply(df_list, function(x) {x$year <- substring(names(df_list), 7,10); x}) # add years from object name in list
but nothing seems to work. My expected result would be the dataframes within the list looking like this:
crops_1990.tempor <- data.frame(study_unit=c("unit1", "unit2", "unit3"),
cropp=c("crop1", "crop2", "crop3"),
area=c(1,2,3),
year=c("1990", "1990", "1990"))
crops_1991.tempor <- data.frame(study_unit=c("unit1", "unit2", "unit3"),
cropp=c("crop1", "crop2", "crop3"),
area=c(4,5,6),
year=c("1991", "1991", "1991"))
crops_1992.tempor <- data.frame(study_unit=c("unit1", "unit2", "unit3"),
cropp=c("crop1", "crop2", "crop3"),
area=c(7,8,9),
year=c("1992", "1992", "1992"))
CodePudding user response:
Using tidyverse (lst names the list automatically*), you could do:
library(tidyverse)
lst(crops_1990.tempor, crops_1991.tempor, crops_1992.tempor) |>
  imap(~ .x |> mutate(year = .y |> str_extract("\\d+")))
Alternatively, you could put all of the objects in your environment whose names contain crops_ into a list using mget and ls (faster if you have many data frames!):
mget(ls(pattern = "crops_")) |>
  imap(~ .x |> mutate(year = .y |> str_extract("\\d+")))
Output:
$crops_1990.tempor
study_unit cropp area year
1 unit1 crop1 1 1990
2 unit2 crop2 2 1990
3 unit3 crop3 3 1990
$crops_1991.tempor
study_unit cropp area year
1 unit1 crop1 4 1991
2 unit2 crop2 5 1991
3 unit3 crop3 6 1991
$crops_1992.tempor
study_unit cropp area year
1 unit1 crop1 7 1992
2 unit2 crop2 8 1992
3 unit3 crop3 9 1992
NB! You should consider putting your data into a list in the first place when you load it. For the reasons why, see e.g.: How do I make a list of data frames?
(*) One of the reasons why your approach isn't working is that the list is not named: list() does not keep the object names, so names(df_list) is NULL.
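For completeness, here is a minimal base R sketch of the same idea, assuming you want to keep the substring approach from the question. Note that even with a named list, names(df_list) inside lapply() is the full vector of names, not the name of the current element, so the sketch walks over the data frames and their names in parallel with Map. The helper make_crops and the variables unnamed and df_names are made up for this illustration.

```r
# Hypothetical helper that rebuilds the example data from the question.
make_crops <- function(area) {
  data.frame(study_unit = c("unit1", "unit2", "unit3"),
             cropp = c("crop1", "crop2", "crop3"),
             area = area,
             year = NA)
}
crops_1990.tempor <- make_crops(c(1, 2, 3))
crops_1991.tempor <- make_crops(c(4, 5, 6))
crops_1992.tempor <- make_crops(c(7, 8, 9))

# list() does not carry over the object names, so there is nothing to extract.
unnamed <- list(crops_1990.tempor, crops_1991.tempor, crops_1992.tempor)
is.null(names(unnamed))

# Name the list explicitly, then pair each data frame with its own name.
df_names <- c("crops_1990.tempor", "crops_1991.tempor", "crops_1992.tempor")
df_list <- setNames(unnamed, df_names)
df_list <- Map(function(df, nm) {
  df$year <- substring(nm, 7, 10)  # characters 7 to 10 of the name hold the year
  df
}, df_list, names(df_list))
```

The year column ends up as character here, as in the expected result in the question; wrap substring() in as.numeric() if a numeric year is preferred.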
CodePudding user response:
Another potential way is:
library(stringr)
## Getting the names of all dataframes stored in R's global environment.
## eapply() returns them in arbitrary order, so sort the names to keep
## names and list elements aligned. This assumes the crops data frames
## are the only data frames in the global environment.
names_of_dataframes <- sort(names(which(unlist(eapply(.GlobalEnv, is.data.frame)))))
## Creating a named list of dataframes, in the same order as the names
df_list <- mget(names_of_dataframes, envir = .GlobalEnv)
## Inserting the values in the year column
for (i in seq_along(df_list)) {
  df_list[[i]]$year <- as.numeric(str_extract(names_of_dataframes[i], "[0-9]+"))
}
## Unlisting all dataframes from the df_list
for (i in seq_along(df_list)) {
  assign(names_of_dataframes[i], df_list[[i]], envir = .GlobalEnv)
}
Output
> crops_1990.tempor
study_unit cropp area year
1 unit1 crop1 1 1990
2 unit2 crop2 2 1990
3 unit3 crop3 3 1990
> crops_1991.tempor
study_unit cropp area year
1 unit1 crop1 4 1991
2 unit2 crop2 5 1991
3 unit3 crop3 6 1991
> crops_1992.tempor
study_unit cropp area year
1 unit1 crop1 7 1992
2 unit2 crop2 8 1992
3 unit3 crop3 9 1992