I have 15 dataframes, that I have merged together.
- Here I'm loading my files.
data_files <- list.files() # Identify file names
for(i in 1:length(data_files)) {
assign(paste0(substr(data_files[i],1,nchar(data_files[i])-4)),
read_excel(paste0("",
data_files[i])))
}
Problem, they have the same columns names. That's why i want to rename the columns with the following code :
colnames(COMMUNITY)
[1] "OBJECTID" "SOURCE_ID" "mean" "LMiIndex Fixed 450000 RS"
[5] "LMiZScore Fixed 450000 RS" "LMiPValue Fixed 450000 RS" "COType Fixed 450000 RS" "NNeighbors Fixed 450000 RS"
[9] "ZTransform Fixed 450000 RS" "SpatialLag Fixed 450000 RS"
colnames(COMMUNITY) <-paste("PREFIX",colnames(COMMUNITY),sep="-")
I would like to do this to my 15 dataframes, so I tried this :
List_df_EU = list(COMMUNITY,CSR_STRATEGY, EMISSIONS,ENV_PILLAR,ESGCOMBINED,ESGCONTROVERSIES,
ESGSCORE,GOV_PILLAR,HUMANRIGHTS,INNOVATION,MANAGEMENT,PRODUCT_RESP, RESSOURCE_USE, SOC_PILLAR, WORKFORCE)
for(i in 1:length(List_df_EU)) {
colnames(List_df_EU[i]) <-paste("AS",colnames(List_df_EU[i]),sep="_")
}
It doesn't work, and, I don't know how to retrieve dataframe name, in order to put him as PREFIX of columns.
I could do it for each dataframe seperately, but it would take a long time, and would not be very clever.
Even after many web researches, I never found something that was automated.
After that, I use the following line of code to merge, It actually works, but as expected every colnames are identical.
Merged_file <- purrr::reduce(List_df_EU, dplyr::left_join, by = 'OBJECTID', suffix = c(".x", ".y"))
CodePudding user response:
First, refer to elements of the list with double square brackets, like so List_df_EU[[i]]
(List_df_EU[i]
is a sub-list of 1 element, not the element itself).
Second, we could create List_df_EU
with tibble::lst()
instead of list()
, so that elements are automatically named. Then, "AS"
can be replaced with names(List_df_EU)[i]
.
List_df_EU <- tibble::lst(....)
for(i in 1:length(List_df_EU)) {
colnames(List_df_EU[[i]]) <- paste(
names(List_df_EU)[i], colnames(List_df_EU[[i]]), sep = "_")
}
Edit
To allow the subsequent join on OBJECTID
, we could rename all columns but OBJECTID
, for instance using dplyr
that has a nice interface for this:
for(i in 1:length(List_df_EU)) {
List_df_EU[[i]] <- dplyr::rename_with(
List_df_EU[[i]],
~ paste(names(List_df_EU)[i], .x, sep = "_"),
.cols = - OBJECTID
)
}
CodePudding user response:
The easiest may be to bring them all into the same columns but add a column that indicates what file they came from. You could also pivot_wider
and separate them again, at that point.
This function is for filling in the column that will be used to identify the source file.
library(tidyverse)
library(data.table)
add_name <- function(flnm) {
fread(flmn) %>%
mutate(filename = basename(flmn))
}
Use this to collect the files and build the data frame.
mergedDF <- list.files(urlOrObject) %>%
map_df(~add_name(.))
Let me know if you have any questions.
CodePudding user response:
Name your list, then you can get the name prefix:
List_df_EU = list(COMMUNITY = COMMUNITY,CSR_STRATEGY = CSR_STRATEGY ...)
to set the colnames there is a [] missing:
colnames(List_df_EU[[i]]) <- ...