I am combining multiple excel files with one to many worksheets. They each have different columns. I am only interested in combining the worksheets with address information. In the case of a worksheet with no address information, I need to make a note of it in the resulting combined file.. In the case that I am having problems, one of the worksheets has vegetables and no addresses, and the other has address information.. I am using the code below to put them together. After I get it to work I will clean standardize them and put them altogether.
dir_path <- "C:/temp/ConsigneeList/stuff4/" # target directory where the xlsx files are located.
re_file <- list.files(dir_path, pattern=".xls*") # regex pattern to match the file name format, in this case 'test1.xlsx', 'test2.xlsx' etc.
read_sheets <- function(dir_path, file){
xls_file <- paste0(dir_path, file)
xls_file %>%
excel_sheets() %>%
set_names() %>%
map_df(read_excel, path = xls_file, .id = 'sheet_name') %>%
mutate(file_name = file) %>%
select(file_name, sheet_name, everything())
}
number_of_excel_files<-length(file.list)
mybiggerlist<-vector('list',number_of_excel_files)
for(file in 1:length(mybiggerlist)) {
mybiggerlist[[file]]<-read_sheets(dir_path, file.list[file])
}
I am getting the error: Error: Can't combine Customer Quick REF$Order No
and CH Belt$Order No
. I tried to use %>% mutate_all(as.character) as the columns should all be character in nature.. Any ideas on how I could solve this? Alternatively, is there a way to skip importing the offending data and make a row indicating that there was a problem with that worksheet? Thank you!
CodePudding user response:
Try something like this:
dir_path <- "C:/temp/ConsigneeList/stuff4/" # target directory where the xlsx files are located.
re_file <- list.files(dir_path, pattern=".xls*") # regex pattern to match the file name format, in this case 'test1.xlsx', 'test2.xlsx' etc.
read_sheets <- function(dir_path, file){
xls_file <- paste0(dir_path, file)
sheets <- xls_file %>%
excel_sheets() %>%
set_names() %>% ## not really sure if this is doing anything?
map(read_excel, path = xls_file)
# Now we have all the sheets in a list.
# Time to figure out which ones to combine
# Use purrr::keep to only keep sheets that meet some condition
# I just put in a wild guess, edit the test so that only sheets
# you want are kept
sheets <- purrr::keep(sheets, ~ "Address" %in% names(.))
bind_rows(sheets, .id = 'sheet_name') %>%
mutate(file_name = file) %>%
select(file_name, sheet_name, everything())
}