I need to import a list of .xls files into R. This is a fairly standard operation using a file list and purrr, which I have done several times before. For some reason I cannot use the readxl package because I keep getting a libxls error, so I switched to XLConnect, which seems to work.
However, using the following code:
file.list <- list.files('./Raw/', pattern = '.xls', full.names = TRUE)
rws <- function(x) {XLConnect::readWorksheetFromFile(x, sheet = 1, startRow = 4)}
df <- purrr::map_dfr(file.list, rws, .id = "source")
I get an output where the source column contains the position of the file in the list (1, 2, 3, ...), not the name of the file. What is the problem?
CodePudding user response:
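Try it this way. When the input vector is unnamed, .id falls back to the position of each element, so name the vector with purrr::set_names() first: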
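library(purrr)  # makes the %>% pipe available
# Name each path after itself so that .id records the file name, not the position
file.list <- list.files('./Raw/', pattern = '.xls', full.names = TRUE) %>%
  purrr::set_names()
rws <- function(x) {XLConnect::readWorksheetFromFile(x, sheet = 1, startRow = 4)}
df <- purrr::map_dfr(file.list, rws, .id = "source")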
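The effect of naming the input can be checked without any Excel files. The following is only a minimal sketch: the paths and the fake_reader() helper are made up for illustration and stand in for the real file list and for rws(). It assumes purrr and dplyr are installed (map_dfr() needs dplyr to bind the rows).

```r
library(purrr)

# Hypothetical paths; nothing is read from disk in this sketch.
paths <- c("./Raw/a.xls", "./Raw/b.xls")

# Stand-in for rws(): returns a one-row data frame per path.
fake_reader <- function(path) data.frame(n_chars = nchar(path))

# Unnamed input: .id can only record the position of each element.
unnamed <- map_dfr(paths, fake_reader, .id = "source")

# Named input: set_names() names each path after itself, so .id records the path.
named <- map_dfr(set_names(paths), fake_reader, .id = "source")

unnamed$source
named$source
```

If only the file name is wanted rather than the full path, the names can be set explicitly, for example with set_names(paths, basename(paths)).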
CodePudding user response:
You can recover the name of the file from its position:
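library(dplyr)
library(purrr)
# as.integer() guards against the position being stored as text ("1", "2", ...),
# which would otherwise be treated as a name lookup and return NA
df <- map_dfr(file.list, rws, .id = "source") %>%
  mutate(source = basename(file.list)[as.integer(source)])
# If you don't want the extension of the filename
# mutate(source = tools::file_path_sans_ext(basename(file.list))[as.integer(source)])
df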
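Both answers can also be combined so that the names are attached before reading and no lookup by position is needed afterwards. This is a hedged sketch rather than a drop-in replacement: it assumes purrr 1.0.0 or later (for list_rbind()), and the paths and fake_reader() are hypothetical stand-ins for file.list and rws().

```r
library(purrr)

# Hypothetical paths standing in for the result of list.files().
paths <- c("./Raw/jan.xls", "./Raw/feb.xls")

# Stand-in for rws(); the real reader would call XLConnect here.
fake_reader <- function(path) data.frame(value = nchar(basename(path)))

# Label each path with its file name, without directory or extension.
labels <- tools::file_path_sans_ext(basename(paths))
named_paths <- set_names(paths, labels)

# Read every file, then bind the rows and store the list names in "source".
df <- list_rbind(map(named_paths, fake_reader), names_to = "source")

df$source
```

With this approach the source column is filled directly from the names of the list, so it stays correct even if the order of the files changes.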