purrr::map_dfr gives number of list element as .id argument, not value of list element

Time:09-28

I need to import a list of .xls files into R. This is a fairly standard operation with list.files() and purrr that I have done several times before. I cannot use the readxl package because I keep getting a libxls error, so I switched to XLConnect, which seems to work.

However, using the following code:

file.list <- list.files('./Raw/', pattern = '.xls', full.names = TRUE)
rws <- function(x) {XLConnect::readWorksheetFromFile(x, sheet = 1, startRow = 4)}
df <- purrr::map_dfr(file.list, rws, .id = "source")

I get output where the source column contains the position of each file in the list (1, 2, 3, ...), not the file name. What is the problem?

CodePudding user response:

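map_dfr() fills the .id column from the names of its input and falls back to positions when the input is unnamed. list.files() returns an unnamed character vector, so name it with its own values using purrr::set_names():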

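# set_names() without a second argument uses the values themselves as names
file.list <- purrr::set_names(list.files('./Raw/', pattern = '.xls', full.names = TRUE))
rws <- function(x) {XLConnect::readWorksheetFromFile(x, sheet = 1, startRow = 4)}
# the names of file.list (the file paths) now end up in the source column
df <- purrr::map_dfr(file.list, rws, .id = "source")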
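
To see the difference without any .xls files, here is a minimal sketch. The paths and the fake_reader function are made up for illustration and only stand in for the real file list and XLConnect reader.

```r
library(purrr)

# Hypothetical file paths; nothing is read from disk in this sketch.
paths <- c("./Raw/a.xls", "./Raw/b.xls")

# Stand-in for the real reader: returns a one-row data frame per path.
fake_reader <- function(path) data.frame(n_chars = nchar(path))

# Unnamed input: .id falls back to the position of each element.
unnamed <- map_dfr(paths, fake_reader, .id = "source")

# Named input: set_names() names each element after its own value,
# so .id carries the path instead of the position.
named <- map_dfr(set_names(paths), fake_reader, .id = "source")

unnamed$source
named$source
```

If only the file name is wanted in the column, naming the vector with basename(paths) instead should work the same way.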

CodePudding user response:

You can get the file name from its position in file.list:

library(dplyr)
library(purrr)

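df <- map_dfr(file.list, rws, .id = "source") %>%
        # .id is stored as character ("1", "2", ...), so convert it to an integer position
        mutate(source = basename(file.list)[as.integer(source)])
        # If you don't want the extension of the filename
        # mutate(source = tools::file_path_sans_ext(basename(file.list))[as.integer(source)])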

df
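
One thing to watch with the position lookup: the .id column comes back as character, and indexing an unnamed vector with character values looks for names rather than positions. A small base R sketch with made-up paths and ids shows the difference:

```r
# Hypothetical paths and ids; ids mimics a character .id column.
files <- c("./Raw/a.xls", "./Raw/b.xls")
ids <- c("1", "2", "2")

# Character index on an unnamed vector: looks up names, finds none.
by_name <- basename(files)[ids]

# Integer index: looks up positions, which is what is wanted here.
by_position <- basename(files)[as.integer(ids)]

by_name
by_position
```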