Home > Software engineering >  merging sheets with same name skipping files without that sheet and adding column with the file'
merging sheets with same name skipping files without that sheet and adding column with the file'

Time:06-29

i am quite new in R, so please don't be hangry if my question is quite stupid for you :)

Basically, I have different .xlsx file in a folder with the same structure in terms of sheets, but in some cases the same sheet is not present in all the files.

While when the sheet is always present, I run the code below, adding a column with the name of the file:

library(readxl)
library(tidyr)
myfiles<-list.files(pattern = "*.xlsx")
myfiles
test = lapply (myfiles, function(i){
x = read_excel(i, sheet='sheet_1')  
    x$file = i
    x
})

when the sheet is not present in all the files the code above does not work, detecting error due to missing sheet. I run the code below, but i dont manage to add the columun with the name of the file, as i manage above.

library(readxl)
library(tidyr)
all_files <- list.files(pattern = '*.xlsx')
result <- lapply(all_files, function(x) {
    all_sheets <- excel_sheets(x)  
    correct_sheet <- intersect(all_sheets, c('Sheet_2’)
    if(length(correct_sheet)) read.xlsx(x,correct_sheet)
})
prova<-do.call(rbind, result)

Someone could help me?? thanks a lot in advance!

CodePudding user response:

You can try try().

library(readxl)
library(tidyr)


myfiles<-list.files(pattern = "*.xlsx")

test = lapply (myfiles, function(i) try({
x = read_excel(i, sheet='sheet_1')  
x$file = i
x
},silent=TRUE))

Here, the try() function is used because it stops your code from quitting and throwing an error when it hits one. It'll look through the files and "try" to do as you're hoping. I use silent=TRUE when I don't want the error log that you get if you accept the default setting.

From there, you'll see that your list has some funky items with error-related text, but also that the data from the sheets you specified is coming through. If you want to get to only the data from the sheets you've successfully pulled data in from, you can do something like:

test<-Filter(function(x) is.tbl(x),test)

What you're doing here is taking advantage of the Filter() function, and filtering test down so that only the tibbles pulled from the sheets in your excel files are preserved.

Optionally, if you want to get all your data into a single tibble, I suggest using bind_rows()

test<-bind_rows(test)
  •  Tags:  
  • r
  • Related