Home > OS >  Collecting Sheet Names from a Folder of Excel Workbooks
Collecting Sheet Names from a Folder of Excel Workbooks

Time:10-31

I have a folder of numerous .xslx files on my desktop and I am trying to iterate through them one by one to collect the respective sheet names automatically from each workbook.

import openpyxl
import glob


# specifying the path to csv files

path = "C:/Users/X/Desktop/Test"
  
# csv files in the path

files = glob.glob(path   "/*.xlsx")

sheet_names = []

for x in files:
    openpyxl.load_workbook(files)
    sheet_names.append(files.sheetnames)

I am getting an error code:

TypeError: expected str, bytes or os.PathLike object, not list

Is there any way to do this iteratively versus one by one if I have all of the workbook names in a list?

Thank you.

I am looking for the sheet names in each respective Excel workbook file.

CodePudding user response:

Try:

sheet_names = []

for x in files:
    sheet_names.append(openpyxl.load_workbook(x).sheetnames)

Or, using a list comprehension:

sheet_names = [openpyxl.load_workbook(x).sheetnames for x in files]

Explanation error

The error occurs because you are referencing files (i.e. a list), rather than the items inside the list (i.e. x) in this line:

    openpyxl.load_workbook(files) # we need `x` here instead of `files`

Next, in order to actually use the result from the function call, you should have assigned it to a variable, e.g. wb = openpyxl.load_workbook(x) and then pass it to the list.append function. I.e.:

    sheet_names.append(files.sheetnames) # we would need `wb.sheetnames` here

However, as shown above, you can just put all of this is a one-liner (so, skipping the intermediate assignment to wb).

CodePudding user response:

See: Pandas: Looking up the list of sheets in an excel file:

import pandas as pd
import os

excelfolder = r'/home/bera/Downloads/'

for root, folder, files in os.walk(excelfolder):
    for file in files:
        if file.endswith('.xlsx'):
            print(file)
            fullname = os.path.join(root, file)
            print(pd.ExcelFile(fullname).sheet_names)
            
            #Excelfile123.xlsx
            #['Sheet1', 'Sheet2', 'Sheet3']
            
            #excelfile2.xlsx
            #['A', 'B']
            
  • Related