Home > Net >  Varying skiprows argument for individual worksheets when using pd.read_excel()
Varying skiprows argument for individual worksheets when using pd.read_excel()

Time:03-09

I have an excel-workbook with two worksheets which I want to import using pandas. I would like to use the skiprows argument, with 5 rows for the first sheet and 10 rows for the second sheet. pd.read_excel() returns a dictionary in case the excel file has several worksheets. Can I pass the skiprows argument as a list or similar to make it apply varying values for the individual sheets? I've tried this:

ret = pd.read_excel(file, skiprows=[5, 10], sheet_name=None)

It looks like the skiprows-argument is simply ignored in this case. Can I make it work somehow?

If yes, this can hopefully be generalized to other arguments too.

CodePudding user response:

It's complicated to generalize to all arguments. You have to iterate on each sheets.

A possible solution:

sheets = {
    'Sheet1': {'skiprows': 5},   # add other parameters
    'Sheet2': {'skiprows': 10},  # according the documentation
}

def read_excel_sheets(filename, sheets):
    dfs = {}
    with pd.ExcelFile(filename) as xlsx:
        for sheet_name, params in sheets.items():
            dfs[sheet_name] = xlsx.parse(sheet_name=sheet_name, **params)
    return dfs

# The output is the same as pd.read_excel(filename, sheets=None)
dfs = read_excel_sheets('data.xlsx', sheets)
  • Related