I have an excel-workbook with two worksheets which I want to import using pandas
. I would like to use the skiprows
argument, with 5 rows for the first sheet and 10 rows for the second sheet. pd.read_excel()
returns a dictionary in case the excel file has several worksheets. Can I pass the skiprows
argument as a list or similar to make it apply varying values for the individual sheets? I've tried this:
ret = pd.read_excel(file, skiprows=[5, 10], sheet_name=None)
It looks like the skiprows
-argument is simply ignored in this case.
Can I make it work somehow?
If yes, this can hopefully be generalized to other arguments too.
CodePudding user response:
It's complicated to generalize to all arguments. You have to iterate on each sheets.
A possible solution:
sheets = {
'Sheet1': {'skiprows': 5}, # add other parameters
'Sheet2': {'skiprows': 10}, # according the documentation
}
def read_excel_sheets(filename, sheets):
dfs = {}
with pd.ExcelFile(filename) as xlsx:
for sheet_name, params in sheets.items():
dfs[sheet_name] = xlsx.parse(sheet_name=sheet_name, **params)
return dfs
# The output is the same as pd.read_excel(filename, sheets=None)
dfs = read_excel_sheets('data.xlsx', sheets)