Home > Back-end >  Pandas - Loop through sheets
Pandas - Loop through sheets

Time:11-01

I have 5 sheets and created a script to do numerous formatting, I tested it per sheet, and it works perfectly.

import numpy as np
import pandas as pd
           
FileLoc = r'C:\T.xlsx'
Sheets = ['Alex','Elvin','Gerwin','Jeff','Joshua',]

df = pd.read_excel(FileLoc, sheet_name= 'Alex', skiprows=6) 
df = df[df['ENDING'] != 0]
df = df.head(30).T
df = df[~df.index.isin(['Unnamed: 2','Unnamed: 3','Unnamed: 4','ENDING' ,3])]
df.index.rename('STORE', inplace=True)
df['index'] = df.index

df2 = df.melt(id_vars=['index', 2 ,0, 1] ,value_name='SKU' )
df2 = df2[df2['variable']!= 3]

df2['SKU2'] = np.where(df2['SKU'].astype(str).fillna('0').str.contains('ALF|NOB|MET'),df2.SKU, None)
df2['SKU2'] = df2['SKU2'].ffill()
df2 = df2[~df2[0].isnull()]
df2 = df2[df2['SKU'] != 0]

df2[1] = pd.to_datetime(df2[1]).dt.date
df2.to_excel(r'C:\test.xlsx', index=False)

but when I assigned a list in Sheet_name = Sheets it always produced an error KeyError: 'ENDING'. This part of the code:

Sheets = ['Alex','Elvin','Gerwin','Jeff','Joshua',]
df = pd.read_excel(FileLoc,sheet_name='Sheets',skiprows=6) 

Is there a proper way to do this, like looping?

My expected result is to execute the formatting that I have created and consolidate it into one excel file.

NOTE: All sheets have the same format.

CodePudding user response:

In using the read_excel method, if you give the parameter sheet_name=None, this will give you a OrderedDict with the sheet names as keys and the relevant DataFrame as the value. So, you could apply this and loop through the dictionary using .items().

The code would look something like this,

dfs = pd.read_excel('your-excel.xlsx', sheet_name=None)
for key, value in dfs.items():
  # apply logic to value

If you wish to combine the data in the sheets, you could use .append(). We can append the data after the logic has been applied to the data in each sheet. That would look something like this,

combined_df = pd.DataFrame()
dfs = pd.read_excel('your-excel.xlsx', sheet_name=None)
for key, value in dfs.items():
  # apply logic to value, which is a DataFrame
  combined_df = combined_df.append(sheet_df)
  • Related