Home > Back-end >  How can I concatenate multiple excel sheets in a single file to one single file with an extra column
How can I concatenate multiple excel sheets in a single file to one single file with an extra column

Time:09-16

I have an excel file with over one hundred sheets. Each sheet has the same columns with information. I have concatenated all the sheets in one single file.

Script

single_file = pd.concat(pd.read_excel('multiple_sheets.xlsx',sheet_name=None),ignore_index=True)
single_file.to_csv('single_file.csv')

The script works fine. The thing is I would like to add a column that identifies each row of information with the name of its original sheet. Is that possible?

Thanks in advance, Fernando

CodePudding user response:

You'll want to save the pd.concat for the very end of the script.

You've read in the file with sheetname = None which creates a dictionary of sheet names as the key, and the dataframe as the value. Loop over each sheet and add your label column first, then concatenate.

CodePudding user response:

single_file = pd.read_excel('multiple_sheets.xlsx',sheet_name=None)
single_file = pd.concat([sheet.assign(identifier=i) for i,sheet in single_file.items()])

Here we are using the idea that all sheets can be accessed by iterating (using items()) over the dictionary of dataframes.

  • Related