I have 10 Excel files (they have same number of columns, and varying number of rows)
I need to append data from those 10 files into one single Excel file using Openpyxl Python library
Read data from File1, append it to new_file Read data from File2, append it to new_file ...
Is this possible? Can anyone help me?
Thank you
CodePudding user response:
There are some missing details in the question, as raised by @moken. Let's make some assumptions that all files have a single sheet named 'Sheet 1' and identical column headers. And the final output will start with file10's content, then file9 etc and we will skip copying the column headers.
For the sake of simplicity, we will use 3 files' content for illustration:
file1.xlsx: | col_1 | col_2 | col_3 | | ----- | ----- |------ | | F1-1 | F1-2 | F1-3 | file2.xlsx: | col_1 | col_2 | col_3 | | ----- | ----- |------ | | F2-1 | F2-2 | F2-3 | | F2-2 | F2-3 | F2-4 | file3.xlsx: | col_1 | col_2 | col_3 | | ----- | ----- |------ | | F3-1 | F3-2 | F3-3 | | F3-2 | F3-3 | F3-4 | | F3-3 | F3-4 | F3-5 |
The code is rather straightforward, where we get all rows from the current file and append row by row to the main workbook:
from openpyxl import load_workbook
main_workbook = load_workbook(filename="file3.xlsx")
file_list = ["file2.xlsx","file1.xlsx"]
for file in file_list:
workbook = load_workbook(filename=file)
new_rows = list(workbook['Sheet1'].values)
for idx,row in enumerate(new_rows):
# skip column header
if idx == 0: continue
main_workbook['Sheet1'].append(row)
workbook.close()
main_workbook.save("merged.xlsx")
The final output would have rows with the following values:
>>> list(main_workbook['Sheet1'].values)
[('col_1', 'col_2', 'col_3'),
('F3-1', 'F3-2', 'F3-3'),
('F3-2', 'F3-3', 'F3-4'),
('F3-3', 'F3-4', 'F3-5'),
('F2-1', 'F2-2', 'F2-3'),
('F2-2', 'F2-3', 'F2-4'),
('F1-1', 'F1-2', 'F1-3')]