I have a python script that utilizes pandas to perform some aggregation on a huge dataframe and after doing so It tries to export it as an Excel "xlsx" File format.
this is the last step in the process.
print("Exporting to Excel...")
sum_df = sum_df.set_index('products_code')
with pd.ExcelWriter(OUTPUT_FILE, engine='openpyxl') as writer:
sum_df.to_excel(writer, sheet_name="stocks")
print("Done!")
The file exports normally but whenever I try to upload it to the server, the server rejects it and reads it as a zip file instead of an xlsx file, I found a quick fix for this which is to open the file in Microsoft Excel and just hit save and exit, this seems to fix the issue. But I don't know the reason for this behavior and was looking for help to automatically save it as a valid excel file from the script directly.
Any Ideas?
CodePudding user response:
As discussed in the comments, it seems using xlsxwriter
as the engine has solved this issue. E.g.:
print("Exporting to Excel...")
sum_df = sum_df.set_index('products_code')
with pd.ExcelWriter(OUTPUT_FILE, engine='xlsxwriter') as writer:
sum_df.to_excel(writer, sheet_name="stocks")
print("Done!")
It would be good to know what software was used on the server, if possible, in case other people encounter this issue.