I've been trying, for a while now to solve this formatting problem for work, but to no avail.
I'll try to be brief and concise: I am writing a Python script as part of an automation job. The script is expected to work with two .csv file as input, downloaded from the client's Database. The .csv files are obtained elswhere through an automation software. After that, I have basically read the main .csv (csv_dataframe) as a DataFrame with headers, and the other one (csv_depara_espec) with no headers; then I edited the main csv so I could alter the columns positions; then filled a column with translated information from the secondary csv. Up to this point, all was done using Pandas. Afterwards, I've created an excel_dataframe using xlsxwriter as engine, then I've created some formatting objects, and then I've applied them to specific columns within the excel_dataframe.
Here is the code:
import os
from typing import Self
import pandas as pd
import pandas.io.excel
import numpy as np
import datetime
from datetime import datetime
from datetime import date
import time
import xlsxwriter
template_excel_file = r"C:\CriarTabelaOpme\Modelo Material Alto Custo - Intranet.xlsx"
depara_nome_espec_file = r"C:\CriarTabelaOpme\Especialidade_Dicionario.csv"
report_csv_file = r"C:\CriarTabelaOpme\ReportServiceIntranet.csv"
csv_dataframe = pd.read_csv(report_csv_file, sep = ',', encoding = "ISO-8859-1", engine = 'python', index_col = None, names = ['TIPO', 'CODIGO', 'PTU',
'DESCRICAO', 'FORNECEDOR', 'VALOR', 'COD_PRINCP_ATIVO', 'PRINCIPIO_ATIVO', 'ANVISA', 'VALIDADE_RMS', 'FABRICANTE', 'DT_ATUALIZACAO', 'PTU_LIMITE', 'COD_ESP', 'NOME_ESPEC', 'REFERENCIA', 'OBSERVACAO'])
csv_dataframe.insert(16, "", "")
csv_depara_espec = pd.read_csv(depara_nome_espec_file, sep = ',', header = None, encoding = "ISO-8859-1", engine = 'python')
#print(csv_depara_espec)
csv_dataframe = csv_dataframe.iloc[:, [0,1,2,3,4,5,6,7,8,9,10,11,12,13,16,14,15]]
#print(csv_dataframe)
for row in range(len(csv_dataframe)):
cod_esp_row = (csv_dataframe.iloc[row, 13])
csv_dataframe.iloc[row,14] = csv_depara_espec.iloc[cod_esp_row, 1]
writer = pd.ExcelWriter(template_excel_file, engine = 'xlsxwriter', date_format = 'DD/MM/YYYY')
excel_dataframe = csv_dataframe.to_excel(writer, sheet_name = 'Material Alto Custo', index = False, header = ['TIPO', 'CODIGO', 'PTU',
'DESCRICAO', 'FORNECEDOR', 'VALOR', 'COD_PRINCP_ATIVO', 'PRINCIPIO_ATIVO', 'ANVISA', 'VALIDADE_RMS', 'FABRICANTE', 'DT_ATUALIZACAO',
'PTU_LIMITE', 'COD_ESP', 'NOME_ESPEC', 'REFERENCIA', 'OBSERVACAO'])
(max_row, max_col) = csv_dataframe.shape
workbook = writer.book
worksheet = writer.sheets['Material Alto Custo']
header_format = workbook.add_format({'bold' : True,
'font' : 'Arial',
'size' : 10})
font_and_size = workbook.add_format({'font' : 'Arial', 'size' : 10})
column_valor_format = workbook.add_format({'num_format': '[$R$-pt-BR] #.##0,00'})
column_date_format = workbook.add_format({'num_format' : 'dd/mm/yyyy'})
column_left_zeroes_format = workbook.add_format({'num_format' : '00000000'})
worksheet.set_row(0, None, header_format)
worksheet.set_column(0,max_col, 20.0, font_and_size)
worksheet.set_column(1, 1, 20.0, column_left_zeroes_format)
worksheet.set_column(2, 2, 20.0, column_left_zeroes_format)
worksheet.set_column(5, 5, 20.0, column_valor_format)
worksheet.set_column(9, 9, 20.0, column_date_format)
worksheet.set_column(11, 11, 20.0, column_date_format)
worksheet.set_column(12, 12, 20.0, column_date_format)
for i, j in enumerate(list(csv_dataframe.columns)):
worksheet.write(0, i, j, header_format)
#Fechar planilha
writer.close()
All in all, formatting the header, every colmun with font and size, and columns B and C (indexes 1 and 2) with a zero mask to the left of the number works perfectly, but the formatting of values (currency, Brazilian Real) and dates do not.
This is the resulting formatting after code excecution:
[![enter image description here][1]][1]
And this is the expected result:
[![enter image description here][2]][2]
I've obtained the expected results by manually double clicking each cell in every row and then pressing enter. First four rows where manually altered to demonstrate the desired effects. Altered columns are F, J, M and M.
I reckon this must be due to locale formatting of Excel files (Brazilian Portuguese opposed to United States English, which is the format region on the VM I use at work). Is there anything to be done regarding this problem?
Thank you so much in advance!
CodePudding user response:
There are a few issues here.
For the question in the title on the locale, you will need to use the US style thousands and decimal point values "," and "." instead of the locale variant. In this case '[$R$-pt-BR] #,##0.00'
instead of '[$R$-pt-BR] #.##0,00'
. This is explained in more detail in the