I'm trying to write a pandas DataFrame to Excel, with dates formatted as "YYYY-MM-DD", omitting the time. Since I need to write multiple sheets and want to use some advanced formatting options (namely setting the column width), I'm using an ExcelWriter object with openpyxl as the engine.
Now, I just can't seem to figure out how to format my date column.
Starting with
import pandas as pd
df = pd.DataFrame({'string_col': ['abc', 'def', 'ghi']})
df['date_col'] = pd.date_range(start='2020-01-01', periods=3)
with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='test', index=False)
This will write the dates as 2020-01-01 00:00:00. For some reason I can't understand, adding datetime_format='YYYY-MM-DD' has no effect if openpyxl is the selected engine (it works just fine if engine is left unspecified).
So I'm trying to work around this:
with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='test', index=False)
    writer.sheets['test'].column_dimensions['B'].width = 50
    writer.sheets['test'].column_dimensions['B'].number_format = 'YYYY-MM-DD'
The column width is properly applied, but not the number formatting. On the other hand, applying the style to an individual cell does work: writer.sheets['test']['B2'].number_format = 'YYYY-MM-DD'.
But how can I apply the formatting to the entire column (I have tens of thousands of cells to format)? I couldn't find anything in the openpyxl documentation on how to address an entire column...
Note: I could do:
for cell in writer.sheets['test']['B']: cell.number_format = 'YYYY-MM-DD'
but my point is precisely to avoid iterating over each individual cell.
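For reference, the per-cell loop from the note above can at least be wrapped in a small helper that skips the header row. This is only a sketch of that fallback, not a way around the iteration: the helper name format_column is made up for illustration, and the workbook is written to an in-memory BytesIO buffer instead of test.xlsx so that nothing is left on disk.

```python
import io

import pandas as pd

df = pd.DataFrame({'string_col': ['abc', 'def', 'ghi']})
df['date_col'] = pd.date_range(start='2020-01-01', periods=3)


def format_column(ws, col_idx, number_format, first_row=2):
    """Hypothetical helper: set number_format on every data cell of one column."""
    # iter_rows yields one tuple per row; with min_col == max_col it holds one cell.
    for (cell,) in ws.iter_rows(min_row=first_row, min_col=col_idx, max_col=col_idx):
        cell.number_format = number_format


buffer = io.BytesIO()  # assumption: in-memory target instead of 'test.xlsx'
with pd.ExcelWriter(buffer, engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='test', index=False)
    ws = writer.sheets['test']
    ws.column_dimensions['B'].width = 50
    format_column(ws, col_idx=2, number_format='YYYY-MM-DD')
    # Collect the formats of the data cells (header excluded) for inspection.
    applied = [cell.number_format for cell in ws['B'][1:]]
```

The loop still touches every cell, so it does not answer the question as asked; it only keeps the iteration in one place.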
CodePudding user response:
You can treat your dates as a column of strings and slice it to get 'YYYY-MM-DD':
import pandas as pd
df = pd.DataFrame({'string_col': ['abc', 'def', 'ghi']})
df['date_col'] = pd.date_range(start='2020-01-01', periods=3)
df['date_col'] = df['date_col'].astype("str").str.slice(start=0, stop=10)
with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='test', index=False)
    writer.sheets['test'].column_dimensions['B'].width = 50
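A possible variant of the slicing above, offered as a sketch rather than a definitive fix: dt.strftime states the target format explicitly instead of relying on the first ten characters of the string representation. Either way, the column is written to Excel as text cells rather than as true dates, which may or may not matter for your use case.

```python
import pandas as pd

df = pd.DataFrame({'string_col': ['abc', 'def', 'ghi']})
df['date_col'] = pd.date_range(start='2020-01-01', periods=3)

# Convert the datetime column to 'YYYY-MM-DD' strings with an explicit format.
df['date_col'] = df['date_col'].dt.strftime('%Y-%m-%d')

print(df['date_col'].tolist())
```

From here the DataFrame can be passed to to_excel exactly as in the snippet above.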
CodePudding user response:
I know you are using the openpyxl engine. But if you have the flexibility to switch to xlsxwriter, I got it working using the following code with help from