When I read a date, say '01/12/2020', which is in the format dd/mm/yyyy, with pd.to_datetime(), it detects the month as 01.
pd.to_datetime('01/12/2020').month
>> 1
But this behavior is not consistent. When I create a DataFrame with a column containing dates in this format and convert it using the same to_datetime function, it detects 12 as the month.
tt.dt.month[0]
>> 12
What could be the reason?
CodePudding user response:
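pandas automagically tries to detect the date format, which can be very nice, or annoying as in your case. By default it prefers month-first (dayfirst=False), but a value such as '30/05/2020' cannot be month-first, so it gets read as day-first; your column presumably contains values like that.
Be explicit and use the dayfirst parameter: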
pd.to_datetime('01/12/2020', dayfirst=False).month
# 1
pd.to_datetime('01/12/2020', dayfirst=True).month
# 12
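The same parameter works on a DataFrame column, which is the situation in the question. A minimal sketch, assuming a hypothetical column named date that holds dd/mm/yyyy strings:

```python
import pandas as pd

# Hypothetical frame; the column name "date" is only for illustration.
df = pd.DataFrame({"date": ["30/05/2020", "01/12/2020"]})

# dayfirst=True tells pandas to prefer day-first for ambiguous strings.
df["date"] = pd.to_datetime(df["date"], dayfirst=True)

months = df["date"].dt.month.tolist()
print(months)
```

Both rows should now be read day-first, giving months 5 and 12.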
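Example of ambiguous use, with neither dayfirst nor a format specified (the warning text and the result may differ between pandas versions):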
tt = pd.to_datetime(pd.Series(['30/05/2020', '01/12/2020']))
tt.dt.month
UserWarning: Parsing dates in DD/MM/YYYY format when dayfirst=False (the default) was specified. This may lead to inconsistently parsed dates! Specify a format to ensure consistent parsing.
  tt = pd.to_datetime(pd.Series(['30/05/2020', '01/12/2020']))
0    5
1    1
dtype: int64
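As the warning suggests, passing an explicit format removes the guessing entirely. A sketch, assuming every value really is dd/mm/yyyy. Unlike dayfirst, which is only a preference, a format string is strict, so a value that does not match raises an error instead of being silently reinterpreted:

```python
import pandas as pd

s = pd.Series(["30/05/2020", "01/12/2020"])

# Every value must match day/month/four-digit year exactly.
tt = pd.to_datetime(s, format="%d/%m/%Y")
months = tt.dt.month.tolist()
print(months)

# A month-first value such as "12/30/2020" cannot match (there is no month 30),
# so pandas raises ValueError rather than guessing.
try:
    pd.to_datetime(pd.Series(["12/30/2020"]), format="%d/%m/%Y")
    strict = False
except ValueError:
    strict = True
print(strict)
```

If some rows may be malformed and you would rather get NaT than an exception, errors="coerce" can be passed alongside format.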