Home > Enterprise >  Pandas - Datetime format change to '%m/%d/%Y'
Pandas - Datetime format change to '%m/%d/%Y'

Time:06-13

Hello there stackoverflow community,

I would like to change the datetime format of a column, but I doesn't work and I don't know what I'am doing wrong.

After executing the following code:

df6['beginn'] = pd.to_datetime(df6['beginn'], unit='s', errors='ignore')

I got this output, and thats fine, but i would like to take out the hour to have only %m/%d/%Y left.

ID       DATE
91060    2017-11-10 00:00:00
91061    2022-05-01 00:00:00
91062    2022-04-01 00:00:00
Name: beginn, Length: 91063, dtype: object 

I've tried this one and many others

df6['beginn'] = df6['beginn'].dt.strftime('%m/%d/%Y')

and get the following output:

AttributeError: Can only use .dt accessor with datetimelike values.

But I don't understand why, I've transformed the data with pd.to_datetime or not?

Appreciate any hint you can give me! Thanks a lot!

CodePudding user response:

The reason you have to use errors="ignore" is because not all the dates you are parsing are in the correct format. If you use errors="coerce" like @phi has mentioned then any dates that cannot be converted will be set to NaT. The columns datatype will still be converted to datatime64 and you can then format as you like and deal with the NaT as you want.

Example

A dataframe with one item in Date not written as Year/Month/Day (25th Month is wrong):

>>> df = pd.DataFrame({'ID': [91060, 91061, 91062, 91063], 'Date': ['2017/11/10', '2022/05/01', '2022/04/01', '2055/25/25']})
>>> df
      ID        Date
0  91060  2017/11/10
1  91061  2022/05/01
2  91062  2022/04/01
3  91063  2055/25/25

>>> df.dtypes
ID       int64
Date    object
dtype: object

Using errors="ignore":

>>> df['Date'] = pd.to_datetime(df['Date'], errors='ignore')
>>> df
      ID        Date
0  91060  2017/11/10
1  91061  2022/05/01
2  91062  2022/04/01
3  91063  2055/25/25

>>> df.dtypes
ID       int64
Date    object
dtype: object

Column Date is still an object because not all the values have been converted. Running df['Date'] = df['Date'].dt.strftime("%m/%d/%Y") will result in the AttributeError

Using errors="coerce":

>>> df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
>>> df
      ID       Date
0  91060 2017-11-10
1  91061 2022-05-01
2  91062 2022-04-01
3  91063        NaT

>>> df.dtypes
ID               int64
Date    datetime64[ns]
dtype: object

Invalid dates are set to NaT and the column is now of type datatime64 and you can now format it:

>>> df['Date'] = df['Date'].dt.strftime("%m/%d/%Y")
>>> df
      ID        Date
0  91060  11/10/2017
1  91061  05/01/2022
2  91062  04/01/2022
3  91063         NaN

Note: When formatting datatime64, it is converted back to type object so NaT's are changed to NaN. The issue you are having is a case of some dirty data not in the correct format.

  • Related