pd.to_datetime doesn't work on values similar to '10/30/2022 7:00:00 AM +01:00'

Time:11-02

I have a pandas dataframe with the following column:

import pandas as pd
df = pd.DataFrame(['10/30/2022 7:00:00 AM +01:00', '10/31/2022 12:00:00 AM +01:00',
       '10/30/2022 3:00:00 PM +01:00', '10/30/2022 9:00:00 PM +01:00',
       '10/30/2022 5:00:00 PM +01:00', '10/30/2022 10:00:00 PM +01:00',
       '10/30/2022 3:00:00 AM +01:00', '10/30/2022 2:00:00 AM +02:00',
       '10/30/2022 10:00:00 AM +01:00', '10/30/2022 4:00:00 PM +01:00',
       '10/30/2022 1:00:00 AM +02:00'], columns=['Date'])

I want to convert the date values so that they look like the following:

2022-10-30T00:00:00+02:00

I tried the following:

df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%Y-%m-%dT%H:%M:%S')

The code raises an error as soon as .dt is accessed:

AttributeError: Can only use .dt accessor with datetimelike values

Apparently pd.to_datetime() does not convert the column to datetime values.

Does anyone know how to fix this?

CodePudding user response:

# Parse each value on its own, then format it; %z appends that value's UTC offset
df['Date'] = df['Date'].apply(pd.to_datetime).apply(lambda x: x.strftime('%Y-%m-%dT%H:%M:%S%z'))
0     2022-10-30T07:00:00+0100
1     2022-10-31T00:00:00+0100
2     2022-10-30T15:00:00+0100
3     2022-10-30T21:00:00+0100
4     2022-10-30T17:00:00+0100
5     2022-10-30T22:00:00+0100
6     2022-10-30T03:00:00+0100
7     2022-10-30T02:00:00+0200
8     2022-10-30T10:00:00+0100
9     2022-10-30T16:00:00+0100
10    2022-10-30T01:00:00+0200
Name: Date, dtype: object
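
Note that %z writes the offset without a colon (+0100), while the format asked for in the question has one (+01:00). A minimal sketch of one way to get the colon form, assuming the input strings carry a signed offset such as +01:00 and always follow the month/day/year, 12-hour AM/PM layout shown in the question. The names INPUT_FORMAT and to_iso_with_offset are made up for this illustration, and it uses only the standard library's datetime module rather than pandas:

```python
from datetime import datetime

# Assumed input layout: month/day/year, 12-hour clock, AM/PM, signed UTC offset.
INPUT_FORMAT = "%m/%d/%Y %I:%M:%S %p %z"


def to_iso_with_offset(value: str) -> str:
    """Parse one timestamp string and return ISO 8601 text that keeps its own offset."""
    parsed = datetime.strptime(value, INPUT_FORMAT)
    # isoformat() renders the offset with a colon, e.g. +01:00
    return parsed.isoformat()


samples = [
    "10/30/2022 7:00:00 AM +01:00",
    "10/30/2022 2:00:00 AM +02:00",
    "10/31/2022 12:00:00 AM +01:00",
]
converted = [to_iso_with_offset(s) for s in samples]
print(converted)
```

On a dataframe column the same helper could probably be applied row by row with df['Date'].map(to_iso_with_offset); the result is a column of strings, not datetimes.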

CodePudding user response:

Add utc=True when converting to datetime. Try this:

import pandas as pd
df = pd.DataFrame(['10/30/2022 7:00:00 AM +01:00', '10/31/2022 12:00:00 AM +01:00',
       '10/30/2022 3:00:00 PM +01:00', '10/30/2022 9:00:00 PM +01:00',
       '10/30/2022 5:00:00 PM +01:00', '10/30/2022 10:00:00 PM +01:00',
       '10/30/2022 3:00:00 AM +01:00', '10/30/2022 2:00:00 AM +02:00',
       '10/30/2022 10:00:00 AM +01:00', '10/30/2022 4:00:00 PM +01:00',
       '10/30/2022 1:00:00 AM +02:00'], columns=['Date'])

df['Date'] = pd.to_datetime(df['Date'], utc=True)
print(df['Date'].dt.strftime('%Y-%m-%dT%H:%M:%S'))

Output:

0     2022-10-30T06:00:00
1     2022-10-30T23:00:00
2     2022-10-30T14:00:00
3     2022-10-30T20:00:00
4     2022-10-30T16:00:00
5     2022-10-30T21:00:00
6     2022-10-30T02:00:00
7     2022-10-30T00:00:00
8     2022-10-30T09:00:00
9     2022-10-30T15:00:00
10    2022-10-29T23:00:00
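
Keep in mind that the hours in this output are UTC, not the original local times: 7:00 AM at +01:00 becomes 06:00, and 1:00 AM at +02:00 moves to 23:00 on the previous day. To illustrate that shift, here is a rough standard-library sketch of the same normalization for a single value. It assumes the inputs carry a signed offset such as +01:00; INPUT_FORMAT and to_utc_text are hypothetical names used only for this example, and it is not how pandas implements utc=True internally:

```python
from datetime import datetime, timezone

# Assumed input layout: month/day/year, 12-hour clock, AM/PM, signed UTC offset.
INPUT_FORMAT = "%m/%d/%Y %I:%M:%S %p %z"


def to_utc_text(value: str) -> str:
    """Parse an offset-aware timestamp and format the equivalent UTC wall-clock time."""
    local = datetime.strptime(value, INPUT_FORMAT)
    # astimezone() shifts the wall-clock time by the parsed offset
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


first_row = to_utc_text("10/30/2022 7:00:00 AM +01:00")
last_row = to_utc_text("10/30/2022 1:00:00 AM +02:00")
print(first_row)  # 2022-10-30T06:00:00
print(last_row)   # 2022-10-29T23:00:00
```

If the original offsets need to survive in the final strings, converting to UTC is not enough on its own, because the per-row offset is discarded once everything is normalized.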