I'm using the FedEx dataset from Kaggle. There is a column named Actual_Shipment_Time which contains numbers from one to four digits, and I'm trying to convert them to times.
For example:
5 needs to be 00:05
23 needs to be 00:23
345 needs to be 03:45
2145 needs to be 21:45
A 12-hour format is also acceptable.
When I run this code
df['Actual_Shipment_Time'] = pd.to_datetime(df['Actual_Shipment_Time'], format = '%H%M').dt.strftime('%H%M')
it gives me this error:
ValueError: time data '9' does not match format '%H%M' (match)
CodePudding user response:
You can convert the column to strings and zero-pad it to four digits with str.zfill before parsing:
df['Actual_Shipment_Time'] = (pd.to_datetime(df['Actual_Shipment_Time']
                                             .astype(str)
                                             .str.zfill(4), format='%H%M')
                              .dt.strftime('%H:%M'))
print(df)
Actual_Shipment_Time
0 00:05
1 00:23
2 03:45
3 21:45
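Since the question says a 12-hour format would also be acceptable, here is a minimal self-contained sketch of the same zero-padding idea with a 12-hour output. The sample values are made up to mirror the question, the names padded, parsed, and twelve_hour are only illustrative, and the AM/PM text produced by %p depends on the locale.

```python
import pandas as pd

# Hypothetical sample data mirroring the examples in the question.
df = pd.DataFrame({"Actual_Shipment_Time": [5, 23, 345, 2145]})

# Zero-pad to four characters so every value has the shape HHMM.
padded = df["Actual_Shipment_Time"].astype(str).str.zfill(4)
parsed = pd.to_datetime(padded, format="%H%M")

# 24-hour output, as in the answer above.
df["Actual_Shipment_Time"] = parsed.dt.strftime("%H:%M")

# 12-hour output; %I is the 12-hour clock hour, %p the AM/PM marker.
twelve_hour = parsed.dt.strftime("%I:%M %p")

print(df["Actual_Shipment_Time"].tolist())
print(twelve_hour.tolist())
```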
If the column contains 2400, parsing fails: %H only accepts hours from 00 to 23, so there is no such time as 24:00 and pandas raises the following error
>>> pd.to_datetime('2400', format='%H%M')
ValueError: unconverted data remains: 0
To work around it, pass errors='coerce', which turns values that cannot be parsed into NaT instead of raising:
df['Actual_Shipment_Time'] = (pd.to_datetime(df['Actual_Shipment_Time']
                                             .fillna(0).astype(int).astype(str)  # replace NaN with 0, then convert to int and to string
                                             .str.zfill(4), format='%H%M', errors='coerce')
                              .dt.strftime('%H:%M')
                              .fillna('00:00'))  # values that failed to parse (NaT, e.g. from 2400) become 00:00
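To see the errors='coerce' version end to end, here is a small self-contained sketch. The sample column is made up (it includes 2400 and a missing value on purpose), and padded, parsed, and result are illustrative names only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a normal value, a missing value, and 2400.
df = pd.DataFrame({"Actual_Shipment_Time": [5.0, 345.0, 2400.0, np.nan]})

# Replace NaN with 0, drop the float part, and zero-pad to HHMM.
padded = (df["Actual_Shipment_Time"]
          .fillna(0)
          .astype(int)
          .astype(str)
          .str.zfill(4))

# '2400' cannot be parsed with %H%M, so errors='coerce' turns it into NaT.
parsed = pd.to_datetime(padded, format="%H%M", errors="coerce")

# Formatting NaT yields a missing value, which is then replaced by '00:00'.
result = parsed.dt.strftime("%H:%M").fillna("00:00")
print(result.tolist())
```

Note that this makes missing values and 2400 indistinguishable from a real midnight (0). If that distinction matters for your analysis, you may prefer to leave the final fillna out and keep those rows as missing.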