I have a column with different timestamp formats as shown below. I want to convert the date column to readable date. Since the timestamp units are mixed, I find other converted properly while others default to 1970. Is there a way I can convert them together or convert them to a unix timestamp unit before converting to readable date so that it happens uniformly.
data = ["2022-04-14 17:31:03.023","2022-04-20 12:49:50.295",1647597943249,1647519101441,"2022-03-19 18:10:59.024"]
df = pd.DataFrame(data, columns=['date'])
df['newdate'] = pd.to_datetime(df['date'], unit='ns')
df
date newdate
0 2022-04-14 17:31:03.023 2022-04-14 17:31:03.023000
1 2022-04-20 12:49:50.295 2022-04-20 12:49:50.295000
2 1647597943249 1970-01-01 00:27:27.597943249
3 1647519101441 1970-01-01 00:27:27.519101441
4 2022-03-19 18:10:59.024 2022-03-19 18:10:59.024000
If I change unit to 'ms' I get
ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()
ValueError: could not convert string to float: '2022-04-14 17:31:03.023'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
4 frames
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()
ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'```
CodePudding user response:
Idea is replace values with missing values if not possible convert to datetimes and then use Series.fillna
:
df['newdate'] = (pd.to_datetime(df['date'], unit='ms', errors='coerce')
.fillna(pd.to_datetime(df['date'], errors='coerce')))
print (df)
date newdate
0 2022-04-14 17:31:03.023 2022-04-14 17:31:03.023
1 2022-04-20 12:49:50.295 2022-04-20 12:49:50.295
2 1647597943249 2022-03-18 10:05:43.249
3 1647519101441 2022-03-17 12:11:41.441
4 2022-03-19 18:10:59.024 2022-03-19 18:10:59.024