Home > front end >  How can I change timestamp to readable date in pandas when I have mixed timestamp formats
How can I change timestamp to readable date in pandas when I have mixed timestamp formats

Time:11-09

I have a column with different timestamp formats as shown below. I want to convert the date column to readable date. Since the timestamp units are mixed, I find other converted properly while others default to 1970. Is there a way I can convert them together or convert them to a unix timestamp unit before converting to readable date so that it happens uniformly.

data = ["2022-04-14 17:31:03.023","2022-04-20 12:49:50.295",1647597943249,1647519101441,"2022-03-19 18:10:59.024"]
df = pd.DataFrame(data, columns=['date'])
df['newdate'] = pd.to_datetime(df['date'], unit='ns')
df

    date                     newdate
0   2022-04-14 17:31:03.023  2022-04-14 17:31:03.023000
1   2022-04-20 12:49:50.295  2022-04-20 12:49:50.295000
2   1647597943249            1970-01-01 00:27:27.597943249
3   1647519101441            1970-01-01 00:27:27.519101441
4   2022-03-19 18:10:59.024  2022-03-19 18:10:59.024000

If I change unit to 'ms' I get


ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()

ValueError: could not convert string to float: '2022-04-14 17:31:03.023'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
4 frames
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()

ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'```

CodePudding user response:

Idea is replace values with missing values if not possible convert to datetimes and then use Series.fillna:

df['newdate'] = (pd.to_datetime(df['date'], unit='ms', errors='coerce')
                   .fillna(pd.to_datetime(df['date'], errors='coerce')))

print (df)
                      date                 newdate
0  2022-04-14 17:31:03.023 2022-04-14 17:31:03.023
1  2022-04-20 12:49:50.295 2022-04-20 12:49:50.295
2            1647597943249 2022-03-18 10:05:43.249
3            1647519101441 2022-03-17 12:11:41.441
4  2022-03-19 18:10:59.024 2022-03-19 18:10:59.024
  • Related