Correct parsing for a string datetime in ddmmyyyy:HH:MM:SS.xxx format

Time:02-26

I have the following datetime values in dask dataframe saved as string dates:

ddf = dd.from_pandas(pd.DataFrame({'date': ['15JAN1955:13:15:27.369', np.nan, '25DEC1990:23:18:17.200', '06MAY1962:02:55:27.360', np.nan, '20SEP1975:12:02:26.357']}), npartitions=1)

I used ddf['date'].apply(lambda x: datetime.strptime(x,"%d%b%Y:%H:%M:%S.%f"), meta=datetime) but I get a TypeError: strptime() argument 1 must be a str, not float error.

I am following the way dates are parsed in the book Data Science with Python and Dask.

Is the .%f expecting a float? Or maybe it has something to do with NaN values?

CodePudding user response:

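%f is not expecting a float: it parses the fractional part of the seconds, up to 6 digits. The TypeError comes from the NaN values, which are floats, being passed to strptime(). pd.to_datetime handles missing values for you and returns NaT for them.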

Also 20SEPT1975 should be 20SEP1975 (no T in month)

import pandas as pd
import numpy as np

df = pd.DataFrame({'date': ['15JAN1955:13:15:27.369', np.nan,
                            '25DEC1990:23:18:17.200', np.nan,
                            '06MAY1962:02:55:27.360', '20SEP1975:12:02:26.357']})

df['date'] = pd.to_datetime(df['date'], format="%d%b%Y:%H:%M:%S.%f")
print(df)
                     date
0 1955-01-15 13:15:27.369
1                     NaT
2 1990-12-25 23:18:17.200
3                     NaT
4 1962-05-06 02:55:27.360
5 1975-09-20 12:02:26.357
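
If you want to keep calling strptime() per element, as in the question, the missing values have to be skipped explicitly. The following is only a minimal sketch using the standard library; the helper name parse_or_none and the constant FMT are hypothetical, chosen for illustration, and it assumes English month abbreviations in the current locale:

```python
import math
from datetime import datetime

# Same format string as in the question.
FMT = "%d%b%Y:%H:%M:%S.%f"

def parse_or_none(value):
    """Parse a ddMONyyyy:HH:MM:SS.fff string; return None for non-strings such as NaN."""
    if not isinstance(value, str):
        # NaN is a float, which is what triggers the TypeError in strptime().
        return None
    return datetime.strptime(value, FMT)

values = ['15JAN1955:13:15:27.369', math.nan, '25DEC1990:23:18:17.200']
parsed = [parse_or_none(v) for v in values]
print(parsed)
```

A guarded function like this could be used in place of the lambda in the question, but pd.to_datetime with an explicit format is usually simpler.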