I have a pandas DataFrame with the following column (first column = index):
0 14:43:45:921
1 14:43:45:923
2 14:43:45:925
I would like to modify this column, or add another column with the time starting at 0:
0 00:00:00.000
1 00:00:00.002
2 00:00:00.004
So far, I've tried the following code:
df['time'].apply(pd.Timedelta)
This is giving me the following error: expected hh:mm:ss format
As I see it, the problem is to a) convert the time format from HH:MM:SS:fff to HH:MM:SS.fff and b) get the Timedelta conversion to work. Does anyone have any suggestions? Thanks!
CodePudding user response:
Use to_datetime:
s = pd.to_datetime(df['time'], format='%H:%M:%S:%f')
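Or Series.str.replace with to_timedelta (regex=True is required in recent pandas versions, where str.replace treats the pattern as a literal string by default):
s = pd.to_timedelta(df['time'].str.replace(r'(:)(\d+)$', r'.\2', regex=True))
Then subtract the first value: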
df['new'] = s.sub(s.iat[0])
print (df)
           time                    new
0  14:43:45:921        0 days 00:00:00
1  14:43:45:923 0 days 00:00:00.002000
2  14:43:45:925 0 days 00:00:00.004000
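For reference, here is a minimal self-contained sketch that runs both approaches on the sample values from the question and checks that they give the same elapsed times. The names elapsed_dt and elapsed_td are only illustrative, and the snippet assumes a reasonably recent pandas version.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"time": ["14:43:45:921", "14:43:45:923", "14:43:45:925"]})

# Approach 1: parse as datetimes. The date part defaults to 1900-01-01,
# which cancels out when the first value is subtracted.
s_dt = pd.to_datetime(df["time"], format="%H:%M:%S:%f")

# Approach 2: turn the last ':' into '.' and parse as timedeltas.
s_td = pd.to_timedelta(df["time"].str.replace(r":(\d+)$", r".\1", regex=True))

# Elapsed time relative to the first row.
elapsed_dt = s_dt - s_dt.iat[0]
elapsed_td = s_td - s_td.iat[0]

df["new"] = elapsed_dt
print(df)
```

Either series can be assigned to the new column; the to_datetime variant avoids the regular expression, while the to_timedelta variant never attaches a placeholder date.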
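If you need datetime.time values:
from datetime import datetime  # pd.datetime is not available in recent pandas versions
df['new'] = s.sub(s.iat[0])
df['new1'] = df['new'].apply(lambda x: (datetime.min + x).time())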
print (df)
           time                    new             new1
0  14:43:45:921        0 days 00:00:00         00:00:00
1  14:43:45:923 0 days 00:00:00.002000  00:00:00.002000
2  14:43:45:925 0 days 00:00:00.004000  00:00:00.004000
print (type(df.at[0, 'new']))
<class 'pandas._libs.tslibs.timedeltas.Timedelta'>
print (type(df.at[0, 'new1']))
<class 'datetime.time'>
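If what you want is the exact HH:MM:SS.fff text from the question (millisecond precision, no "0 days" prefix), one option is to format each Timedelta yourself. The sketch below is only an illustration: format_elapsed is a hypothetical helper, not a pandas function, and it assumes non-negative durations.

```python
import pandas as pd


def format_elapsed(td: pd.Timedelta) -> str:
    """Format a non-negative Timedelta as HH:MM:SS.fff (hypothetical helper)."""
    # Whole milliseconds contained in the duration.
    total_ms = int(td // pd.Timedelta(milliseconds=1))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


# Elapsed times like the ones computed above.
elapsed = pd.Series(pd.to_timedelta([0, 2, 4], unit="ms"))
formatted = elapsed.apply(format_elapsed)
print(formatted.tolist())
```

The result is a column of plain strings, so keep the Timedelta column around if you still need to do arithmetic on the values.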