I am dealing with a dataframe like this:
mydata['TS_START']
0 2022-11-09 00:00:00
1 2022-11-09 00:00:30
2 2022-11-09 00:01:00
3 2022-11-09 00:01:30
4 2022-11-09 00:02:00
...
I would like to create a new column where:
mydata['delta_t']
0 2022-11-09 00:00:30 - 2022-11-09 00:00:00
1 2022-11-09 00:01:00 - 2022-11-09 00:00:30
2 2022-11-09 00:01:30 - 2022-11-09 00:01:00
3 2022-11-09 00:02:00 - 2022-11-09 00:01:30
...
The result should be expressed in decimal hours, like this:
mydata['delta_t']
0 30/3600
1 30/3600
2 30/3600
3 30/3600
...
I obtained this result with a for loop, but it is very slow. I would like a faster, vectorized solution. Do you have any suggestions?
CodePudding user response:
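Here is one way. It gives the difference to the previous row, in seconds:
import pandas as pd

df['date'] = pd.to_datetime(df['date'])  # make sure the column is datetime64
# subtract the previous row's timestamp; the first row has no predecessor, so it is NaN
df['delta_t'] = (df['date'] - df['date'].shift(1)).dt.total_seconds()
print(df)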
Output:
                 date  delta_t
0 2022-11-09 00:00:00      NaN
1 2022-11-09 00:00:30     30.0
2 2022-11-09 00:01:00     30.0
3 2022-11-09 00:01:30     30.0
4 2022-11-09 00:02:00     30.0
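
To get the values in decimal hours, with each row holding the next timestamp minus the current one as the question asks, something like the sketch below may work. It reuses the question's names (mydata, TS_START); the sample data is built only for illustration and is an assumption. The last row has no successor, so it ends up as NaN.

```python
import pandas as pd

# Illustrative sample data mirroring the question's TS_START column
# (five timestamps spaced 30 seconds apart).
mydata = pd.DataFrame({
    "TS_START": pd.date_range(
        "2022-11-09 00:00:00", periods=5, freq=pd.Timedelta(seconds=30)
    )
})

# shift(-1) moves every value up one row, so this is "next minus current".
# The last row has nothing after it and becomes NaT.
step = mydata["TS_START"].shift(-1) - mydata["TS_START"]

# Convert the timedeltas to decimal hours (NaT becomes NaN).
mydata["delta_t"] = step.dt.total_seconds() / 3600

print(mydata)
```

Dividing the timedelta column by pd.Timedelta(hours=1) instead of calling total_seconds() should give the same values.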