Thanks to this answer, I have the following dataframe:
                 START_POINT END_POINT  DURATION
island Stage
1      SLEEP-S0     00:32:03  00:42:33     630.0
2      SLEEP-S1     00:42:33  00:45:03     150.0
3      SLEEP-S0     00:45:03  00:46:03      60.0
4      SLEEP-S1     00:46:03  00:48:33     150.0
5      SLEEP-S2     00:48:33  00:50:03      90.0
...    ...               ...       ...       ...
127    SLEEP-S2     09:32:03  09:39:03     420.0
128    SLEEP-S0     09:39:03  09:39:33      30.0
129    SLEEP-S1     09:39:33  09:40:03      30.0
130    SLEEP-S2     09:40:03  09:48:03     480.0
131    SLEEP-S0     09:48:03       NaN       NaN
However, I want to convert these clock times into elapsed seconds (as float or int), counted from t = 0 at the first START_POINT. For example, this is what I want:
                 START_POINT END_POINT  DURATION
island Stage
1      SLEEP-S0            0       630     630.0
2      SLEEP-S1          630       780     150.0
3      SLEEP-S0          780       840      60.0
4      SLEEP-S1          ...       ...       ...
5      SLEEP-S2          ...       ...       ...
...    ...               ...       ...       ...
and so on. Can someone please help?
CodePudding user response:
See this answer: How to standardize/normalize a date with pandas/numpy?
You could standardize the array of timestamps and then multiply the standardized values by 10000000 to convert them to ints.
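If the goal is simply seconds elapsed since the first START_POINT, a more direct route may be to parse the clock strings as timedeltas and subtract the first one. This is only a minimal sketch: it assumes the columns hold HH:MM:SS strings and that the recording does not cross midnight, and it uses a small hand-built frame plus the made-up column names start_s and end_s.

```python
import pandas as pd

# Hand-built sample mirroring the first rows of the question (an assumption,
# not the asker's real frame); the last END_POINT is missing on purpose.
df = pd.DataFrame(
    {
        "Stage": ["SLEEP-S0", "SLEEP-S1", "SLEEP-S0"],
        "START_POINT": ["00:32:03", "00:42:33", "00:45:03"],
        "END_POINT": ["00:42:33", "00:45:03", None],
    }
)

# Parse "HH:MM:SS" strings as offsets from midnight; missing values become NaT.
start = pd.to_timedelta(df["START_POINT"])
end = pd.to_timedelta(df["END_POINT"])

# Use the first start time as t = 0.
t0 = start.iloc[0]

# Hypothetical output columns holding elapsed seconds as floats.
df["start_s"] = (start - t0).dt.total_seconds()
df["end_s"] = (end - t0).dt.total_seconds()

print(df)
```

Unlike summing DURATION, this reads the times themselves, so it would still work if there were gaps between consecutive rows. A recording that passes midnight would need extra handling, which is not shown here.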
CodePudding user response:
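Here you go: the cumulative sum of DURATION gives the end times, and the same sum shifted down by one row gives the start times: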
df["start"] = df["DURATION"].cumsum().shift().fillna(0)
df["end"] = df["DURATION"].cumsum()
df
Out[5]:
                 START_POINT END_POINT  DURATION   start     end
island Stage
1      SLEEP-S0     00:32:03  00:42:33     630.0     0.0   630.0
2      SLEEP-S1     00:42:33  00:45:03     150.0   630.0   780.0
3      SLEEP-S0     00:45:03  00:46:03      60.0   780.0   840.0
4      SLEEP-S1     00:46:03  00:48:33     150.0   840.0   990.0
5      SLEEP-S2     00:48:33  00:50:03      90.0   990.0  1080.0
127    SLEEP-S2     09:32:03  09:39:03     420.0  1080.0  1500.0
128    SLEEP-S0     09:39:03  09:39:33      30.0  1500.0  1530.0
129    SLEEP-S1     09:39:33  09:40:03      30.0  1530.0  1560.0
130    SLEEP-S2     09:40:03  09:48:03     480.0  1560.0  2040.0
131    SLEEP-S0     09:48:03       NaN       NaN  2040.0     NaN
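Note that the output above appears to have been computed on only the ten rows visible in the question, which would explain why island 127 starts at 1080.0 right after island 5; on the full frame the later values will differ. For anyone who wants to try the idea in isolation, here is a self-contained sketch on a hand-built DURATION column (the sample values and the end_int column are assumptions for illustration). It uses shift(fill_value=0.0) so that only the first row is set to 0, and shows one possible way to get integers despite the trailing NaN.

```python
import pandas as pd

# Hand-built durations in seconds; the last one is unknown, as in the question.
df = pd.DataFrame({"DURATION": [630.0, 150.0, 60.0, 150.0, 90.0, float("nan")]})

# Running total of durations = end time of each row, measured from t = 0.
end = df["DURATION"].cumsum()

# Each row starts where the previous one ended; the first row starts at 0.
df["start"] = end.shift(fill_value=0.0)
df["end"] = end

# Optional: nullable integer dtype keeps the missing end as <NA>.
df["end_int"] = df["end"].astype("Int64")

print(df)
```

This relies on the rows being contiguous, i.e. each START_POINT equal to the previous END_POINT, which holds for the rows shown in the question.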