I have the following dataframe:
id key serial time
0 310000 first 100 2022-01-15 07:28:00
1 310000 second 100 2022-01-15 07:29:00
3 450000 first 200 2022-01-15 07:45:00
4 450000 second 200 2022-01-15 07:46:00
5 450000 third 200 2022-01-15 07:47:00
I want to workout the time difference in seconds between rows given that:
- They have same id and same serial
- Start time for the group is if key=first exist. Otherwise null.
Ideal outcome:
id key serial time difference
0 310000 first 100 2022-01-15 07:28:00 0
1 310000 second 100 2022-01-15 07:29:00 60
3 450000 first 200 2022-01-15 07:45:00 0
4 450000 second 200 2022-01-15 07:46:00 60
5 450000 third 200 2022-01-15 07:47:00 120
CodePudding user response:
updated answer: difference from start of group
df['time'] = pd.to_datetime(df['time'])
df['difference'] = (df.groupby(['id', 'serial'])['time']
.transform(lambda g: g-g.iloc[0])
.dt.total_seconds()
.fillna(0, downcast='infer')
)
output:
id key serial time difference
0 310000 first 100 2022-01-15 07:28:00 0
1 310000 second 100 2022-01-15 07:29:00 60
3 450000 first 200 2022-01-15 07:45:00 0
4 450000 second 200 2022-01-15 07:46:00 60
5 450000 third 200 2022-01-15 07:47:00 120
older answer
Use a groupby.diff
:
df['time'] = pd.to_datetime(df['time'])
df['difference'] = (df.groupby(['id', 'serial'])['time']
.diff().fillna('0', downcast='infer')
)
Output:
id key serial time difference
0 310000 first 100 2022-01-15 07:28:00 0 days 00:00:00
1 310000 second 100 2022-01-15 07:29:00 0 days 00:01:00
3 450000 first 200 2022-01-15 07:45:00 0 days 00:00:00
4 450000 second 200 2022-01-15 07:46:00 0 days 00:01:00
5 450000 third 200 2022-01-15 07:47:00 0 days 00:01:00
Seconds as integers:
df['time'] = pd.to_datetime(df['time'])
df['difference'] = (df.groupby(['id', 'serial'])['time']
.diff()
.dt.total_seconds()
.fillna(0, downcast='infer')
)
Output:
id key serial time difference
0 310000 first 100 2022-01-15 07:28:00 0
1 310000 second 100 2022-01-15 07:29:00 60
3 450000 first 200 2022-01-15 07:45:00 0
4 450000 second 200 2022-01-15 07:46:00 60
5 450000 third 200 2022-01-15 07:47:00 60