I'm wanting to find how long LocID_2 has spend at location Loc_2.
For example, the time LocID_2 006 spent at Loc_2 11.0 is 1 second. The time LocID_2 006 spent at Loc_2 12.0 is 3 second. The time LocID_2 003 spent at Loc_2 14.0 is 11 second, and so on. I need to derive time_at_loc_1
TimeStamp Loc_1 Loc_2 LocID_2 time_at_loc_1
0 2022-07-24 07:06:48 10.331905 11.0 006 0:00:01
1 2022-07-24 07:06:49 10.479264 11.0 006 0:00:01
2 2022-07-24 07:06:50 10.538208 12.0 006 0:00:03
3 2022-07-24 07:06:51 10.685568 12.0 006 0:00:03
4 2022-07-24 07:06:52 10.744512 12.0 006 0:00:03
5 2022-07-24 07:06:53 10.862399 12.0 006 0:00:03
6 2022-07-24 07:07:32 10.862399 17.0 006 0:00:05
7 2022-07-24 07:07:33 10.891870 17.0 006 0:00:05
8 2022-07-24 07:07:35 11.068702 17.0 006 0:00:05
9 2022-07-24 07:07:37 11.186589 17.0 006 0:00:05
10 2022-07-24 07:07:39 11.363420 14.0 003 0:00:11
11 2022-07-24 07:07:50 15.422364 14.0 003 0:00:11
The code I currently have. Just not sure how to iterate properly to derive the time difference and allocate correctly. Currently it takes that last derived time (e.g. 0:00:11) and allocates it to the entire 'time_at_loc_1' column.
lst_RFID = []
for k in df['Loc_2']:
lst_RFID.append(k)
timedelta = []
for x in set(lst_RFID):
df_RFID = df.loc[df['Loc_2'].isin([x])]
time_min1 = pd.to_datetime(df_RFID["TimeStamp"].min(), format='%Y-%m-%d %H:%M:%S')
time_max1 = pd.to_datetime(df_RFID["TimeStamp"].max(), format='%Y-%m-%d %H:%M:%S')
time_over1 = time_max1 - time_min1
for index, row in df.iterrows():
time_delta = str(datetime.timedelta(seconds=time_over1.seconds))
if row['Loc_2']) == x:
df['time_at_loc_1'] = time_delta
Assistance greatly appreciated to derive 'time_at_loc_1'
tks
CodePudding user response:
Iterrows is a trap. It's basically never the best option, and often leads to more trouble than it's worth. Doing things through manual iteration throws out the whole point of using pandas.
df.TimeStamp = pd.to_datetime(df.TimeStamp)
df['time_at_loc_1'] = df.groupby('Loc_2')['TimeStamp'].transform(lambda x: x.max()-x.min())
print(df)
Output:
TimeStamp Loc_1 Loc_2 LocID_2 time_at_loc_1
0 2022-07-24 07:06:48 10.331905 11.0 006 0 days 00:00:01
1 2022-07-24 07:06:49 10.479264 11.0 006 0 days 00:00:01
2 2022-07-24 07:06:50 10.538208 12.0 006 0 days 00:00:03
3 2022-07-24 07:06:51 10.685568 12.0 006 0 days 00:00:03
4 2022-07-24 07:06:52 10.744512 12.0 006 0 days 00:00:03
5 2022-07-24 07:06:53 10.862399 12.0 006 0 days 00:00:03
6 2022-07-24 07:07:32 10.862399 17.0 006 0 days 00:00:05
7 2022-07-24 07:07:33 10.891870 17.0 006 0 days 00:00:05
8 2022-07-24 07:07:35 11.068702 17.0 006 0 days 00:00:05
9 2022-07-24 07:07:37 11.186589 17.0 006 0 days 00:00:05
10 2022-07-24 07:07:39 11.363420 14.0 003 0 days 00:00:11
11 2022-07-24 07:07:50 15.422364 14.0 003 0 days 00:00:11