i have this dataframe
Matricule DateTime Date Time
1 1000 2022-01-06 10:59:51 2022-01-06 10:59:51
2 1000 2022-01-07 08:40:09 2022-01-07 08:40:09
3 1000 2022-01-26 15:39:10 2022-01-26 15:39:10
4 1104 2022-01-03 14:33:38 2022-01-03 14:33:38
81 1104 2022-01-04 10:04:18 2022-01-04 10:04:18
... ... ... ... ...
1548 1803 2022-01-24 15:51:22 2022-01-24 15:51:22
1549 1803 2022-01-24 15:51:29 2022-01-24 15:51:29
1550 1803 2022-01-24 16:54:23 2022-01-24 16:54:23
1551 1803 2022-01-28 14:42:01 2022-01-28 14:42:01
1552 1803 2022-01-28 14:42:32 2022-01-28 14:42:32
i want to calculate time difference between the first time of the day and last time of the day for each day for every employee to know how much hours he spent at work daily for exemple
Matricule Date WorkTime
1 1000 2022-01-06 1
2 1000 2022-01-07 3
3 1000 2022-01-26 5
4 1104 2022-01-03 2
81 1104 2022-01-04 8
CodePudding user response:
you can use the split-apply-combine approach, write a func for each group and apply on the groupby:
grpd = df.groupby(['Matricule', 'Date'])
def get_hours(df):
start = df['Time'].min()
end = df['Time'].max()
new_df = pd.DataFrame([end-start], columns=['WorkTime'])
return new_df
grpd.apply(get_hours)