I have a data frame of employees and job times covering a month. I need to number each employee's jobs within every given day, like so:
employee | job_time | nth_job |
---|---|---|
Harry | 1/11/21 9:00 | 1 |
Harry | 1/11/21 9:30 | 2 |
Harry | 1/11/21 10:00 | 3 |
Harry | 2/11/21 9:00 | 1 |
Harry | 2/11/21 9:30 | 2 |
Ben | 1/11/21 9:00 | 1 |
Ben | 1/11/21 9:30 | 2 |
Ben | 1/11/21 10:00 | 3 |
Ben | 2/11/21 9:00 | 1 |
Ben | 2/11/21 9:30 | 2 |
Thank you.
CodePudding user response:
Convert the column to datetimes, so it is possible to use GroupBy.cumcount
with daily periods (the counter should restart every day; monthly periods would only restart it each month):
import pandas as pd
#format MM/DD/YY HH:MM
df['job_time'] = pd.to_datetime(df['job_time'], format='%m/%d/%y %H:%M')
#format DD/MM/YY HH:MM
#df['job_time'] = pd.to_datetime(df['job_time'], format='%d/%m/%y %H:%M')
#count rows per employee and calendar day, starting from 1
df['nth'] = df.groupby(['employee', df['job_time'].dt.to_period('D')]).cumcount().add(1)
print(df)
  employee            job_time  nth_job  nth
0    Harry 2021-01-11 09:00:00        1    1
1    Harry 2021-01-11 09:30:00        2    2
2    Harry 2021-01-11 10:00:00        3    3
3    Harry 2021-02-11 09:00:00        1    1
4    Harry 2021-02-11 09:30:00        2    2
5      Ben 2021-01-11 09:00:00        1    1
6      Ben 2021-01-11 09:30:00        2    2
7      Ben 2021-01-11 10:00:00        3    3
8      Ben 2021-02-11 09:00:00        1    1
9      Ben 2021-02-11 09:30:00        2    2
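Since the question asks for a counter that restarts each day, here is a minimal self-contained sketch that rebuilds the sample data and groups by employee and calendar day. It assumes the dates are day-first (DD/MM/YY, i.e. 1 and 2 November 2021), which the question does not state explicitly; the helper column name job_day is made up for illustration. Sorting by job_time before counting means the result does not depend on the rows already being in chronological order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "employee": ["Harry"] * 5 + ["Ben"] * 5,
    "job_time": ["1/11/21 9:00", "1/11/21 9:30", "1/11/21 10:00",
                 "2/11/21 9:00", "2/11/21 9:30"] * 2,
})

# Assumption: day-first format (DD/MM/YY HH:MM).
df["job_time"] = pd.to_datetime(df["job_time"], format="%d/%m/%y %H:%M")

# Hypothetical helper column: the timestamp truncated to midnight,
# so every job on the same calendar day shares one key.
df["job_day"] = df["job_time"].dt.normalize()

# Number the jobs per employee and day in chronological order.
# cumcount() starts at 0, hence add(1); the assignment aligns on the
# index, so the original row order of df is preserved.
df["nth_job"] = (
    df.sort_values("job_time")
      .groupby(["employee", "job_day"])
      .cumcount()
      .add(1)
)

print(df[["employee", "job_time", "nth_job"]])
```

If the helper column is not wanted afterwards, it can be removed with df.drop(columns="job_day").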