EDIT: This solution worked for me
I have the following dataframe in Python:
Name | days | Start Date | End Date |
---|---|---|---|
EMP1 | 15 | 8/8/22 | 8/26/22 |
EMP2 | 3 | 6/9/22 | 6/13/22 |
EMP3 | 5 | 8/22/22 | 8/26/22 |
EMP3 | 5 | 8/1/22 | 8/5/22 |
EMP3 | 6 | 6/17/22 | 6/24/22 |
EMP4 | 4.5 | 7/18/22 | 7/22/22 |
EMP5 | 5 | 7/18/22 | 7/22/22 |
EMP6 | 5 | 8/15/22 | 8/19/22 |
EMP7 | 9 | 8/22/22 | 9/2/22 |
I need to generate the list of dates between the start and end date for each row while keeping the employee name. Is this simple to do in Python? I'm really struggling to get it to work row by row.
Desired output (one row for each day between the start and end date, with the employee's name):
Name | date |
---|---|
EMP1 | 8/8/22 |
EMP1 | 8/9/22 |
EMP1 | 8/10/22 |
EMP1 | 8/11/22 |
EMP1 | 8/12/22 |
EMP1 | 8/13/22 |
EMP1 | 8/14/22 |
EMP1 | 8/15/22 |
CodePudding user response:
Convert the start and end dates of each row to a date range, then explode on that column:
import pandas as pd
# Build one DatetimeIndex per row, then explode it into one row per day.
df['Date'] = df.apply(lambda x: pd.date_range(start=x['Start Date'], end=x['End Date']), axis=1)
output = df.explode('Date').drop(columns=['days', 'Start Date', 'End Date'])
Output:
Name Date
0 EMP1 2022-08-08
0 EMP1 2022-08-09
0 EMP1 2022-08-10
0 EMP1 2022-08-11
0 EMP1 2022-08-12
.. ... ...
8 EMP7 2022-08-29
8 EMP7 2022-08-30
8 EMP7 2022-08-31
8 EMP7 2022-09-01
8 EMP7 2022-09-02
[69 rows x 2 columns]
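For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same date_range + explode idea. It rebuilds only three of the rows from the question, parses the dates explicitly (assuming they are month/day/two-digit-year strings), and uses a list comprehension instead of `apply`; those choices are mine, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question (three of the nine rows).
df = pd.DataFrame({
    "Name": ["EMP1", "EMP2", "EMP7"],
    "days": [15, 3, 9],
    "Start Date": ["8/8/22", "6/9/22", "8/22/22"],
    "End Date": ["8/26/22", "6/13/22", "9/2/22"],
})

# Assumption: dates are month/day/two-digit-year strings, so parse them explicitly.
for col in ["Start Date", "End Date"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%y")

# One DatetimeIndex per row (both endpoints included), built without DataFrame.apply.
df["Date"] = [
    pd.date_range(start=start, end=end)
    for start, end in zip(df["Start Date"], df["End Date"])
]

# explode repeats each Name once per date; reset_index replaces the
# repeated row labels (0, 0, 0, ...) with a clean 0..n-1 index.
output = df[["Name", "Date"]].explode("Date").reset_index(drop=True)
print(output)
```

Note that `pd.date_range` lists every calendar day, weekends included, which matches the desired output in the question (8/13/22 and 8/14/22 are listed for EMP1).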