Home > Mobile >  Corresponding week column for each different participant with starting dates
Corresponding week column for each different participant with starting dates

Time:10-05

I have a sample dataframe as given below.

import pandas as pd
data = {'ID':['A','A','A','A','A', 'B','B','B','C','C','C','C'],
    'Date':['9/20/2021  4:34:57 AM', '9/21/2021  6:54:22 PM', '9/28/2021  5:54:22 PM', 
           '9/30/2021  6:54:22 PM', '10/5/2021  6:54:22 PM',
           '9/4/2021  7:04:38 PM','9/9/2021  7:04:38 PM','9/17/2021  7:04:38 PM',
           '9/28/2021  3:21:23 AM','9/30/2021  3:21:23 AM','10/05/2021  3:21:23 
            AM','10/15/2021  3:21:23 AM']}
    
 df1 = pd.DataFrame(data)
 df1

The 'Date' column is in datetime format.

Now for each participant, the first entry is the enrolled date and that would be the start of week 1 for that particular participant. I want corresponding week number for the subsequent date entries, considering the first entry as Day 0.

The final dataframe should look like the image given below.

enter image description here

Any help is greatly appreciated. Thank you.

CodePudding user response:

Try:

df['Date'] = pd.to_datetime(df['Date'])

df['Week'] = df.groupby('ID')['Date'] \
               .apply(lambda t: 1   (t - min(t)).dt.days // 7)

df['Week'] = 'Week'   df['Week'].astype(str)

Output:

>>> df
   ID                Date   Week
0   A 2021-09-20 04:34:57  Week1
1   A 2021-09-21 18:54:22  Week1
2   A 2021-09-28 17:54:22  Week2
3   A 2021-09-30 18:54:22  Week2
4   A 2021-10-05 18:54:22  Week3
5   B 2021-09-04 19:04:38  Week1
6   B 2021-09-09 19:04:38  Week1
7   B 2021-09-17 19:04:38  Week2
8   C 2021-09-28 03:21:23  Week1
9   C 2021-09-30 03:21:23  Week1
10  C 2021-10-05 03:21:23  Week2
11  C 2021-10-15 03:21:23  Week3

Fixed by @ddejohn

CodePudding user response:

You can use:

df1['Date'] = pd.to_datetime(df1['Date'])

df1['Week'] = (df1.groupby('ID')['Date'].diff().dt.days
                  .groupby(df1['ID']).cumsum()
                  .fillna(0, downcast='infer') // 7   1)

df1['Week'] = 'Week'   df1['Week'].astype(str)

Result:

print(df1)

   ID                Date   Week
0   A 2021-09-20 04:34:57  Week1
1   A 2021-09-21 18:54:22  Week1
2   A 2021-09-28 17:54:22  Week2
3   A 2021-09-30 18:54:22  Week2
4   A 2021-10-05 18:54:22  Week3
5   B 2021-09-04 19:04:38  Week1
6   B 2021-09-09 19:04:38  Week1
7   B 2021-09-17 19:04:38  Week2
8   C 2021-09-28 03:21:23  Week1
9   C 2021-09-30 03:21:23  Week1
10  C 2021-10-05 03:21:23  Week2
11  C 2021-10-15 03:21:23  Week3
  • Related