Home > Software design >  adding session counter in existing Dataframe
adding session counter in existing Dataframe

Time:09-27

Existing Dataframe :

Unique_id       Landing_Page     Action        Action_Date
A1                Home           Opened         10-12-2021 10:15:30.255
A1                Home           Opened         10-12-2021 10:16:25.355
A1                Home           Opened         10-12-2021 10:20:30.400
A1                T&C            Clicked        10-12-2021 10:21:26.100
A1               Pay_page        Success        10-12-2021 10:23:30.250
A1                Home           Opened         10-12-2021 10:40:50.255 
B1                Home           Opened         12-12-2021 11:25:40.900

Expected Dataframe :

    Unique_id       Landing_Page     Action             Action_Date              Session_Conter
    A1                Home           Opened         10-12-2021 10:15:30.255             1
    A1                Home           Opened         10-12-2021 10:16:25.355             2
    A1                Home           Opened         10-12-2021 10:20:30.400             3
    A1                T&C            Clicked        10-12-2021 10:21:26.100             3
    A1               Pay_page        Success        10-12-2021 10:23:30.250             3
    A1                Home           Opened         10-12-2021 10:40:50.255             4
    B1                 Home           Opened         12-12-2021 11:25:40.900            1

New column named 'Session_Counter' needs to be added. Conditions for Counting a session : if Landing Page is Home and Action is Opened that is the start of new session.

Tried Grouping the Unique_id and applied cumsum() , but stuck with applying the condition.

Also, is it possible to calculate the session timing(total time spent) for the successful session(successful session implies , session starting with(Landing Page = Home , Action = opened) and ending with Action = Success)

CodePudding user response:

with this i got the expected answer..!! Thanks @Ynjxsjmh for the heads up

df['Session_Conter'] = ((df['Landing_Page'].eq('Home') & df['Action'].eq('Opened'))
                              .groupby(df['Unique_id']).cumsum())

CodePudding user response:

You can cumsum on the condition when Landing_Page is Home and Action is Opened

df['Session_Conter'] = (df.groupby('Unique_id')
                        .apply(lambda g: (g['Landing_Page'].eq('Home') & g['Action'].eq('Opened')).cumsum())
                        .reset_index(drop=True))
print(df)

  Unique_id Landing_Page   Action              Action_Date  Session_Conter
0        A1         Home   Opened  10-12-2021 10:15:30.255               1
1        A1         Home   Opened  10-12-2021 10:16:25.355               2
2        A1         Home   Opened  10-12-2021 10:20:30.400               3
3        A1          T&C  Clicked  10-12-2021 10:21:26.100               3
4        A1     Pay_page  Success  10-12-2021 10:23:30.250               3
5        A1         Home   Opened  10-12-2021 10:40:50.255               4
6        B1         Home   Opened  12-12-2021 11:25:40.900               1
  • Related