Home > Software design >  How to find the total number of events before the second login time
How to find the total number of events before the second login time

Time:12-20

In BigQuery, I'm trying to find the total number of events before the second login time.

For different user ids, I have multiple events such as "scroll," "user engagement," "log in," "first_visit," "sign_up" etc. For simplicity, let's consider the above as all the events.

For instance, for user_id 2, I have the following information extracted from the raw data (this is a snapshot of the table).

User_id Event_name EventTime
2 scroll 2022-10-31 12:28:35
2 sign_up 2022-10-29 08:11:29
2 login 2022-11-01 16:46:34
2 first_visit 2022-10-30 10:45:22
2 login 2022-11-04 08:10:38
2 scroll 2022-11-05 11:18:35
2 user engagement 2022-11-06 08:45:17
2 user engagement 2022-11-07 05:27:32

First, I found the second login time for each user id.

WITH cte AS (
    SELECT *, RANK() OVER (PARTITION BY User_id ORDER BY LoginTime) rnk
    FROM MyData
)

SELECT User_id, LoginTime AS SecondLoginTime
FROM cte
WHERE rnk = 2
ORDER BY User_id;

User_id SecondLoginTime
1 2022-11-07 09:52:27
2 2022-11-04 08:10:38

I wanted to write a query where I can compare this login time to each event time (for all user ids)and count the events before the SecondLoginTime.

For instance, for user_id 2, I want to make the following comparison:

User_id Event_name EventTime SecondLoginTime
2 scroll 2022-10-31 12:28:35 2022-11-04 08:10:38
2 sign_up 2022-10-29 08:11:29 2022-11-04 08:10:38
2 login 2022-11-01 16:46:34 2022-11-04 08:10:38
2 first_visit 2022-10-30 10:45:22 2022-11-04 08:10:38
2 login 2022-11-04 08:10:38 2022-11-04 08:10:38
2 scroll 2022-11-05 11:18:35 2022-11-04 08:10:38
2 user engagement 2022-11-06 08:45:17 2022-11-04 08:10:38
2 user engagement 2022-11-07 05:27:32 2022-11-04 08:10:38

And find the followig result:

User_id TotalEventsBeforeFirstVisit
2 4

I also want to apply this logic to all user ids.

Is there a way to do this? Please kindly ask me for clarification if there is anything misssing or if the problem is unclear. I'd really appreciate your help.

CodePudding user response:

This should work:

with LI2 AS (
    SELECT User_Id, EventTime as LoginTime, ROW_NUMBER() OVER (PARTITION BY User_id ORDER BY EventTime) Seq
    FROM MyTbl
    where Event_Name='login'
)
SELECT 
    LI2.User_id
  , LI2.LoginTime AS SecondLoginTime
  , COUNT(OE.User_ID) as EventsBefore2ndLogin
FROM LI2
     left join
     MyTbl OE -- other events before 2nd login
     on OE.User_Id=LI2.USer_Id
        and OE.EventTime<LI2.LoginTime
WHERE LI2.Seq = 2
GROUP BY 
    LI2.User_id
  , LI2.LoginTime
ORDER BY LI2.User_id

MyTbl is your table.

CodePudding user response:

You might consider below.

WITH cte AS (
  SELECT *, NTH_VALUE(IF(Event_name = 'login', EventTime, NULL), 2 IGNORE NULLS) OVER w AS SecondLoginTime
    FROM MyData
  WINDOW w AS (PARTITION BY User_id ORDER BY EventTime ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
)
SELECT User_id, COUNTIF(EventTime < SecondLoginTime) AS TotalEventsBeforeSecondLogin
  FROM cte
 GROUP BY 1;

Query results

enter image description here

  • Related