Home > front end >  Snowflake SQL - Nested Counting with Date Spine
Snowflake SQL - Nested Counting with Date Spine

Time:07-16

CONTEXT:

I have a user sessions table that has the following structure. This table tracks daily user sessions.

SESSION_DATE    USER_ID
1/1/2022         1
1/1/2022         2
1/1/2022         3
1/1/2022         4
1/2/2022         5

Additionally, I have a billing table. This table tracks user ids that are billed daily.

BILLED_DATE USER_ID
1/1/2022    1
1/1/2022    2
1/1/2022    3
1/2/2022    4
1/2/2022    5

QUESTION

Taking this data and truncating it to a weekly level, of user sessions, what count of users are also being billed weekly?

So for example, lets say there are something like 100k weekly user sessions (distinct count of user_ID). Of those weekly 100k user session, how many are billed weekly?

Target Output

WEEK    WEEKLY_COUNT_OF_USER_SESSIONS   WEEKLY_BILLED_USERS
1/1/2022    100k                        40K
1/8/2022    110k                        50k
1/15/2022   105k                        45K
1/22/2022   120k                        56K

ATTEMPTED TRIAL

I am trying to get a count within a count. Of Weekly user sessions, in that week, how many of those users also were billed. The second column should have a smaller count of users than the middle column above. The query below for some reason isnt accurately counting weekly count of distinct user ids and is returning 0 for weekly billed users

    WITH USER_SESSIONS_CTE AS (
    SELECT DATE_TRUNC(week, EVENT_DATE)::DATE AS WEEK,
                USER_ID,
                'Active_User_Session' AS STATUS_1
    FROM USER_SESSIONS
 
    ),
        BILLED_USERS_CTE AS (
            SELECT U.USER_ID,
                   DATE_TRUNC(week,DATE)::DATE AS WEEK,
                   'BILLED_USER' AS STATUS_2,
                   S.STATUS_1
                FROM BILLED_USERS U
            FULL JOIN USER_SESSIONS_CTE S
                ON S.USER_ID = U.USER_ID AND S.WEEK = DATE_TRUNC(week, U.DATE)::DATE
        )
    
    SELECT Week,
           COUNT(DISTINCT CASE WHEN STATUS_1 = 'Active_User_Session' THEN USER_ID END) AS WEEKLY_COUNT_OF_USER_SESSIONS,
           COUNT(DISTINCT CASE WHEN STATUS_2 = 'BILLED_USER' THEN USER_ID END) AS WEEKLY_BILLED_USERS
        FROM BILLED_USERS_CTE
GROUP BY 1
ORDER BY 1 DESC

CodePudding user response:

Need more data to verify, but this should be more directionally accurate for an approach:

-- Create tables:

    create
    or replace table user_sessions (event_date date, user_id integer);
    create
    or replace table billed_users (event_date date, user_id integer);

-- Dummy data:

    insert into
        user_sessions
    values
        (to_date('1/1/2022', 'mm/dd/yyyy'), 1),(to_date('1/1/2022', 'mm/dd/yyyy'), 2),
        (to_date('1/1/2022', 'mm/dd/yyyy'), 3),
        (to_date('1/1/2022', 'mm/dd/yyyy'), 4),(to_date('1/2/2022', 'mm/dd/yyyy'), 5);
    insert into
        billed_users
    values
        (to_date('1/1/2022', 'mm/dd/yyyy'), 1),(to_date('1/1/2022', 'mm/dd/yyyy'), 2),
        (to_date('1/1/2022', 'mm/dd/yyyy'), 3),
        (to_date('1/2/2022', 'mm/dd/yyyy'), 4),(to_date('1/2/2022', 'mm/dd/yyyy'), 5);

-- Create views:

    create
        or replace view v_USER_SESSIONS AS (
            SELECT
                DATE_TRUNC(
                    week,
                    event_date)::DATE AS WEEK,
                    user_id,
                    count(USER_ID) SESSION_COUNT,
                    'Active_User_Session' AS SESSION
                    FROM
                        USER_SESSIONS
                    group by
                        1,
                        2
                );

 create or replace view v_BILLED_USERS AS (
                    SELECT
                        DATE_TRUNC(
                            week,
                            event_date)::DATE AS WEEK,
                            user_id,
                            count(USER_ID) BILL_COUNT,
                            'BILLED_USERS' AS BILLED
                            FROM
                                BILLED_USERS
                            group by
                                1,
                                2
                        );

-- Generate results

    SELECT
        vbu.Week,
        sum(vbu.bill_count) as billed,
        sum(vus.session_count) as sessions
    from
        v_billed_users vbu 
        full join v_user_sessions vus on vus.user_id = vbu.user_id and 
        vus.week = vbu.week
    group by
        1;

Results (limited to sample, more data spanning weeks should prove more accurate):

WEEK        BILLED  SESSIONS
2021-12-27  5       5

CodePudding user response:

Answer that worked as shown below

WITH
         USER_SESSION AS (
            SELECT
                DATE_TRUNC(week,event_date)::DATE AS WEEK,
                    USER_ID                 as ACTIVE_USER
                    FROM USER_SESSION
                    order by 1 desc
                ),

BILLED_USER AS (
                    SELECT
                        DATE_TRUNC(week, to_timestamp(DATE_ASSIGNED/1000))::DATE AS WEEK,
                            USERID                                               AS BILLED_USER       
                    FROM
                        BILLED_USERS
                    GROUP BY 1,2
                        )

SELECT a.WEEK,
       COUNT(DISTINCT a.ACTIVE_USER)                                          AS USER_SESSIONS_TOTAL,
       COUNT(DISTINCT aa.ASS_USER)                                            AS ASSIGNED_USERS_TOTAL

FROM USER_SESSION a
LEFT JOIN BILLED_USER aa
    ON a.ACTIVE_USER = aa.ASS_USER AND a.WEEK = aa.WEEK
GROUP BY 1
ORDER BY 1 DESC;
  • Related