Home > Net >  Cumulative sum of counts is not working for distinct values of users
Cumulative sum of counts is not working for distinct values of users

Time:06-29

Table "users":

id name email created_at
46 FSDSD2 [email protected] 2022-05-29 14:19:21
47 Fxz3 [email protected] 2022-05-30 20:12:15
48 Fgh3 [email protected] 2022-05-31 20:12:15
49 Fghxc3 [email protected] 2022-06-01 20:12:15
50 Fdx3 [email protected] 2022-06-02 20:12:15
51 Fg3q3 [email protected] 2022-06-03 20:12:15
88 Fbhgt [email protected] 2022-05-23 16:38:41
112 Fht [email protected] 2022-05-24 16:19:23
113 Y14gss [email protected] 2022-05-25 16:42:44
114 sfhf [email protected] 2022-05-26 12:10:40
115 A2czu [email protected] 2022-05-27 14:00:31
116 Cc1sn [email protected] 2022-05-28 12:04:56

Table "oxygen_point_earns":

id user_id oxygen_point created_at
2 116 50.00 2022-05-23 17:49:30
3 113 10.00 2022-05-24 07:49:46
4 114 10.00 2022-05-25 07:50:42
5 46 50.00 2022-05-26 07:55:19
6 47 40.00 2022-05-27 13:28:17
7 48 30.00 2022-05-28 13:32:19
8 49 10.00 2022-05-29 13:32:19
9 50 5.00 2022-05-30 13:32:19
10 51 10.00 2022-05-31 13:32:19
11 88 20.00 2022-06-01 13:32:19
12 112 50.00 2022-06-02 13:32:19
13 115 10.00 2022-06-03 13:32:19
14 112 20.00 2022-06-03 16:32:19

I have two tables:

  • "users", which stores users basic information
  • "oxygen_point_earns", which stores oxygen points earned by specific users

The "users" table has 12 rows, though the "oxygen_point_earns" table contains 13 records, which means that one user can win points even more than once.

I was trying to made some calculation between those tables (e.g. dividing the total of weekly gained points by the weekly users cumulative sum, for each user). The problem occurs when I attempt to get the users cumulative sum.

SELECT STR_TO_DATE(CONCAT(YEARWEEK(op.created_at), ' Sunday'), '%X%V %W') AS week,
       SUM(COUNT(*)) OVER(ORDER BY MIN(op.created_at)) AS user_count,
       SUM(op.oxygen_point)                            AS op_weekly
FROM      users us
LEFT JOIN oxygen_point_earns op 
       ON us.id = op.user_id
GROUP  BY week
ORDER  BY week 

This query gets me the following output:

enter description here

As you can see, even though the points are correctly computed, the total user count is wrong at the second row: it should be 12 instead of 13 (First week I got 6 users then next week 6 more users registered. So my total user count is 12. On second row I should get 12.)

I tried DISTINCT, GROUP_CONCAT but didn't work. How can I fix this query to get true result of users counts?

CodePudding user response:

One straightforward option is to separate the two operations (aggregation and windowing) using a subquery/cte:

WITH cte AS (
    SELECT STR_TO_DATE(CONCAT(YEARWEEK(op.created_at), ' Sunday'), '%X%V %W') AS week,
           COUNT(DISTINCT user_id)                                            AS cnt,
           SUM(op.oxygen_point)                                               AS op_weekly
    FROM      users us
    LEFT JOIN oxygen_point_earns op ON us.id = op.user_id
    GROUP  BY week
)
SELECT week,
       SUM(cnt) OVER(ORDER BY week)  AS user_count,
       op_weekly
FROM cte
ORDER BY week 
  • Related