Firstly, I got two tables one of them for storing points with their related id. Another table for users simply I store users there.
I got 12 users in my users table. I got 13 records in my oxygen_point_earns table. That means one user can win points more than once.
I was trying to made some calculation between those tables. For example I want to divide below values according to my query;
total_gained_points_weekly / users_per_week_cumulative_sum
I got total points there is no issue then when I want to take user_count I got it wrong.
There is my query;
SELECT str_to_date(concat(yearweek(op.created_at), ' Sunday'), '%X%V %W') AS week,
sum(count(*)) over ( order by min(op.created_at)) AS user_count,
sum(op.oxygen_point) as op_weekly
FROM users us
LEFT JOIN oxygen_point_earns op
ON us.id = op.user_id
GROUP BY week
ORDER BY week
The result is;
as you can see the total user count is wrong at second row. But points calculated well. Even I got 12 users in the table so the count at second row should be 12 but how the output count is 13?
First week I got 6 users then next week 6 more users registered. So my total user count is 12. On second row I should get 12..
I tried DISTINCT, GROUP_CONCAT but didn't work.
How can I fix this query to get true result of users counts.
example users table csv file:
id,name,email,created_at
46,FSDSD2,[email protected],"2022-05-29 14:19:21",
47,Fxz3,[email protected],"2022-05-30 20:12:15",
48,Fgh3,[email protected],"2022-05-31 20:12:15",
49,Fghxc3,[email protected],"2022-06-01 20:12:15",
50,Fdx3,[email protected],"2022-06-02 20:12:15",
51,Fg3q3,[email protected],"2022-06-03 20:12:15",
88,Fbhgt,[email protected],"2022-05-23 16:38:41",
112,Fht,[email protected],"2022-05-24 16:19:23",
113,Y14gss,[email protected],"2022-05-25 16:42:44",
114,sfhf,[email protected],"2022-05-26 12:10:40",
115,A2czu,[email protected],"2022-05-27 14:00:31",
116,Cc1sn,[email protected],"2022-05-28 12:04:56"
so above users can gain points more than once..
Example oxygen_point_earns table csv file;
id,user_id,oxygen_point,created_at,
2,116,50.00,"2022-05-23 17:49:30",
3,113,10.00,"2022-05-24 07:49:46",
4,114,10.00,"2022-05-25 07:50:42",
5,46,50.00,"2022-05-26 07:55:19",
6,47,40.00,"2022-05-27 13:28:17",
7,48,30.00,"2022-05-28 13:32:19",
8,49,10.00,"2022-05-29 13:32:19",
9,50,5.00,"2022-05-30 13:32:19",
10,51,10.00,"2022-05-31 13:32:19",
11,88,20.00,"2022-06-01 13:32:19",
12,112,50.00,"2022-06-02 13:32:19",
13,115,10.00,"2022-06-03 13:32:19",
14,112,20.00,"2022-06-03 16:32:19"
So above example user 112 gained points more than once.
CodePudding user response:
One straightforward option is to separate the two operations (aggregation and windowing) using a subquery/cte:
WITH cte AS (
SELECT STR_TO_DATE(CONCAT(YEARWEEK(op.created_at), ' Sunday'), '%X%V %W') AS week,
COUNT(DISTINCT user_id) AS cnt,
SUM(op.oxygen_point) AS op_weekly
FROM users us
LEFT JOIN oxygen_point_earns op ON us.id = op.user_id
GROUP BY week
)
SELECT week,
SUM(cnt) OVER(ORDER BY week) AS user_count,
op_weekly
FROM cte
ORDER BY week