I have this data:
date | visitor_id | total_payment |
---|---|---|
2022-01-01 | A | 20 |
2022-01-01 | B | 15 |
2022-01-01 | C | 20 |
2022-01-02 | B | 10 |
2022-01-02 | D | 25 |
I'd like to have daily count of visitor with total_payment equal or greater than 20$, with that being said, result I'm hoping is:
date | count_visitor |
---|---|
2022-01-01 | 2 |
2022-01-02 | 4 |
2022-01-01 is 2 because only A and C have payment more than 20$, however on 2022-01-02 additional 2 more because B is 35$ (sum) and D is 25$.
Is there any possible query for this? I hope I'm clear on my description. Thank you in advance.
CodePudding user response:
You can use this query as solution. First, I calculate cumulative payments of each user. Then, I find the minimum date for each user that exceeds 20$ cumulative payment. At the last step, I count number of users for each minimum date, and also accumulate that number. In the output you don't have to have first_day_users column, but I kept it to make it easier to understand the code.
So the output looks like that:
WITH
data AS(
SELECT "2022-01-01" AS date, "A" AS visitor_id, 20 AS total_payment UNION ALL
SELECT "2022-01-01" AS date, "B" AS visitor_id, 15 AS total_payment UNION ALL
SELECT "2022-01-01" AS date, "C" AS visitor_id, 20 AS total_payment UNION ALL
SELECT "2022-01-02" AS date, "B" AS visitor_id, 10 AS total_payment UNION ALL
SELECT "2022-01-02" AS date, "D" AS visitor_id, 25 AS total_payment
),
user_cumulatives as
(
SELECT
visitor_id,
date,
SUM(total_payment) OVER (PARTITION BY visitor_id ORDER BY date) as cumulative_payment
FROM data
),
user_first_dates as
(
select visitor_id, min(date) as date
from user_cumulatives
where cumulative_payment >= 20
group by 1
)
select date, count(*) as first_day_users, sum(count(*)) over (order by date) as count_visitor
from user_first_dates
group by 1
order by date
CodePudding user response:
Welcome @Indri
The query below will give you a running sum of the rows per day where the total_amount
of greater than of equal to 20, I believe this should give you the answer you are looking for:
WITH data AS(
SELECT "2022-01-01" AS date, "A" AS visitor_id, 20 AS total_payment
UNION ALL
SELECT "2022-01-01" AS date, "B" AS visitor_id, 15 AS total_payment
UNION ALL
SELECT "2022-01-01" AS date, "C" AS visitor_id, 20 AS total_payment
UNION ALL
SELECT "2022-01-02" AS date, "A" AS visitor_id, 10 AS total_payment
UNION ALL
SELECT "2022-01-02" AS date, "D" AS visitor_id, 25 AS total_payment
)
SELECT
*,
COUNT(*) OVER(ORDER BY date)
FROM data
WHERE total_payment >= 20