In Postgres, how do I write a SQL query to select distinct values overall but aggregated over a set t...

What I mean is this: I have a table called payment with a created_at column and a user_id column. I want to select the count of purchases aggregated weekly (or over any interval I choose), counting only first-time purchases. For example, a user who purchased for the first time in week 1 is counted in week 1, but if they purchase again in week 2 they are not counted again.

created_at	user_id
timestamp	1
timestamp	1

This is the query I came up with. The problem is that a user who purchases multiple times is counted in every week in which they made a purchase. How can I improve it?

WITH dates AS 
(
    SELECT *
    FROM generate_series(
      '2022-07-22T15:30:06.687Z'::DATE,
      '2022-11-21T17:04:59.457Z'::DATE,
      '1 week'
    ) date
)
SELECT 
    dates.date::DATE AS date, 
    COALESCE(COUNT(DISTINCT(user_id)), 0) AS registrations
FROM
    dates
LEFT JOIN  
    payment ON created_at::DATE BETWEEN dates.date AND dates.date::date + '1 ${dateUnit}'::INTERVAL
GROUP BY 
    dates.date
ORDER BY 
    dates.date DESC;

CodePudding user response:

You want to count only first purchases, so compute each user's first purchase date in a first step and then join the date series against those dates instead of against the raw payments.

WITH dates AS 
(
    SELECT *
    FROM generate_series(
      '2022-07-22T15:30:06.687Z'::DATE,
      '2022-11-21T17:04:59.457Z'::DATE,
      '1 week'
    ) date
)
, first_purchases AS
(
    SELECT user_id, MIN(created_at::DATE) AS purchase_date
    FROM payment
    GROUP BY user_id
)
SELECT 
    d.date, 
    COALESCE(COUNT(p.purchase_date), 0) AS registrations
FROM
    dates d
LEFT JOIN  
    first_purchases p ON p.purchase_date >= d.date 
                     AND p.purchase_date <  d.date + '1 ${dateUnit}'::INTERVAL
GROUP BY 
    d.date
ORDER BY 
    d.date DESC;
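
To check the approach on a small data set, here is a self-contained sketch. The sample rows, the CTE named payment that stands in for the real table, and the week_start column name are assumptions made for illustration. The interval is fixed to one week instead of the ${dateUnit} placeholder.

```sql
-- Hypothetical sample data; this CTE shadows the real payment table for this query only.
WITH payment (created_at, user_id) AS (
    VALUES
        (TIMESTAMP '2022-07-23 10:00', 1),  -- user 1, first purchase
        (TIMESTAMP '2022-08-01 09:00', 1),  -- user 1 again, must not be counted
        (TIMESTAMP '2022-07-30 12:00', 2),  -- user 2, first purchase
        (TIMESTAMP '2022-07-31 08:00', 3)   -- user 3, first purchase
),
dates AS (
    -- One row per week start: 2022-07-22, 07-29, 08-05, 08-12.
    SELECT gs::date AS week_start
    FROM generate_series(
        TIMESTAMP '2022-07-22',
        TIMESTAMP '2022-08-12',
        INTERVAL '1 week'
    ) AS gs
),
first_purchases AS (
    -- Keep only the earliest purchase date per user.
    SELECT user_id, MIN(created_at)::date AS purchase_date
    FROM payment
    GROUP BY user_id
)
SELECT
    d.week_start,
    COUNT(p.user_id) AS registrations
FROM dates d
LEFT JOIN first_purchases p
       ON p.purchase_date >= d.week_start
      AND p.purchase_date <  d.week_start + 7   -- date + integer adds days
GROUP BY d.week_start
ORDER BY d.week_start DESC;

-- Result for the sample rows:
--  week_start | registrations
--  2022-08-12 |             0
--  2022-08-05 |             0
--  2022-07-29 |             2
--  2022-07-22 |             1
```

User 1's second purchase on 2022-08-01 falls in the week starting 2022-07-29 but is ignored, because only the earliest date per user survives in first_purchases. The half-open range (>= week start, < next week start) keeps a purchase on a boundary date from landing in two weeks, which BETWEEN would allow. COUNT already returns 0 for weeks with no match, so the sketch does not wrap it in COALESCE.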