In Postgresql, how do I use joins with multiple conditions including >= and <=-CodePudding

I have table A and table B. Each row in table A represents every time a user sends a message. Each row in table B represents every time a user buys a gift.

Goal: for each time a user sends a message, calculate how many gifts they've purchased within 7 days before the timestamp they sent the message. Some users never send messages and some never purchased gifts. If the user in table A didn't have gift purchased within 7 days, the count should be 0.

Table A:

user_id	time
12345	2021-09-04 09:43:55
12345	2021-09-03 00:39:30
12345	2021-09-02 03:26:07
12345	2021-09-05 15:48:34
23456	2021-09-09 09:06:22
23456	2021-09-08 08:06:21
00001	2021-09-03 15:38:15
00002	2021-09-03 15:38:15

Table B:

user_id	time
12345	2021-09-01 09:43:55
12345	2021-08-03 00:42:30
12345	2021-09-03 02:16:07
00003	2021-09-05 15:48:34
23456	2021-09-03 09:06:22
23456	2021-09-10 08:06:21

Expected output:

user_id	time	count
12345	2021-09-04 09:43:55	2
12345	2021-09-03 00:39:30	1
12345	2021-09-02 03:26:07	1
12345	2021-09-05 15:48:34	2
23456	2021-09-09 09:06:22	1
23456	2021-09-08 08:06:21	1
00001	2021-09-03 15:38:15	0
00002	2021-09-03 15:38:15	0

Query I tried:

SELECT A.user_id, A.time, coalesce(count(*), 0) as count
FROM A
LEFT JOIN B ON A.user_id = B.user_id AND B.time >= A.time - INTERVAL '7 days' AND B.time < A.time
GROUP BY 1,2

The count returned doesn't match the expected result however, not sure if I'm doing the join and conditions correctly.

CodePudding user response：

You need to count the values from the possibly NULL columns i.e. from table B in order to get the correct counts of non-existent matches. i.e. being more specific in COUNT(*) to COUNT(b.column_from_b_table). See modification with working demo fiddle below:

SELECT 
    A.user_id, 
    A.time, 
    coalesce(count(B.user_id), 0) as count
FROM A
LEFT JOIN B ON A.user_id = B.user_id AND 
               B.time >= A.time - INTERVAL '7 days' AND 
               B.time < A.time
GROUP BY 1,2;

user_id	time	count
1	2021-09-03T15:38:15.000Z	0
12345	2021-09-05T15:48:34.000Z	2
23456	2021-09-08T08:06:21.000Z	1
12345	2021-09-04T09:43:55.000Z	2
12345	2021-09-03T00:39:30.000Z	1
23456	2021-09-09T09:06:22.000Z	1
2	2021-09-03T15:38:15.000Z	0
12345	2021-09-02T03:26:07.000Z	1

View on DB Fiddle

Let me know if this works for you.