I have a dataset like this:
userid | productid | score |
---|---|---|
A | 1 | 4 |
A | 2 | 4 |
A | 3 | 5 |
B | 1 | 4 |
B | 2 | 4 |
B | 3 | 5 |
I want to have an output like this:
userid1 | userid2 | matching_product |
---|---|---|
A | B | 1 2 3 |
but I'm only able to get the first two column with this query:
CREATE TABLE score_greater_than_3 AS
SELECT userid, productid, score
FROM reviews
WHERE score >= 4;
SELECT s1.userid as userid1, s2.userid as userid2
FROM score_greater_than_3 s1
INNER JOIN score_greater_than_3 s2 ON s1.productid=s2.productid AND s1.userid<s2.userid
GROUP BY s1.userid, s2.userid
HAVING count(*)>=3;
How can i get the matching products? i'm ok with an output like this too if its more easy
user1 | user2 | matched product |
---|---|---|
a | b | 1 |
a | b | 2 |
a | b | 3 |
CodePudding user response:
You can try with the following query:
WITH cte AS (
SELECT r.userid,
r.productid
FROM reviews r
WHERE r.score > 3
)
SELECT r1.userid,
r2.userid,
GROUP_CONCAT(r1.productid SEPARATOR ' ')
FROM cte r1
INNER JOIN cte r2
ON r1.productid = r2.productid
AND r1.userid < r2.userid
GROUP BY r1.userid,
r2.userid
HAVING COUNT(*) >= 3
It uses a Common Table Expression that allows you to allocate your table in a temporary space that lasts till the end of the query. Your next step using the self join is correct, though it lacks the GROUP_CONCAT aggregation function, that allows you to aggregate on a string-like field. You can set the "separator" parameter to decide which string you want to use to concatenate your values.
Try the full query here.
CodePudding user response:
Please try this:
select s1.userid as userid1, s2.userid as userid2,
GROUP_CONCAT(s1.productid)
from score_greater_than_3 s1 inner join score_greater_than_3 s2 on s1.productid=s2.productid and s1.userid<s2.userid
group by s1.userid, s2.userid;
The whole script please see: DB Fiddle