Home > Mobile >  From a subset of foreign keys, get a list of items that contain that subset
From a subset of foreign keys, get a list of items that contain that subset

Time:01-29

I have two tables:

  • pages_interests
    • page_id INT NOT NULL
    • interest_id INT NOT NULL
  • items_interests
    • item_id INT NOT NULL
    • interest_id INT NOT NULL

pages_interest:

page_id interest_id
1 1
1 7
2 1
3 1
3 7
3 89

items_interest:

item_id interest_id
10 1
10 7
10 20
12 1
12 55

I'm trying to figure out how to get SQL to join on multiple rows. Because the page expected items to have an interest of 1 and 7, get items that have both those interests, but don't discard when an item has other interests too. The expected output would be:

page_id item_id
1 10
2 10
2 12

Does anyone have any idea how I could achieve this?

CodePudding user response:

Using an INNER JOIN between the two tables, will make appear only interest_id in common between the two tables. To gather only <page_id, item_id> that have both the two pages, it's sufficient to enforcing COUNT(DISTINCT i.interest_id) = 2 inside the HAVING clause.

SELECT p.page_id, i.item_id
FROM       pages_interest p
INNER JOIN items_interest i
        ON p.interest_id = i.interest_id
GROUP BY p.page_id, i.item_id
HAVING COUNT(DISTINCT i.interest_id) = 2

If you want to generalize on the number of "pages_interest" items, you can do:

SELECT p.page_id, i.item_id
FROM       pages_interest p
INNER JOIN items_interest i
        ON p.interest_id = i.interest_id
GROUP BY p.page_id, i.item_id
HAVING COUNT(DISTINCT i.interest_id) = (SELECT COUNT(DISTINCT interest_id) FROM pages_interest)

Output:

page_id item_id
1 10

Check the demo here.

CodePudding user response:

I think something like this might work. I added a couple of more pages for more realistic test:

;with pages as (
    select *
    from (
        VALUES  (1, 1)
        ,   (1, 7)
        ,   (2, 1)
        ,   (3, 1)
        ,   (3, 7)
        ,   (3, 89)
        ,   (4, 20)
        ,   (5, 55)
        ,   (5, 1)
        ,   (6, 1)
        ,   (6, 13)
    ) t (page_id,interest_id)
)
, items as (
    select  *
        from (
        VALUES  (10, 1)
        ,   (10, 7)
        ,   (10, 20)
        ,   (12, 1)
        ,   (12, 55)
    ) t (item_id,interest_id)
)
select  p.page_id, i.item_id
from    (
    select p.page_id, interest_id, COUNT(*) OVER(PARTITION BY page_id) AS total_interests
    FROM    pages p
    ) p
LEFT JOIN items i
    ON  i.interest_id = p.interest_id
group by p.page_id, i.item_id, p.total_interests
HAVING COUNT(i.item_id) >= p.total_interests

The idea is to keep track of total page interests and then make sure it's no less than item counts (if we miss, LEFT JOIN value becomes null and COUNT ignores is).

  • Related