Home > OS >  How to Query a Ordered Hash Table (PostgreSQL)
How to Query a Ordered Hash Table (PostgreSQL)

Time:03-08

I have a hash table (for tagging items) with an extra column for the strength of the relationship between items and their tags:

item_id | tag_name | relationship
1       | linux    | 0.7
1       | computer | 0.9
2       | garden   | 0.5
2       | shovel   | 0.65
...

I want to select all items tagged with both 'linux' and 'computer' (more tags are allowed), ordered by the sum of the relationships between items and these two tags.

Here's my attempt at querying the database:

SELECT * FROM items
INNER JOIN (
SELECT items.id FROM items
    INNER JOIN tags ON items.id = tags.item_id
    GROUP BY tags.item_id
    HAVING ARRAY_AGG(tags.tag_id) @> ARRAY['linux','computer']
    ORDER BY SUM(tags.relationship) DESC
) tagged_items ON items.id = tagged_items.id

For items that are only tagged with 'linux' and 'computer', this works perfectly. But the problem is that for items that have additional tags, the ORDER BY SUM(tags.relationship) will sum over the other tags that an item has as well... in other words, items with the most tags will return first (and that's not what I want).

How can I make SUM() only sum over the relationships to the tags that I'm selecting?

CodePudding user response:

You can simply use in:

select id
from items i
where tags_id in ('linux', 'computer')
group by id
order by sum(relationship) desc;

Unless you want all columns, then you can use the query above in cte and join:

with tmp as (
    select id
    from items
    where tags_id in ('linux', 'computer')
    group by id
    order by sum(relationship) desc
)
select i.*
from tmp t
inner join items i on t.id = i.id;
  • Related