In PostgreSQL, I have a table item and a table item_attribute, so an item can have multiple attributes.
table: item
id | name |
---|---|
1 | A |
2 | B |
3 | C |
table: item_attribute
id | item_id | name | value |
---|---|---|---|
1 | 1 | foo | bar |
2 | 1 | size | M |
3 | 1 | country | DE |
4 | 2 | foo | baz |
5 | 2 | size | M |
6 | 2 | country | DE |
7 | 3 | other | test |
8 | 3 | size | M |
9 | 3 | country | EN |
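For reference, the sample data above can be recreated with something like this (the column types are assumptions; only the names and values come from the tables above):
CREATE TABLE item (
    id   integer PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE item_attribute (
    id      integer PRIMARY KEY,
    item_id integer NOT NULL REFERENCES item (id),
    name    text NOT NULL,
    value   text NOT NULL
);

INSERT INTO item (id, name) VALUES
    (1, 'A'), (2, 'B'), (3, 'C');

INSERT INTO item_attribute (id, item_id, name, value) VALUES
    (1, 1, 'foo', 'bar'),    (2, 1, 'size', 'M'), (3, 1, 'country', 'DE'),
    (4, 2, 'foo', 'baz'),    (5, 2, 'size', 'M'), (6, 2, 'country', 'DE'),
    (7, 3, 'other', 'test'), (8, 3, 'size', 'M'), (9, 3, 'country', 'EN');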
Now, I need to find items that match a given set of attribute name/value pairs.
For example, if I want to find items whose attribute country has the value DE and whose attribute size has the value M, the result set should be:
id | name |
---|---|
1 | A |
2 | B |
If instead I want to find items whose attribute country has the value EN and whose attribute size has the value M, the result set should be:
id | name |
---|---|
3 | C |
Any hints here?
CodePudding user response:
Using an aggregation approach we can try:
SELECT i.id, i.name
FROM item i
INNER JOIN item_attribute ia
    ON ia.item_id = i.id
GROUP BY i.id, i.name
HAVING COUNT(*) FILTER (WHERE ia.name = 'country' AND ia.value = 'DE') > 0
   AND COUNT(*) FILTER (WHERE ia.name = 'size' AND ia.value = 'M') > 0;
The above uses conditional aggregation to find items from Germany in medium size.
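If the required name/value pairs are supplied as a list rather than hard-coded filters, a closely related variant (a sketch, assuming each attribute name occurs at most once per item, as in the sample data) is to count the matching pairs:
SELECT i.id, i.name
FROM item i
INNER JOIN item_attribute ia
    ON ia.item_id = i.id
WHERE (ia.name, ia.value) IN (('country', 'DE'), ('size', 'M'))
GROUP BY i.id, i.name
HAVING COUNT(*) = 2;  -- number of required pairs
If duplicate name/value rows per item are possible, HAVING COUNT(DISTINCT ia.name) = 2 is the safer check.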
CodePudding user response:
You can take the intersection of all "item_id"s that satisfy your conditions inside the "item_attribute" table, then fetch the corresponding rows from the "item" table.
WITH filtered_ids AS (
    SELECT item_id FROM item_attribute WHERE name = 'country' AND value = 'DE'
    INTERSECT
    SELECT item_id FROM item_attribute WHERE name = 'size' AND value = 'M'
)
SELECT *
FROM item
WHERE id IN (SELECT * FROM filtered_ids);
You can stack as many conditions as you want inside the "filtered_ids" CTE, as shown below.
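For example, with an additional name/value pair (foo = bar, taken from the sample data) stacked onto the filters:
WITH filtered_ids AS (
    SELECT item_id FROM item_attribute WHERE name = 'country' AND value = 'DE'
    INTERSECT
    SELECT item_id FROM item_attribute WHERE name = 'size' AND value = 'M'
    INTERSECT
    SELECT item_id FROM item_attribute WHERE name = 'foo' AND value = 'bar'
)
SELECT *
FROM item
WHERE id IN (SELECT * FROM filtered_ids);
With the sample data this narrows the result down to item 1, since only that item also has foo = bar.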
CodePudding user response:
One option is to aggregate the attributes into a JSON value and use the quite powerful JSON operators to find the items:
select itm.*
from item itm
where exists (select null
              from item_attribute att
              where att.item_id = itm.id
              group by item_id
              having jsonb_object_agg(name, value) @> '{"country": "DE", "size": "M"}'
             )
I am not sure if this is faster or slower than the traditional approach with counting (see Tim Biegeleisen's answer). It is a bit more flexible, though.
Depending on the data distribution, doing the aggregation once for all items might be faster:
select itm.*
from item itm
  join (
    select item_id, jsonb_object_agg(name, value) as attributes
    from item_attribute att
    group by item_id
    having jsonb_object_agg(name, value) @> '{"country": "DE", "size": "M"}'
  ) atr on atr.item_id = itm.id
This also makes it possible to display all attributes (as JSON) in a column of the result, as shown below.
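For example, a small variation of the query above that returns the aggregated attributes alongside each item:
select itm.*, atr.attributes
from item itm
  join (
    select item_id, jsonb_object_agg(name, value) as attributes
    from item_attribute
    group by item_id
    having jsonb_object_agg(name, value) @> '{"country": "DE", "size": "M"}'
  ) atr on atr.item_id = itm.id
With the sample data, the attributes column for item 1 would contain {"foo": "bar", "size": "M", "country": "DE"}.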