Currently we have the following sample DB structure with a Many-to-Many relationship between lead
and tag
tables.
CREATE TABLE lead ( id serial
constraint lead_id_pk
primary key, name VARCHAR(20), surname VARCHAR(20) );
CREATE TABLE tag ( id serial
constraint tag_id_pk
primary key, name VARCHAR(20), description Text );
create table leads_tags
(
id serial
constraint lead_tags_id_pk
primary key,
lead_id integer not null
constraint leads_tags_leads_id_fk
references lead
on delete cascade,
tag_id integer not null
constraint leads_tags_tags_id_fk
references tag
on delete cascade,
constraint leads_tags_lead_id_tag_id_key
unique (lead_id, tag_id)
);
Let's imagine that we would like to perform the following queries.
- Give me all the leads that have ANY of the tags in a list.
- Give me all the leads that have ALL of the tags in the list
For the first one (1) we can do:
Select distinct (id), *
FROM lead
where id in (
Select lt.lead_id
from leads_tags lt
where tag_id in (324, 129)
)
For the second one (2) where we want to search for Leads that have both tags. Which is the most performant option according to your opinion? Is this the best approach?
SELECT *
FROM lead
WHERE id IN (
SELECT lt.lead_id
FROM leads_tags lt
GROUP BY lt.lead_id
HAVING array_agg(tag_id) @> array [324,129]
)
CodePudding user response:
Provided unique (lead_id, tag_id)
I would say
SELECT *
FROM lead
WHERE id IN (
SELECT lt.lead_id
FROM leads_tags lt
GROUP BY lt.lead_id
where tag_id in (324, 129)
HAVING count(*) = 2
)