I have 3 tables, A, B and C structured like this
CREATE TABLE a (
id SERIAL NOT NULL PRIMARY KEY
);
CREATE TABLE b (
id SERIAL NOT NULL PRIMARY KEY,
a_id INT REFERENCES a(id) ON DELETE CASCADE
);
CREATE TABLE c (
id SERIAL NOT NULL PRIMARY KEY,
a_id INT REFERENCES a(id) ON DELETE CASCADE
);
Where the relationships are many-to-one. What i want is, for every row in table b, i want to check if any row in table c has a reference to the same row in table a. Now, I already have the query
SELECT
b.id,
true
FROM
b
WHERE EXISTS (
SELECT 1
FROM c
WHERE b.a_id = c.a_id
)
UNION
SELECT
b.id,
false
FROM
b
WHERE NOT EXISTS (
SELECT 1
FROM c
WHERE b.a_id = c.a_id
)
ORDER BY id
Though I am not certain, I think this is doing double work, and going through the table twice, and I am wondering how I could optimize it to only traverse the table once.
Is it possible with a simple query, or do I have to do anything complex?
CodePudding user response:
Simply move the EXISTS
clause into your SELECT
clause.
SELECT
b.id,
EXISTS (SELECT null FROM c WHERE c.a_id = b.a_id) AS c_exists
FROM b;
The same with an IN
clause, which I prefer for being even a tad simpler:
SELECT
id,
a_id IN (SELECT c.a_id FROM c) AS c_exists
FROM b;
CodePudding user response:
This can be done with a subquery, a left join, and a case.
The subquery gets you a list of distinct c.a_id
values.
SELECT DISTINCT a_id FROM c;
Then do this
SELECT b.id,
CASE WHEN distinct_ids.a_id IS NULL THEN 'false'
ELSE 'true' END has_c_row
FROM b
LEFT JOIN (
SELECT DISTINCT a_id FROM c;
) distinct_ids ON b.a_id = distinct_ids.a_id
This shape of query is called an antijoin or IS NULL ... LEFT JOIN
. It detects the rows in the first table that don't match rows in the second table.
The subquery gives us a view of the data in table c
with at most one row per each distinct a_id
value. Without the subquery, we might get duplicate rows in the result query.
This eliminates your WHERE EXISTS
correlated subqueries; even though PostgreSQL's query planner is pretty smart, sometimes it does the slow thing with subqueries like that.
If it is still too slow for you, create these indexes on the a_id columns.
ALTER TABLE b ADD INDEX a_id (a_id);
ALTER TABLE c ADD INDEX a_id (a_id);
CodePudding user response:
i think i understand what you are after
this is how i would do it
SELECT b.id, ISNULL(res.result,0) as result
FROM b
LEFT JOIN (
SELECT c.id, 1 as result
FROM c
INNER JOIN a on a.id = c.id
) res on b.id = res.id
i dont think you need to worry about distinct if they are all unique ids