SQL query: Multiple SQL joins for multiple column checks?-CodePudding

I am new to this forum. Except for basics, I didn't have much of exposure to more advanced SQL queries. Hence I'm seeking any advice that can help me understand better the following problem.

Let's say I have two tables (A and B). The primary keys are A.id and B.id. In addition, let's say, table B has two more fields, B.a_key (foreign key for A.id) and B.feature. Therefore, table B may(usually) have more entries since multiple rows in B may be pointing to a single A.id.

I have a query that is in the following format:

[{B.feature=1}, {B.feature=0}, {B.feature=3}]

Return ALL A.idthat have at least 3 entries in table B and those 'feature' values are (1, 0, 3)

[{B.feature=0}]

Return ALL A.id that have at least 1 entry in B with the'feature' value (0)

[{B.feature=0}, {B.feature=2}, {B.feature=5}, {B.feature=6}]

Return ALL A.id that have at least 4 entries in table B and those 'feature' values are (0,2,5,6)

[]

Trivial case: Return ALL A.id (no need to filter) ...

Thanks.

I was thinking of preparing an SQL statement with as many JOINS as B entries and then filtering them in the where clause, but I think I'm catching more than needed.

E.g sample 1:

select distinct A.id 
from A
left join B b on A.id = b.a_key 
left join B bb on A.id = bb.a_key
left join B bbb on A.id = bbb.a_key
where (b.feature = 1 and bb.feature = 0 and bbb.feature = 3);

Edit 1: As per Jonas' suggestion, I'm adding a data sample and expected results.

INSERT INTO A (id) values (1);
INSERT INTO A (id) values (2);
INSERT INTO A (id) values (3);

INSERT INTO B (id, a_key, feature) values (1000, 1, 0);
INSERT INTO B (id, a_key, feature) values (1001, 1, 1);
INSERT INTO B (id, a_key, feature) values (1002, 1, 2);
INSERT INTO B (id, a_key, feature) values (1003, 1, 3);
INSERT INTO B (id, a_key, feature) values (1004, 1, 4);
INSERT INTO B (id, a_key, feature) values (1005, 1, 5);
INSERT INTO B (id, a_key, feature) values (1006, 1, 6);


INSERT INTO B (id, a_key, feature) values (2000, 2, 0);
INSERT INTO B (id, a_key, feature) values (2001, 2, 1);
INSERT INTO B (id, a_key, feature) values (2002, 2, 2);
INSERT INTO B (id, a_key, feature) values (2003, 2, 3);


INSERT INTO B (id, a_key, feature) values (3000, 3, 0);
INSERT INTO B (id, a_key, feature) values (3001, 3, 1);

[{b.feature=0}, {b.feature=1}] Select statement should return all 3 IDs from table A since there are at least 2 entries in table B (with feature values 0 and 1) for every ID from A.

[{b.feature=0}, {b.feature=1}, {b.feature=3}] Select statement should return only id=1 and id=2 from table A, since in table B we don't have b.feature=3 and b.a_key=3.

CodePudding user response：

Use array_agg to aggregate the values for a given a_key(a.id). Then use compare the aggregated array to an array of the desired id's.

create table a(id integer);
INSERT INTO A (id) values (1);
INSERT INTO A (id) values (2);
INSERT INTO A (id) values (3);
create table b (id integer, a_key integer, feature integer);
INSERT INTO B (id, a_key, feature) values (1000, 1, 0);
INSERT INTO B (id, a_key, feature) values (1001, 1, 1);
INSERT INTO B (id, a_key, feature) values (1002, 1, 2);
INSERT INTO B (id, a_key, feature) values (1003, 1, 3);
INSERT INTO B (id, a_key, feature) values (1004, 1, 4);
INSERT INTO B (id, a_key, feature) values (1005, 1, 5);
INSERT INTO B (id, a_key, feature) values (1006, 1, 6);
INSERT INTO B (id, a_key, feature) values (2000, 2, 0);
INSERT INTO B (id, a_key, feature) values (2001, 2, 1);
INSERT INTO B (id, a_key, feature) values (2002, 2, 2);
INSERT INTO B (id, a_key, feature) values (2003, 2, 3);
INSERT INTO B (id, a_key, feature) values (3000, 3, 0);
INSERT INTO B (id, a_key, feature) values (3001, 3, 1);


SELECT
    a_key AS id,
    agg
FROM (
    SELECT
        a_key,
        array_agg(feature) AS agg
    FROM
        b
    WHERE
        feature IN (0, 1)
    GROUP BY
        a_key) AS a
WHERE
    ARRAY[0, 1] = a.agg
ORDER BY
    id

id |  agg  
---- -------
  3 | {0,1}
  1 | {0,1}
  2 | {0,1}



SELECT
    a_key AS id,
    agg
FROM (
    SELECT
        a_key,
        array_agg(feature) AS agg
    FROM
        b
    WHERE
        feature IN (0, 1, 3)
    GROUP BY
        a_key) AS a
WHERE
    ARRAY[0, 1, 3] = a.agg
ORDER BY 
    id;

 id |   agg   
---- ---------
  1 | {0,1,3}
  2 | {0,1,3}