I have a table in the following format
ID | Attribute |
---|---|
1 | A |
2 | AA |
3 | A |
4 | B |
5 | BB |
1 | BBB |
2 | BBBB |
6 | A |
7 | A |
8 | AA |
99 | A |
8 | B |
8 | BBBB |
6 | AA |
The goal is to only extract these rows where the ID matches at least once "A" or "AA" and "B", "BB", "BBB" or "BBBB". The data however can match these attributes several times or only once. Neverthesless, it must match the "A" and "B" attribute groups at least once. IDs appear at least once or several times but an upper limit of appearance is unknown. With this example, the result should be the following:
ID | Attribute |
---|---|
1 | A |
1 | BBB |
2 | AA |
2 | B |
8 | AA |
8 | B |
8 | BBBB |
So far I have
select *
from table
where (table.attribute = "A" or table.attribute = "AA" or table.attribute = "B" or table.attribute = "BB" or table.attribute = "BBB" or table.attribute = "BBBB")
This naturally selects all IDs even if they only match the "A" or "B" attribute groups once not taking into account if they also match the other attribute at least once.
The solution provided here doesn't work as the IDs can appear several times with either the "A" or "B" attributes without having the other attribute once (as an example see 6 in table above). Somehow I need to differentiate between the attribute groups but I'm at a loss how to do it.
CodePudding user response:
I think the following provides your desired output, selecting only the IDs that exist with distinct counts of both qualifying attribute groups:
select *
from t
where id in (
select id
from t
group by id
having Count(distinct case when attribute like 'A%' then 1 end) > 0
and Count(distinct case when attribute like 'B%' then 1 end) > 0
)
order by id;