Home > other >  Finding set of rows in table based on matching rows from another table
Finding set of rows in table based on matching rows from another table

Time:10-18

I know the topic is a bit vague at best, but cannot find a way to describe my problem better...

An example, I have the following two tables:

TableA

IdA Code Value
123 A 1
123 B 2
123 C 3
456 A 4
456 F 6
456 E 7
...

TableB

IdB Code Value
X A 1
X B 2
X C 3
Y G 2
Y D 8
Y C 3
Z A 1
Z B 2
Z C 3
Z D 5
...

A set of records for a given IdA in TableA correlates to an equivalent set of records in TableB having a specific IdB.

For instance, for IdA = 123 in TableA, I have exactly three rows with certain codes and values, this would "map" to rows with IdB = X in TableB because it has the same combination of Codes and Values and the same number of rows. Note that it would not map to IdB = Z in TableB, because it has an additional row for Code D which IdA = 123 doesn't have in TableA.

Given only IdA, how to best write a query to find IdB?

If the codes and values were known, I could have done something similar to this:

SELECT b.IdB FROM TableB b
WHERE
    EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'A' AND x.Value = '1') AND
    EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'B' AND x.Value = '2') AND
    EXISTS(SELECT * FROM TableB x WHERE x.IdB = b.IdB AND x.Code = 'C' AND x.Value = '3') AND
    (SELECT COUNT(*) FROM TableB x WHERE x.IdB = b.IdB) = 3

But now I'm only given a value for IdA, so I need to look up values from TableA and combine that in the query for TableB. Any clever ideas on how to tackle this?

CodePudding user response:

I'm not exactly clear on what your expected output is which you don't provide, except for your own query which returns just IdB.

The supplied sample data is very limited and doesn't cover many "what if" scenarios, but going with it, the following will return the required values of X.

Using a CTE, first count the number of unique IdA / IdB to use when joining the two tables, then make sure the final row counts per Id also match the unique count:

with a as (
    select *, Count(*) over (partition by ida) r
    from tablea
), b as (
    select *, Count(*) over (partition by idb) r
    from tableb
) 
select b.idb
from a join b on a.code=b.code and a.value=b.value and a.r=b.r
group by b.idb, b.r
having Count(*)=b.r

CodePudding user response:

This is a question of Relational Division Without Remainder.

There are many solutions, here is one:

  • Take TableB and left join TableA to it
  • But calculate a total over the whole set of values from A
  • Group by IdB
  • Filter so we only have rows where the total count is equal to the number of matches to A (because COUNT(IdA) only counts non-nulls) and the total count must also be the same as the total number of rows that we want to match to.
DECLARE @idA int = 123;

SELECT
  b.IdB
FROM TableB b
LEFT JOIN (
    SELECT *,
      total = COUNT(*) OVER ()
    FROM TableA a
    WHERE a.IdA = @idA
) a ON b.Code = a.Code AND b.Value = a.Value
GROUP BY
  b.IdB
HAVING COUNT(*) = COUNT(a.IdA)
  AND COUNT(*) = MIN(a.total);

db<>fiddle

  • Related