Home > other >  How to join tables with multiple conditions?
How to join tables with multiple conditions?

Time:10-03

I am new to making complex (let me know if this one is not complex) SQL queries and am struggling to create a query that will join 2 tables with multiple different combinations. Here is my query for just the 1st combination.

SELECT DISTINCT
    A.Id, A.Ref1, A.Ref2, A.Ref3, B.Ref1, B.Ref2, B.Ref3
FROM
    A
JOIN
    B ON TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1)) 
      AND TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2)) 
      AND TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))

Ref1 - Ref3 are columns that are necessary in the 1st combination and this is classified as an "Exact Match".

However, if it so happens that only Ref1 and Ref2 matches, then it is classified as a "Partial Match" instead. Here is a list of possible combinations:

  • A.Ref1 = B.Ref1 and A.Ref2 = B.Ref2 and A.Ref3 = B.Ref3 = Exact Match
  • A.Ref1 = B.Ref1 and A.Ref2 = B.Ref2 = Partial Match 1
  • A.Ref2 = B.Ref2 and A.Ref3 = B.Ref3 = Partial Match 1
  • A.Ref1 = B.Ref1 and A.Ref3 = B.Ref3 = Partial Match 1
  • A.Ref1 = B.Ref1 = Partial Match 2
  • A.Ref2 = B.Ref2 = Partial Match 2
  • A.Ref3 = B.Ref3 = Partial Match 2

My initial attempt at this looked to have been working just fine but eventually broke once I added values that would make the result set non-distinct between the matches.

Anyway, for the actual goal, I intended for Exact Match entries to not be included when searching Partial Match 1 entries, then for Partial 2, Partial Match 1 and Exact Match should not be included (not sure if I am making enough sense at this point).

My initial attempts keep duplicating records which is really just redundant data since if it already matched exactly, the record should not have been partially matched as well. Here is a sample of the result I got:

Id     A.Ref1     A.Ref2     A.Ref3     B.Ref1     B.Ref2     B.Ref3     Match Type
 1     Val1       Val1       Val1       Val1       Val1       Val1       Exact Match
 1     Val1       Val1       NULL       Val1       Val1       NULL       Partial Match 1
 1     Val1       NULL       Val1       Val1       NULL       Val1       Partial Match 1
 1     Val1       NULL       NULL       Val1       NULL       NULL       Partial Match 2

In the scenario above, since Id 1 already is an "Exact Match", I don't want it to appear multiple times for "Partial Match" except for when there are other distinct field values for the record.

I am trying to do this all in just 1 query but I don't think there should be limitations to have them as separate ones instead. It would probably be easier that way but please do let me know if there is a possibility of it being done in just 1 go as this will be executed several times.

Any help on the matter will be much appreciated. Thank you.

CodePudding user response:

You can cross join the two tables and then count how many columns match:

select
  id, a_ref1, a_ref2, a_ref3, b_ref1, b_ref2, b_ref3,
  case match_count 
    when 3 then 'Exact Match'
    when 2 then 'Partial Match 1'
    when 1 then 'Partial Match 2'
    else 'No Match'
  end as match_type
from
(
  select 
     a.id, 
     a.ref1 as a_ref1, a.ref2 as a_ref2, a.ref3 as a_ref3,
     b.ref1 as b_ref1, b.ref2 as b_ref2, b.ref3 as b_ref3,
     case when trim(upper(b.ref1)) = trim(upper(a.ref1)) then 1 else 0 end  
     case when trim(upper(b.ref2)) = trim(upper(a.ref2)) then 1 else 0 end  
     case when trim(upper(b.ref3)) = trim(upper(a.ref3)) then 1 else 0 end
       as match_count
  from a cross join b
) match_counted
-- where match_count = 3 /* only Exact Matches */
-- where match_count = 2 /* only Partial Matches Type 1 */
-- where match_count = 1 /* only Partial Matches Type 2 */
-- where match_count = 0 /* only Non-Matches */
;

You can use any of the suggested WHERE clauses in order to only get a particular match type. Or a variation if you want to pick several, e.g. where match_count >0 or where match_count in (1,2).

If you want to exclude non-matches, you can inner join the tables instead on trim(upper(b.ref1)) = trim(upper(a.ref1)) or trim(upper(b.ref2)) = trim(upper(a.ref2)) or trim(upper(b.ref3)) = trim(upper(a.ref3)). This may make the query faster (but is not guaranteed to do so). It will, however, make the query a tad more prone to errors and a lower its maintainability, because you'd state the conditions twice.

CodePudding user response:

You can use CASE WHEN ... THEN in your SELECT clause to implement your match rules.

SELECT * 
FROM (
  Select Distinct A.Id, A.Ref1, A.Ref2, A.Ref3, B.Ref1, B.Ref2, B.Ref3,
         CASE 
           WHEN A.Ref1 = B.Ref1 AND A.Ref2 = B.Ref2 AND A.Ref3 = B.Ref3
           THEN 'Exact Match'

           WHEN A.Ref1 = B.Ref1 AND A.Ref2 = B.Ref2
           THEN 'Partial Match 1'
           WHEN A.Ref2 = B.Ref2 AND A.Ref3 = B.Ref3
           THEN 'Partial Match 1'
           WHEN A.Ref1 = B.Ref1 AND A.Ref3 = B.Ref3
           THEN 'Partial Match 1'

           WHEN (A.Ref1 = B.Ref1 OR A.Ref2 = B.Ref2 OR A.Ref3 = B.Ref3)
           THEN 'Partial Match 2'

           ELSE NULL
         END Match
    from A
    Join B
        On  TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1)) And 
            TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2)) And
            TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))
  ) subquery
WHERE Match = 'Partial Match 1'

This gets the desired items from your resultset, omitting the ones you don't want.

SQL is all about sets and subqueries.

It's a bit verbose, but so are your matching rules.

  • Related