Home > front end >  Joining 2 tables to generate a table with 3 conditional flags
Joining 2 tables to generate a table with 3 conditional flags

Time:09-28

I have 2 tables:

Table A:

Category Code Description
CATG1 B123 Apple
CATG1 B456 Banana
CATG1 B789 Orange
CATG1 B929 Tomato

Table B:

Category Code Description
CATG1 B123 Apple
CATG1 B627 Mango
CATG1 B789 Orange
CATG2 B929 Tomato

I am trying to generate a table that joins these 2 tables on the column Code but generates flags on

  • codes found only in Table A
  • codes found only in Table B
  • codes found in both tables
  • codes that changed in Category column

The output table should look like:

Code Description Code in Table A only Code in Table B only Code found in both tables Changed Category
B123 Apple N N Y N
B456 Banana Y N N N
B627 Mango N Y N N
B789 Orange N N Y N
B929 Tomato N N Y Y

I am really lost. Thank you for your help!

CodePudding user response:

There is probably a less verbose approach, but sometimes verbosity works like a documentation when implementing complex logic. What you want is a full outer join. This assumes your table doesn't have duplicated or null codes to begin with

select coalesce(a.code, b.code) as code, 
       coalesce(a.description, b.description) as description,
       case when a.code is not null and b.code is null then 'Y' else 'N' end as code_in_table_a_only,
       case when a.code is null and b.code is not null then 'Y' else 'N' end as code_in_table_b_only,
       case when a.code is not null and b.code is not null and a.code=b.code then 'Y' else 'N' end as code_in_both_tables,
       case when a.code is not null and b.code is not null and a.category<>b.category then 'Y' else 'N' end as changed_category
from a 
full outer join b on a.code=b.code

DEMO

CodePudding user response:

You need a full join between the two tables which enables you to compare all rows from each.

There's various ways of implementing the logic but all methods will be pretty much equal:

select 
    Coalesce(a.code, b.code) Code,
    Coalesce(a.description, b.description) Description,
    case when b.code is null and a.code is not null then 'Y' else 'N' end CodeInAOnly,
    case when a.code is null and b.code is not null then 'Y' else 'N' end CodeInBOnly,
    case when a.code=b.code then 'Y' else 'N' end CodeInBoth,
    case when Coalesce(a.category,b.category)=Coalesce(b.Category,a.category) then 'N' else 'Y' end ChangedCategory
from a full join b on a.code = b.code
order by code;

CodePudding user response:

It is possible that certain SQL implementations offer additional features specifically for a use case such as this. Having said that, I will offer a possible solution in bare SQL.

From this point on I will refer to Table A as LeftTable and Table B as RightTable, as it helps to reason about them from a Left/Right perspective, specifically with regards to the join operations.

The presence of the "Changed Category" column probably requires additional specifications. For example, I observe that B789(Orange) is CATG1 in both LeftTable and RightTable. But there is no mention of the possibility of an additional record for B789(Orange) that also has a value of CATG2 for its Category column in either LeftTable or RightTable. In this case, what would be the value of "Changed Category"? For now, I only provide a concatenation of the categories.

As far as bare SQL, a possible solution is a union of something like the following:

(Warning this script is untested as I do not have access to testing tools at the moment; I may be missing/forgetting key details)

SELECT Code, Description, Categories,
       'Y' as 'Code in Left Table Only', 
       'N' as 'Code in Right Table Only', 
       'N' as 'Code in Both Tables'
FROM
       LeftTable
WHERE
       LeftTable.Code NOT IN (SELECT Code FROM RightTable) // only shows results present on left table based on Code column, in other words, only show LeftTable results if they have a match on RightTable

UNION

SELECT Code, Description, Categories,
       'N' as 'Code in Left Table Only', 
       'Y' as 'Code in Right Table Only', 
       'N' as 'Code in Both Tables'
FROM
       RightTable 
WHERE
       RightTable.Code NOT IN (SELECT Code FROM LeftTable) // only shows results present on left table based on Code column, in other words, only show LeftTable results if they have a match on RightTable

UNION

SELECT Code, Description, CONCAT(Category, ', ') as 'Categories', // I concatenate the multiple categories if they are present.
       'N' as 'Code in Left Table Only', 
       'N' as 'Code in Right Table Only', 
       'Y' as 'Code in Both Tables', 
        
FROM
       LeftTable INNER JOIN RightTable ON LeftTable.Code = RightTable.Code

GROUP BY
       Code, Description // since CONCAT is an aggregation function, all other non-fixed values must be grouped

  • Related