I have 2 tables:
Table A:
Category | Code | Description |
---|---|---|
CATG1 | B123 | Apple |
CATG1 | B456 | Banana |
CATG1 | B789 | Orange |
CATG1 | B929 | Tomato |
Table B:
Category | Code | Description |
---|---|---|
CATG1 | B123 | Apple |
CATG1 | B627 | Mango |
CATG1 | B789 | Orange |
CATG2 | B929 | Tomato |
I am trying to generate a table that joins these 2 tables on the column Code
but generates flags on
- codes found only in Table A
- codes found only in Table B
- codes found in both tables
- codes that changed in
Category
column
The output table should look like:
Code | Description | Code in Table A only | Code in Table B only | Code found in both tables | Changed Category |
---|---|---|---|---|---|
B123 | Apple | N | N | Y | N |
B456 | Banana | Y | N | N | N |
B627 | Mango | N | Y | N | N |
B789 | Orange | N | N | Y | N |
B929 | Tomato | N | N | Y | Y |
I am really lost. Thank you for your help!
CodePudding user response:
There is probably a less verbose approach, but sometimes verbosity works like a documentation when implementing complex logic. What you want is a full outer join
. This assumes your table doesn't have duplicated or null
codes to begin with
select coalesce(a.code, b.code) as code,
coalesce(a.description, b.description) as description,
case when a.code is not null and b.code is null then 'Y' else 'N' end as code_in_table_a_only,
case when a.code is null and b.code is not null then 'Y' else 'N' end as code_in_table_b_only,
case when a.code is not null and b.code is not null and a.code=b.code then 'Y' else 'N' end as code_in_both_tables,
case when a.code is not null and b.code is not null and a.category<>b.category then 'Y' else 'N' end as changed_category
from a
full outer join b on a.code=b.code
CodePudding user response:
You need a full join between the two tables which enables you to compare all rows from each.
There's various ways of implementing the logic but all methods will be pretty much equal:
select
Coalesce(a.code, b.code) Code,
Coalesce(a.description, b.description) Description,
case when b.code is null and a.code is not null then 'Y' else 'N' end CodeInAOnly,
case when a.code is null and b.code is not null then 'Y' else 'N' end CodeInBOnly,
case when a.code=b.code then 'Y' else 'N' end CodeInBoth,
case when Coalesce(a.category,b.category)=Coalesce(b.Category,a.category) then 'N' else 'Y' end ChangedCategory
from a full join b on a.code = b.code
order by code;
CodePudding user response:
It is possible that certain SQL implementations offer additional features specifically for a use case such as this. Having said that, I will offer a possible solution in bare SQL.
From this point on I will refer to Table A as LeftTable and Table B as RightTable, as it helps to reason about them from a Left/Right perspective, specifically with regards to the join operations.
The presence of the "Changed Category" column probably requires additional specifications. For example, I observe that B789(Orange) is CATG1 in both LeftTable and RightTable. But there is no mention of the possibility of an additional record for B789(Orange) that also has a value of CATG2 for its Category column in either LeftTable or RightTable. In this case, what would be the value of "Changed Category"? For now, I only provide a concatenation of the categories.
As far as bare SQL, a possible solution is a union of something like the following:
(Warning this script is untested as I do not have access to testing tools at the moment; I may be missing/forgetting key details)
SELECT Code, Description, Categories,
'Y' as 'Code in Left Table Only',
'N' as 'Code in Right Table Only',
'N' as 'Code in Both Tables'
FROM
LeftTable
WHERE
LeftTable.Code NOT IN (SELECT Code FROM RightTable) // only shows results present on left table based on Code column, in other words, only show LeftTable results if they have a match on RightTable
UNION
SELECT Code, Description, Categories,
'N' as 'Code in Left Table Only',
'Y' as 'Code in Right Table Only',
'N' as 'Code in Both Tables'
FROM
RightTable
WHERE
RightTable.Code NOT IN (SELECT Code FROM LeftTable) // only shows results present on left table based on Code column, in other words, only show LeftTable results if they have a match on RightTable
UNION
SELECT Code, Description, CONCAT(Category, ', ') as 'Categories', // I concatenate the multiple categories if they are present.
'N' as 'Code in Left Table Only',
'N' as 'Code in Right Table Only',
'Y' as 'Code in Both Tables',
FROM
LeftTable INNER JOIN RightTable ON LeftTable.Code = RightTable.Code
GROUP BY
Code, Description // since CONCAT is an aggregation function, all other non-fixed values must be grouped