I am writing a code to look for duplicated Sub_brand_Descriptions in the same table having the same brand_code. I was able to produce the table below (which is what I need) however the first row and second row is the same because I am joining the table with itself. Is there a way to remove any of the duplicated row as distinct will not work (Different sub-brand id in the same column but because it appears in the row below in a different column, technically they are the same)?
select distinct
brands.BRAND_ID as Brand_Code,
sub.SUB_BRAND_ID as Sub_Brand_ID1,
sub.SUB_BRAND as Sub_Brand_Descrption1,
sub2.SUB_BRAND_ID as Sub_Brand_ID2,
sub2.SUB_BRAND as Sub_Brand_Descrption2
from table1 as brands
inner join table2 as sub on sub.BRAND_ID = brands.BRAND_ID and sub.LANGU = 'E'
inner join table2 as sub2 on sub2.SUB_BRAND = sub.SUB_BRAND and sub2.LANGU = 'E'
where sub.SUB_BRAND_ID != sub2.SUB_BRAND_ID and sub.BRAND_ID = sub2.BRAND_ID
Brand_Code | Sub_Brand_ID1 | Sub_Brand_Descrption1 | Sub_Brand_ID2 | Sub_Brand_Descrption2 |
---|---|---|---|---|
ABC | X123 | X123ABC | Y123 | X123ABC |
ABC | Y123 | X123ABC | X123 | X123ABC |
Desired output:
Brand_Code | Sub_Brand_ID1 | Sub_Brand_Descrption1 | Sub_Brand_ID2 | Sub_Brand_Descrption2 |
---|---|---|---|---|
ABC | X123 | X123ABC | Y123 | X123ABC |
Source data: Table 1:
Brand_Code | label |
---|---|
ABC | 1 |
CDE | 1 |
EFG | 2 |
source Table 2:
Brand_Code | Sub_Brand_ID | Sub_Brand | Language |
---|---|---|---|
ABC | X123 | X123ABC | E |
ABC | Y123 | X123ABC | E |
BBC | X223 | H23ABC | E |
BBC | Y223 | H23ABC | E |
CodePudding user response:
add a function for row_number in your query and filter it for 1, see example below: (you can change order_by_clause as per your requirement)
with cte as (
select distinct
brands.BRAND_ID as Brand_Code,
sub.SUB_BRAND_ID as Sub_Brand_ID1,
sub.SUB_BRAND as Sub_Brand_Descrption1,
sub2.SUB_BRAND_ID as Sub_Brand_ID2,
sub2.SUB_BRAND as Sub_Brand_Descrption2,
row_number() over (partition by Sub_Brand_Descrption1 order by Sub_Brand_ID1) as rn
from table1 as brands
inner join table2 as sub on sub.BRAND_ID = brands.BRAND_ID and sub.LANGU = 'E'
inner join table2 as sub2 on sub2.SUB_BRAND = sub.SUB_BRAND and sub2.LANGU = 'E'
where sub.SUB_BRAND_ID != sub2.SUB_BRAND_ID and sub.BRAND_ID = sub2.BRAND_ID
)
Select * from cte where rn=1;
CodePudding user response:
You don't have to join twice, once and pivot the result will do
SELECT
brands.brand_id as Brand_Code,
MAX(CASE WHEN rn = 1 THEN sub.sub_brand_id END) as Sub_Brand_ID1,
MAX(CASE WHEN rn = 1 THEN sub.sub_brand END) as Sub_Brand_Descrption1,
MAX(CASE WHEN rn = 2 THEN sub.sub_brand_id END) as Sub_Brand_ID2,
MAX(CASE WHEN rn = 2 THEN sub.sub_brand END) as Sub_Brand_Descrption2
from
table1 brands
INNER JOIN (SELECT *, row_number() over (partition by Brand_Code order by sub_brand_id) as rn from table2) sub on sub.brand_code = brands.brand_code and sub.langu = 'E'
GROUP BY brands.brand_id
I'm presuming you want other columns out of table1, otherwise ditch it and do the whole query out of t2, without a join:
SELECT
sub.brand_id as Brand_Code,
MAX(CASE WHEN rn = 1 THEN sub.sub_brand_id END) as Sub_Brand_ID1,
MAX(CASE WHEN rn = 1 THEN sub.sub_brand END) as Sub_Brand_Descrption1,
MAX(CASE WHEN rn = 2 THEN sub.sub_brand_id END) as Sub_Brand_ID2,
MAX(CASE WHEN rn = 2 THEN sub.sub_brand END) as Sub_Brand_Descrption2
from
(SELECT *, row_number() over (partition by Brand_Code order by sub_brand_id) as rn from table2) sub
WHERE
sub.langu = 'E'
GROUP BY sub.brand_id