Removing duplicated row having same keys in different column-CodePudding

I am writing a code to look for duplicated Sub_brand_Descriptions in the same table having the same brand_code. I was able to produce the table below (which is what I need) however the first row and second row is the same because I am joining the table with itself. Is there a way to remove any of the duplicated row as distinct will not work (Different sub-brand id in the same column but because it appears in the row below in a different column, technically they are the same)?

select distinct
brands.BRAND_ID as Brand_Code,
sub.SUB_BRAND_ID as Sub_Brand_ID1,
sub.SUB_BRAND as Sub_Brand_Descrption1,
sub2.SUB_BRAND_ID as Sub_Brand_ID2,
sub2.SUB_BRAND as Sub_Brand_Descrption2

from table1 as brands
inner join table2 as sub on sub.BRAND_ID = brands.BRAND_ID and sub.LANGU = 'E'
inner join table2 as sub2 on sub2.SUB_BRAND = sub.SUB_BRAND and sub2.LANGU = 'E'

where sub.SUB_BRAND_ID != sub2.SUB_BRAND_ID and sub.BRAND_ID = sub2.BRAND_ID

Brand_Code	Sub_Brand_ID1	Sub_Brand_Descrption1	Sub_Brand_ID2	Sub_Brand_Descrption2
ABC	X123	X123ABC	Y123	X123ABC
ABC	Y123	X123ABC	X123	X123ABC

Desired output:

Brand_Code	Sub_Brand_ID1	Sub_Brand_Descrption1	Sub_Brand_ID2	Sub_Brand_Descrption2
ABC	X123	X123ABC	Y123	X123ABC

Source data: Table 1:

Brand_Code	label
ABC	1
CDE	1
EFG	2

source Table 2:

Brand_Code	Sub_Brand_ID	Sub_Brand	Language
ABC	X123	X123ABC	E
ABC	Y123	X123ABC	E
BBC	X223	H23ABC	E
BBC	Y223	H23ABC	E

CodePudding user response：

add a function for row_number in your query and filter it for 1, see example below: (you can change order_by_clause as per your requirement)

with cte as (

select distinct
brands.BRAND_ID as Brand_Code,
sub.SUB_BRAND_ID as Sub_Brand_ID1,
sub.SUB_BRAND as Sub_Brand_Descrption1,
sub2.SUB_BRAND_ID as Sub_Brand_ID2,
sub2.SUB_BRAND as Sub_Brand_Descrption2,
row_number() over (partition by Sub_Brand_Descrption1 order by Sub_Brand_ID1) as rn

from table1 as brands
inner join table2 as sub on sub.BRAND_ID = brands.BRAND_ID and sub.LANGU = 'E'
inner join table2 as sub2 on sub2.SUB_BRAND = sub.SUB_BRAND and sub2.LANGU = 'E'

where sub.SUB_BRAND_ID != sub2.SUB_BRAND_ID and sub.BRAND_ID = sub2.BRAND_ID

)

Select * from cte where rn=1;

CodePudding user response：

You don't have to join twice, once and pivot the result will do

SELECT 
  brands.brand_id as Brand_Code,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand_id END) as Sub_Brand_ID1,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand END) as Sub_Brand_Descrption1,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand_id END) as Sub_Brand_ID2,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand END) as Sub_Brand_Descrption2

from 
  table1 brands
  INNER JOIN (SELECT *, row_number() over (partition by Brand_Code order by sub_brand_id) as rn from table2) sub on sub.brand_code = brands.brand_code and sub.langu = 'E'
  
  

GROUP BY brands.brand_id

I'm presuming you want other columns out of table1, otherwise ditch it and do the whole query out of t2, without a join:

SELECT 
  sub.brand_id as Brand_Code,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand_id END) as Sub_Brand_ID1,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand END) as Sub_Brand_Descrption1,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand_id END) as Sub_Brand_ID2,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand END) as Sub_Brand_Descrption2

from 
  (SELECT *, row_number() over (partition by Brand_Code order by sub_brand_id) as rn from table2) sub
WHERE
  sub.langu = 'E'
  
  

GROUP BY sub.brand_id