Home > other >  Removing duplicated row having same keys in different column
Removing duplicated row having same keys in different column

Time:11-02

I am writing a code to look for duplicated Sub_brand_Descriptions in the same table having the same brand_code. I was able to produce the table below (which is what I need) however the first row and second row is the same because I am joining the table with itself. Is there a way to remove any of the duplicated row as distinct will not work (Different sub-brand id in the same column but because it appears in the row below in a different column, technically they are the same)?

select distinct
brands.BRAND_ID as Brand_Code,
sub.SUB_BRAND_ID as Sub_Brand_ID1,
sub.SUB_BRAND as Sub_Brand_Descrption1,
sub2.SUB_BRAND_ID as Sub_Brand_ID2,
sub2.SUB_BRAND as Sub_Brand_Descrption2

from table1 as brands
inner join table2 as sub on sub.BRAND_ID = brands.BRAND_ID and sub.LANGU = 'E'
inner join table2 as sub2 on sub2.SUB_BRAND = sub.SUB_BRAND and sub2.LANGU = 'E'

where sub.SUB_BRAND_ID != sub2.SUB_BRAND_ID and sub.BRAND_ID = sub2.BRAND_ID
Brand_Code Sub_Brand_ID1 Sub_Brand_Descrption1 Sub_Brand_ID2 Sub_Brand_Descrption2
ABC X123 X123ABC Y123 X123ABC
ABC Y123 X123ABC X123 X123ABC

Desired output:

Brand_Code Sub_Brand_ID1 Sub_Brand_Descrption1 Sub_Brand_ID2 Sub_Brand_Descrption2
ABC X123 X123ABC Y123 X123ABC

Source data: Table 1:

Brand_Code label
ABC 1
CDE 1
EFG 2

source Table 2:

Brand_Code Sub_Brand_ID Sub_Brand Language
ABC X123 X123ABC E
ABC Y123 X123ABC E
BBC X223 H23ABC E
BBC Y223 H23ABC E

CodePudding user response:

add a function for row_number in your query and filter it for 1, see example below: (you can change order_by_clause as per your requirement)

with cte as (

select distinct
brands.BRAND_ID as Brand_Code,
sub.SUB_BRAND_ID as Sub_Brand_ID1,
sub.SUB_BRAND as Sub_Brand_Descrption1,
sub2.SUB_BRAND_ID as Sub_Brand_ID2,
sub2.SUB_BRAND as Sub_Brand_Descrption2,
row_number() over (partition by Sub_Brand_Descrption1 order by Sub_Brand_ID1) as rn

from table1 as brands
inner join table2 as sub on sub.BRAND_ID = brands.BRAND_ID and sub.LANGU = 'E'
inner join table2 as sub2 on sub2.SUB_BRAND = sub.SUB_BRAND and sub2.LANGU = 'E'

where sub.SUB_BRAND_ID != sub2.SUB_BRAND_ID and sub.BRAND_ID = sub2.BRAND_ID

)

Select * from cte where rn=1;

CodePudding user response:

You don't have to join twice, once and pivot the result will do

SELECT 
  brands.brand_id as Brand_Code,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand_id END) as Sub_Brand_ID1,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand END) as Sub_Brand_Descrption1,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand_id END) as Sub_Brand_ID2,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand END) as Sub_Brand_Descrption2

from 
  table1 brands
  INNER JOIN (SELECT *, row_number() over (partition by Brand_Code order by sub_brand_id) as rn from table2) sub on sub.brand_code = brands.brand_code and sub.langu = 'E'
  
  

GROUP BY brands.brand_id

I'm presuming you want other columns out of table1, otherwise ditch it and do the whole query out of t2, without a join:

SELECT 
  sub.brand_id as Brand_Code,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand_id END) as Sub_Brand_ID1,
  MAX(CASE WHEN rn = 1 THEN sub.sub_brand END) as Sub_Brand_Descrption1,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand_id END) as Sub_Brand_ID2,
  MAX(CASE WHEN rn = 2 THEN sub.sub_brand END) as Sub_Brand_Descrption2

from 
  (SELECT *, row_number() over (partition by Brand_Code order by sub_brand_id) as rn from table2) sub
WHERE
  sub.langu = 'E'
  
  

GROUP BY sub.brand_id
  • Related