Home > Mobile >  Left Join with Distinct Right table returns more rows in SQL Server
Left Join with Distinct Right table returns more rows in SQL Server

Time:01-25

I have two tables, one is my Main table and one is Mapping table. The simplified tables look like this :

Main :

VALUE | CUSTNAME
123 | ADELE
323 | GORTH
242 | GORTH
345 | VIX
...

Mapping :

ISSUER | CATEGORY 
ADELE | A
GORTH | B
DENN | B
VIX | C
...

What I want to do is add a CATEGORY column to my Main table, so I to use a left join, but somehow it returns more rows than I originally have. And I can't check it one by one because it has around 30000 records. The increase it self is miniscule, 40 records, but it still a difference and it shows.

Things to note and things I already tried :

  • Main table can hold the same CUSTNAME while Mapping table is distinct, making it many-to-one relation.
  • Mapping table holds duplicate records, so I tried to select distinct it first before joining it
  • I already checked the NULL to see if the Mapping table miss anything, but it doesn't seems to be the case
  • I use both inner join and left join but it return the same result

So what did I do wrong here and how can I fix it?

My query :

SELECT A.*, B.CATEGORY AS CATEGORY
FROM Main A
LEFT JOIN 
    (SELECT DISTINCT * FROM Mapping) B ON A.CUSTNAME = B.Issuer

My output right now :

 VALUE | CUSTNAME | CATEGORY
    123 | ADELE | A
    323 | GORTH | B
    242 | GORTH | B
    345 | VIX | C
    ... with extra 40 records

My expected output :

 VALUE | CUSTNAME | CATEGORY
    123 | ADELE | A
    323 | GORTH | B
    242 | GORTH | B
    345 | VIX | C
    ... without extra 40 records

CodePudding user response:

You can do it as follows if you are looking to eliminate duplicates Mapping.Issuer ONLY.

SELECT A.*, B.CATEGORY AS CATEGORY
FROM Main A
LEFT JOIN 
    (SELECT Issuer, MAX(CATEGORY) AS CATEGORY FROM Mapping group by Issuer) B ON A.CUSTNAME = B.Issuer

Probably you have data on Mapping table like :

insert into Mapping values 
('ADELE','A'),
('GORTH','B'),
('DENN','B'),
('VIX','C'),
('VIX','D');

That means ('VIX','C'), ('VIX','D') are not duplicates

demo here

CodePudding user response:

This:

SELECT DISTINCT * FROM Mapping

won't prevent duplicates. Distinct * means that the tuple(combination) of ALL columns will be considered as unique; However, if there are two rows with same Issuer but a different value in any of the other columns, it will still return multiple rows. Like this:

Issuer | ManufactureId 
5623     894 
5623     895

This, in turn, will make one row from A turn into multiple from the left join.

  • Related