I have two tables, one is my Main
table and one is Mapping
table. The simplified tables look like this :
Main :
VALUE | CUSTNAME
123 | ADELE
323 | GORTH
242 | GORTH
345 | VIX
...
Mapping :
ISSUER | CATEGORY
ADELE | A
GORTH | B
DENN | B
VIX | C
...
What I want to do is add a CATEGORY
column to my Main
table, so I to use a left join, but somehow it returns more rows than I originally have. And I can't check it one by one because it has around 30000 records. The increase it self is miniscule, 40 records, but it still a difference and it shows.
Things to note and things I already tried :
Main
table can hold the sameCUSTNAME
whileMapping
table is distinct, making it many-to-one relation.Mapping
table holds duplicate records, so I tried to select distinct it first before joining it- I already checked the NULL to see if the
Mapping
table miss anything, but it doesn't seems to be the case - I use both inner join and left join but it return the same result
So what did I do wrong here and how can I fix it?
My query :
SELECT A.*, B.CATEGORY AS CATEGORY
FROM Main A
LEFT JOIN
(SELECT DISTINCT * FROM Mapping) B ON A.CUSTNAME = B.Issuer
My output right now :
VALUE | CUSTNAME | CATEGORY
123 | ADELE | A
323 | GORTH | B
242 | GORTH | B
345 | VIX | C
... with extra 40 records
My expected output :
VALUE | CUSTNAME | CATEGORY
123 | ADELE | A
323 | GORTH | B
242 | GORTH | B
345 | VIX | C
... without extra 40 records
CodePudding user response:
You can do it as follows if you are looking to eliminate duplicates Mapping.Issuer ONLY.
SELECT A.*, B.CATEGORY AS CATEGORY
FROM Main A
LEFT JOIN
(SELECT Issuer, MAX(CATEGORY) AS CATEGORY FROM Mapping group by Issuer) B ON A.CUSTNAME = B.Issuer
Probably you have data on Mapping table like :
insert into Mapping values
('ADELE','A'),
('GORTH','B'),
('DENN','B'),
('VIX','C'),
('VIX','D');
That means ('VIX','C'), ('VIX','D')
are not duplicates
CodePudding user response:
This:
SELECT DISTINCT * FROM Mapping
won't prevent duplicates. Distinct *
means that the tuple(combination) of ALL columns will be considered as unique; However, if there are two rows with same Issuer but a different value in any of the other columns, it will still return multiple rows. Like this:
Issuer | ManufactureId
5623 894
5623 895
This, in turn, will make one row from A turn into multiple from the left join
.