By having a table with thousands of Villages of course there are duplicates. In order to use them as mapping table it's necessary to mark them accordingly.
Table looks as follows;
id | country | name | notUnique |
---|---|---|---|
1234567 | Peru | Arenal | null |
1234568 | Mexico | Arenal | null |
1234569 | Honduras | Arenal | null |
1234570 | USA | Pasadena | null |
1234571 | Canada | Pasadena | null |
1234572 | Philippines | Pasadeña | null |
1234573 | Poland | Susk Nowy | null |
now I use my code snippet to fill out the not Unique:
-- generate RowSum:
update
Mapping.dbo.OsmGlobal
set
notUnique = 1
from
Mapping.dbo.OsmGlobal osm
join (
SELECT [name], COUNT([name]) RowSum
FROM Mapping.dbo.OsmGlobal osm
Group by [name]
) cnt
on osm.[name] = cnt.[name]
where cnt.RowSum > 1
go
obviously it results in result bellow, as there is no case insensitivity;
id | country | name | notUnique |
---|---|---|---|
1234567 | Peru | Arenal | 1 |
1234568 | Mexico | Arenal | 1 |
1234569 | Honduras | Arenal | 1 |
1234570 | USA | Pasadena | 1 |
1234571 | Canada | Pasadena | 1 |
1234572 | Philippines | Pasadeña | null |
1234573 | Poland | Susk Nowy | null |
To include 'Pasadeña' in my result groups of the 'Pasadena' I tried to add a collate Latin1_General_CI_AI
. That is the point where I fail. Any help is highly welcome.
CodePudding user response:
I would use COLLATE
on the name
column, along with exists logic:
UPDATE Mapping.dbo.OsmGlobal o1
SET notUnique = 1
WHERE EXISTS (SELECT 1 FROM Mapping.dbo.OsmGlobal t2
WHERE t2.name COLLATE Latin1_General_CI_AI =
t1.name COLLATE Latin1_General_CI_AI AND
t2.id <> t1.id);
In plain English, the above query says to set the notUnique
flag to 1 for any record for which we can find another record (different id
) having the same name, as collated using Latin1_General_CI_AI
.
CodePudding user response:
As mentioned by @TimBiegeleisen, you should use a case-insensitive collation.
But you can do this more simply with an updatable CTE
WITH osm AS (
SELECT *,
COUNT(*) OVER (PARTITION BY (name COLLATE Latin1_General_CI_AI)) RowSum
FROM Mapping.dbo.OsmGlobal osm
)
UPDATE osm
SET
notUnique = 1
WHERE osm.RowSum > 1;