Home > OS >  How to drop duplicated rows on SQL Server
How to drop duplicated rows on SQL Server

Time:03-29

I have a table that looks like this: enter image description here

The problem is that down below the table there are a numbers that repeat itself. What I want to do is simple drop those duplicated numbers, so I could create a primary key.

I tried to do something like this:

SELECT DISTINCT geolocation_zip_code_prefix
INTO geolocation_
FROM geolocation;

SELECT  a.geolocation_zip_code_prefix, b.geolocation_city, b.geolocation_lat, b.geolocation_lng, b.geolocation_state
INTO geolocation_dist
FROM geolocation_ AS a
INNER JOIN geolocation AS b
ON a.geolocation_zip_code_prefix = b.geolocation_zip_code_prefix;

With that, I'd have a non duplicated table named geolocation_dist. But when I do that, the geolocation_dist table keeps all the rows. I don't know, maybe I'm doing something wrong in the JOIN. Does anyone know what I'm doing wrong or know another way to drop the duplicates?

CodePudding user response:

If you only want to select with unique zipcodes.
Then you could use the ROW_NUMBER = 1 trick for that.

WITH CTE AS (
  SELECT *
  , rn = ROW_NUMBER() OVER (PARTITION BY geolocation_zip_code_prefix ORDER BY geolocation_lat) 
  FROM geolocation
) 
SELECT 
  geolocation_zip_code_prefix
, geolocation_city
, geolocation_lat
, geolocation_lng
, geolocation_state
FROM CTE
WHERE rn = 1;

And here's a simplified test snippet to show that inner joining a table with unique values doesn't make the result unique.

declare @Table1 table (Col1 int);
declare @Table2 table (Col2 int);
insert into @Table1 (Col1) values (1),(2),(2),(3),(3),(3);
insert into @Table2 (Col2) select distinct Col1 from @Table1;

select * from @Table2;

select *
from @Table1 t1
join @Table2 t2 on t2.Col2 = t1.Col1;
  • Related