I get database table which contain postal numbers and regions for my country. That table have all information but i need to change it for my purpose.
I need to eliminate all rows that have duplicate content in specific column.
Check screenshot to see result
I want to remove all duplicate rows which have postanski_broj (postal_number) the some. That number need to be unique. I try manualy to set that column to unique but i get duplicate entry when i try to execute statment.
- ID is primary key with auto increment.
- postanski_broj column is VARCHAR which represent postal_code
- naselje column is VARCHAR which represent region
One region can have one postal_code
I try
ALTER TABLE poste ADD UNIQUE INDEX idx_postanski_br (postanski_broj);
00:03:20 ALTER TABLE poste ADD UNIQUE INDEX idx_postanski_br (postanski_broj) Error Code: 1062. Duplicate entry '11158' for key 'idx_postanski_br' 0.118 sec
ALTER IGNORE TABLE poste ADD UNIQUE INDEX idx_postanski_br (postanski_broj);
00:04:17 ALTER IGNORE TABLE poste ADD UNIQUE INDEX idx_postanski_br (postanski_broj) Error Code: 1064. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IGNORE TABLE poste ADD UNIQUE INDEX idx_postanski_br (postanski_broj)' at line 1 0.00037 sec
Anyone have sugestion? Thanks
CodePudding user response:
If you have other columns with different values than the ones you've shown there (except for id
), deleting should be your last choice.
I usually would duplicate the table first:
CREATE TABLE poste_new LIKE poste;
add unique index to the newly created poste_new
table:
ALTER TABLE poste_new ADD UNIQUE INDEX idx_postanski_br (postanski_broj);
insert the data from poste
into poste_new
with IGNORE
option to skip duplicates based on the unique index:
INSERT IGNORE INTO poste_new SELECT * FROM poste;
rename the tables:
RENAME TABLE poste TO poste_old;
RENAME TABLE poste_new TO poste;
The good thing about this is that you've minimized the risk of wrong delete and if you're not satisfied with the new table, you still have the old table intact - effectively making it a backup.
CodePudding user response:
This solution can take too much time for big tables. Best way of solving this is: Remove duplicate rows in MySQL
You have to delete the rows before applying the unique constraint. Be careful applying this:
DELETE p1 FROM poste p1
INNER JOIN poste p2
WHERE
p1.id < p2.id AND
p1.postanski_broj = p2.postanski_broj;
This should remove the duplicated ones and will keep only the ones with the higher id (id=168044 in your example).