Home > Net >  Delete duplicate entries while keeping one
Delete duplicate entries while keeping one

Time:12-15

I have a table but it has no unique ID or primary key.

It has 3 columns in total.

name user_id role_id
ben 1 2
ben 1 2
sam 1 3

I'd like to remove one entry with the name Ben.

So output would look like this

name user_id role_id
ben 1 2
sam 1 3

Most of the examples shows deleting duplicate entries with ID or primary key. However how would I retain one entry whilest removing the other ones?

Using the following query I was able to get duplicated rows

SELECT name, user_id, role_id, count(*) FROM some_table
GROUP BY name, user_id, role_id
HAVING count(*) > 1

To clarify, I am looking to delete these rows.

Prefer not creating a new table.

CodePudding user response:

If you don't have to worry about other users accessing the table -

CREATE TABLE `new_table` AS
SELECT DISTINCT `name`, `user_id`, `role_id`
FROM `old_table`;

RENAME TABLE
    `old_table` TO `backup`,
    `new_table` TO `old_table`;

Or you could use your duplicates query to output lots of single row delete queries -

SELECT
    `name`,
    `user_id`,
    `role_id`,
    COUNT(*),
    CONCAT('DELETE FROM some_table WHERE name=\'', `name`, '\' AND user_id=\'', `user_id`, '\' AND role_id=\'', `role_id`, '\' LIMIT 1;') AS `delete_stmt`
FROM `some_table`
GROUP BY `name`, `user_id`, `role_id`
HAVING COUNT(*) > 1;

Or you could temporarily add a SERIAL column and then remove it after the delete -

ALTER TABLE `some_table` ADD COLUMN `temp_id` SERIAL;

DELETE `t1`.* 
FROM `some_table` `t1`
LEFT JOIN (
    SELECT MIN(`temp_id`) `min_temp_id`
    FROM `some_table`
    GROUP BY `name`, `user_id`, `role_id`
) `t2` ON `t1`.`temp_id` = `t2`.`min_temp_id`
WHERE `t2`.`min_temp_id` IS NULL;

ALTER TABLE `some_table` DROP COLUMN `temp_id`;

CodePudding user response:

I would take the duplicate records and put them into another table.

SELECT  
   name, 
   user_id, 
   role_id
   INTO some_new_table
FROM some_table
GROUP BY name, user_id, role_id
HAVING count(*) > 1

Then you can delete those records from your source table

DELETE a
FROM some_table a
INNER JOIN some_new_table b
  ON a.name = b.name
    AND a.user_id = b.user_id
    AND a.role_id = b.role_id

Finally you can then insert the deduped records back into your table.

INSERT INTO some_table 
SELECT 
  name, 
  user_id, 
  role_id
FROM some_new_table 

If the volume of dupes is very large you could also just create a new table with the deduped data. Truncate \ Drop the old table and then Insert \ Rename from the new table.

  • Related