Home > Net >  Delete duplicate rows but keep one without having a uniqe column
Delete duplicate rows but keep one without having a uniqe column

Time:11-18

I have a table with duplicate entries which I want to remove, keeping none but one of the duplicate entries. As you can see, they are exactly the same in every colum, there's no way to differentiate them:

enter image description here

I've used this query to figure out how many duplicates I have:

select url_rewrite_id, category_id, product_id, count(*) cnt
from catalog_url_rewrite_product_category
group by url_rewrite_id, category_id, product_id
having cnt > 1
order by cnt desc

I could use a variant of this to remove all duplicates:

delete
from catalog_url_rewrite_product_category
where url_rewrite_id in (
    select url_rewrite_id
    from catalog_url_rewrite_product_category
    group by url_rewrite_id, category_id, product_id
    having count(*) > 1
)

The problem I have with this is it would remove all entries that are duplicates and wouldn't keep the last one.

Earlier questions (here and here) assume a unique id column of sorts which is not the case with the data structure I have.

CodePudding user response:

Have you tried one of the solutions I read in the thread you shared (here)?

ALTER IGNORE TABLE jobs
ADD UNIQUE INDEX idx_name (site_id, title, company);

I think this would work.

  • Related