How to query for delete the same data out of the table?-CodePudding

I want to delete it for the output to look like this. In one group there should be only 1 user_id.

select distinct group_id, user_id, count(*)
from pin_users
where group_id in (select group_id from pin_users)
group by group_id, user_id
having count(*) > 1

I get all user_id, group_id and count more than 1 but I don't know how to delete duplicates and leave only 1 left.

Ps. My English is probably not perfect, pls excuse any mistakes

CodePudding user response：

Make a subquery to get a list of minimum ids for any combination of users and groups. Then remove everything else.

DELETE FROM pin_users WHERE id NOT IN (
    SELECT min(id) as id
    FROM pin_users
    GROUP BY group_id, user_id
)

CodePudding user response：

In case of sql-server, try this:

-- mockup data
declare @Raw table (
    id nvarchar(max),
    group_id nvarchar(max),
    user_id nvarchar(max)
                   )

insert into @Raw values ('p001', 'aaa', 'ava001'), ('p002', 'aaa', 'ava001'), ('p003', 'bbb', 'ava001'), ('p004', 'bbb', 'ava001');

-- check this query
with A as (
    select id, ROW_NUMBER() over(partition by group_id, user_id order by id) as RN 
from @Raw)
delete from @Raw where id in (select id from A where A.RN > 1)

-- verify table
select * from @Raw

CodePudding user response：

For example, with sql server should be something like this

delete u1
from pin_users u1
 join (
   select min(id) id, group_id, user_id
   from pin_users
   group by group_id, user_id
   ) u2
   on u1.group_id = u2.group_id
  and u1.user_id = u2.user_id
  and u1.id <> u2.id