Home > Blockchain >  How to remove duplicates from second select using union sql
How to remove duplicates from second select using union sql

Time:04-06

I have two tables: outfits and reactions. I need to display the first 6 outfits from a specific user that has liked specific outfits in the last 48hours then the rest of the outfits. This works great but I get duplicates from the second query where I repeat these outfits again. I want to make sure outfit.id is unique.

How can I remove these duplicates?

(select outfit.id,
        outfit.title,
        ro.type,
        ro.outfit_id,
        ro.sub,
        ro.created_at as reaction_created_at,
        0 as priority
 from outfit
          left join reaction_outfit ro on outfit.id = ro.outfit_id
 where ro.sub = '123' and ro.created_at >= (NOW() - INTERVAL '48 hours')
 order by reaction_created_at desc nulls last LIMIT 6)
union
(select outfit.id,
        outfit.title,
        ro.type,
        ro.outfit_id,
        ro.sub,
        ro.created_at as reaction_created_at,
        1 as priority
 from outfit
          left join reaction_outfit ro on outfit.id = ro.outfit_id
 where is_public = true
 order by outfit.created_at desc)
order by priority, outfit_created_at desc;

Postgres version 13.2

CodePudding user response:

By its definition, the priority column will thwart the union's ability to remove duplicates, as the two halves of the union will never be duplicate. You may try using a DISTINCT ON approach here:

select *
from
(
select distinct on (id) *
from (
(select
     outfit.id,
     outfit.title,
     ro.type,
     ro.outfit_id,
     ro.sub,
     ro.created_at as reaction_created_at,
     0 as priority
 from outfit
 left join reaction_outfit ro on outfit.id = ro.outfit_id
 where ro.sub = '123' and ro.created_at >= (NOW() - INTERVAL '48 hours'
 order by reaction_created_at desc nulls last
 limit 6
)
union
(select
     outfit.id,
     outfit.title,
     ro.type,
     ro.outfit_id,
     ro.sub,
     ro.created_at,
     1
 from outfit
 left join reaction_outfit ro on outfit.id = ro.outfit_id
 where is_public = true
)
) t
order by id, priority
) t
order by priority, reaction_created_at desc;

CodePudding user response:

Perhaps I'm missing something, but is this as straightforward as adding distinct to solve your problem?

Edit - removed priority column as called out by Tim, as this will indeed prevent duplicates from being removed where they have different priorities.

    Select distinct * from(
(select outfit.id,
        outfit.title,
        ro.type,
        ro.outfit_id,
        ro.sub,
        ro.created_at as reaction_created_at
 from outfit
          left join reaction_outfit ro on outfit.id = ro.outfit_id
 where ro.sub = '123' and ro.created_at >= (NOW() - INTERVAL '48 hours')
 order by reaction_created_at desc nulls last LIMIT 6)
union
(select outfit.id,
        outfit.title,
        ro.type,
        ro.outfit_id,
        ro.sub,
        ro.created_at as reaction_created_at
 from outfit
          left join reaction_outfit ro on outfit.id = ro.outfit_id
 where is_public = true
 order by outfit.created_at desc))
order by outfit_created_at desc;
  • Related