Home > Mobile >  Join 2 tables where one table may or may not have an entry
Join 2 tables where one table may or may not have an entry

Time:09-25

i have 2 tables (person activities)

Person

id name
1 John
2 Axel
3 William

Activities

activity_id person_id activity_type
1 1 Login
2 1 Visited Website
3 1 Logout
4 3 Login
5 3 Logout

As you can see John and William have both several activities. But Axel has no activities at all.

The result i try to achieve is as follows. I want to select id and name from every entry of the person table and the activity_id and activity_type from the activity table.

If the person has no activities yet, the id and name of the person should still be shown. And if the person has more then one activity, only the one with the highest id should be shown.

The result i aim for:

id name activity_id activity_type
1 John 3 Logout
2 Axel null null
3 William 5 Logout

When i try a left join:

select p.id, p.name, a.activity_id, a.activity_type
from person p left join activity a on p.id = a.person_id
order by p.id

i get this result:

id name activity_id activity_type
1 John 1 Login
1 John 2 Visited Website
1 John 3 Logout
2 Axel null null
3 William 4 Login
3 William 5 Logout

But as i only one want one entry per person i added the following where-clause:

select p.id, p.name, a.activity_id, a.activity_type
from person p left join activity a on p.id = a.person_id
where a.id = (select max(id) from activity a2 where a2.person_id = p.id)
order by p.id

This is the result:

id name activity_id activity_type
1 John 3 Logout
3 William 5 Logout

The entry for John and William are as i wanted. Only the 'last' activity is shown. The problem is Axel is not shown anymore.

Any help appriciated. Many thanks in advance!

CodePudding user response:

Just another option you could use, which might be more performant with the correct indexes in place, to use a window function to identify the activity per person

select p.id, p.name, a.activity_id, a.activity_type
from person p 
left join (
    select *, Row_Number() over(partition by person_id order by activity_id desc) rn
    from activities
)a on a.person_id=p.id and a.rn=1

CodePudding user response:

In Postgres the simplest and fastest method is usually to use a Postgres extension to SQL, distinct on:

select distinct on (p.id) p.id, p.name, a.activity_id, a.activity_type
from person p left join
     activity a
     on p.id = a.person_id
order by p.id, a.activity_id desc;

distinct on returns one row for keys specified in the parentheses. The specific row is determining by the order by.

  • Related