So a little context, I have 2 tables:
Table "Class" where I store a bunch of classes.
id serial4 // -> This is the PK
name varchar(64)
code varchar(64)
And Table "ClassEvent", where I store events related to classes, such as "started" and "ended".
id serial4
class_id int4 NOT NULL // -> This is a FK to the Class table
event_type varchar(1) NOT NULL // -> Here we store 's' for started and 'e' for ended.
Now, I need a query to get the amount of times each class has been started and ended. This works:
select
c.code,
c.name,
count(started.id) "started"
from "class" c
left join (select id, class_id, event_type from "class_event" where event_type = 's') started
on started.klass_id = c.id
group by c.code, c.name
order by started desc;
But when I do exactly the same to get the amount of ended classes it shows incorrect amounts:
select
c.code,
c.name,
count(started.id) "started",
count(ended.id) "ended"
from "class" c
left join (select id, class_id, event_type from "class_event" where event_type = 's') started
on started.klass_id = c.id
left join (select id, class_id, event_type from "class_event" where event_type = 'e') ended
on ended.klass_id = c.id
group by c.code, c.name
order by started desc;
Also, the query takes significantly more time to execute. Is there anything I'm missing?
CodePudding user response:
You can try to use condition aggregate function
select
c.code,
c.name,
count(CASE WHEN event_type = 's' THEN ended.id END) "started",
count(CASE WHEN event_type = 'e' THEN ended.id END) "ended"
from "class" c
left join "class_event" started
on started.class_id = c.id
group by c.code, c.name
order by started desc;
CodePudding user response:
While you query the whole table it's typically cleaner and faster to aggregate the n-table before you join. See:
SELECT *
FROM class c
LEFT JOIN (
SELECT class_id AS id
, count(*) FILTER (WHERE event_type = 's') AS started
, count(*) FILTER (WHERE event_type = 'e') AS ended
FROM class_event
GROUP BY 1
) e USING (id)
ORDER BY e.started DESC NULLS LAST;
NULLS LAST
because it's conceivable that some of the classes have no related rows in table class_event
(yet), and the resulting NULL
values surely shouldn't sort on top. See:
About the aggregate FILTER
clause:
- Aggregate columns with additional (distinct) filters
- For absolute performance, is SUM faster or COUNT?
Aside:
For just a hand full of allowed values, I would consider the data type "char"
instead of varchar(1)
for event_type
. See: