Imagine the following two CTEs:
- table_a shows the number of exempt students and the mean exam score (excluding exempt students) per category, per month.
- table_b shows a breakdown of each student information (using the month of January only as an example). Note that output in table_a is obtained through information of table_b.
As exempt students were under mitigating circumstances, their exam scores won't be considered. Instead, I would like to assign to them the mean score of all students that were examined as shown below:
What is the best way of achieving this? I'm struggling to see how to join the tables when for one of the table the timestamp is actually a truncated date as it provides grouped information.
with table_a (exam_month, category, "# exempt students", "mean score (excl. exempt students")
as (values
('2022-01-01 12:00:00'::timestamp, 'A', 2, 61.8),
('2022-02-01 12:00:00'::timestamp, 'A', 1, 63.1),
('2022-03-01 12:00:00'::timestamp, 'A', 2, 61.1),
('2022-01-01 12:00:00'::timestamp, 'B', 4, 73.8),
('2022-02-01 12:00:00'::timestamp, 'B', 3, 71.3),
('2022-03-01 12:00:00'::timestamp, 'B', 2, 72.5)
),
table_b (student_id, exam_timestamp, category, score, student_status)
as (values
(1, '2022-01-10 09:30:00'::timestamp, 'A', 13.1, 'exempt'),
(2, '2022-01-12 12:00:00'::timestamp, 'A', 63.7, 'examined'),
(3, '2022-01-14 09:00:00'::timestamp, 'A', 59.2, 'examined'),
(4, '2022-01-16 14:30:00'::timestamp, 'A', 62.8, 'examined'),
(5, '2022-01-18 12:30:00'::timestamp, 'A', 41.2, 'exempt'),
(6, '2022-01-20 12:00:00'::timestamp, 'A', 61.5, 'examined')
)
CodePudding user response:
You can join by using date_trunc()
and by casting to date
:
with table_a (exam_month, category, "# exempt students", "mean score (excl. exempt students")
as (
values
('2022-01-01 12:00:00'::timestamp, 'A', 2, 61.8),
('2022-02-01 12:00:00'::timestamp, 'A', 1, 63.1),
('2022-03-01 12:00:00'::timestamp, 'A', 2, 61.1),
('2022-01-01 12:00:00'::timestamp, 'B', 4, 73.8),
('2022-02-01 12:00:00'::timestamp, 'B', 3, 71.3),
('2022-03-01 12:00:00'::timestamp, 'B', 2, 72.5)
), table_b (student_id, exam_timestamp, category, score, student_status)
as (
values
(1, '2022-01-10 09:30:00'::timestamp, 'A', 13.1, 'exempt'),
(2, '2022-01-12 12:00:00'::timestamp, 'A', 63.7, 'examined'),
(3, '2022-01-14 09:00:00'::timestamp, 'A', 59.2, 'examined'),
(4, '2022-01-16 14:30:00'::timestamp, 'A', 62.8, 'examined'),
(5, '2022-01-18 12:30:00'::timestamp, 'A', 41.2, 'exempt'),
(6, '2022-01-20 12:00:00'::timestamp, 'A', 61.5, 'examined')
)
select b.student_id, b.exam_timestamp, b.category,
case
when b.student_status = 'exempt' then a."mean score (excl. exempt students"
else b.score
end as score,
b.student_status
from table_b b
join table_a a
on a.category = b.category
and date_trunc('month', b.exam_timestamp)::date = a.exam_month::date;
All that said, you do not need table_a
for this calculation:
with table_b (student_id, exam_timestamp, category, score, student_status)
as (
values
(1, '2022-01-10 09:30:00'::timestamp, 'A', 13.1, 'exempt'),
(2, '2022-01-12 12:00:00'::timestamp, 'A', 63.7, 'examined'),
(3, '2022-01-14 09:00:00'::timestamp, 'A', 59.2, 'examined'),
(4, '2022-01-16 14:30:00'::timestamp, 'A', 62.8, 'examined'),
(5, '2022-01-18 12:30:00'::timestamp, 'A', 41.2, 'exempt'),
(6, '2022-01-20 12:00:00'::timestamp, 'A', 61.5, 'examined')
)
select b.student_id, b.exam_timestamp, b.category,
(case
when b.student_status = 'exempt'
then sum(case
when b.student_status = 'exempt' then 0
else b.score
end) over w
/
sum((b.student_status = 'examined')::int) over w
else b.score
end)::numeric(4,1) as score,
b.student_status
from table_b b
window w as (partition by date_trunc('month', b.exam_timestamp), category)
;
db<>fiddle here