Home > OS >  PostgreSQL: Replace value of one table by mean value of another table when criteria is met
PostgreSQL: Replace value of one table by mean value of another table when criteria is met

Time:07-09

Imagine the following two CTEs:

  • table_a shows the number of exempt students and the mean exam score (excluding exempt students) per category, per month.

enter image description here

  • table_b shows a breakdown of each student information (using the month of January only as an example). Note that output in table_a is obtained through information of table_b.

enter image description here

As exempt students were under mitigating circumstances, their exam scores won't be considered. Instead, I would like to assign to them the mean score of all students that were examined as shown below:

enter image description here

What is the best way of achieving this? I'm struggling to see how to join the tables when for one of the table the timestamp is actually a truncated date as it provides grouped information.

with table_a (exam_month, category, "# exempt students", "mean score (excl. exempt students")
as (values 
('2022-01-01 12:00:00'::timestamp, 'A', 2, 61.8),
('2022-02-01 12:00:00'::timestamp, 'A', 1, 63.1),
('2022-03-01 12:00:00'::timestamp, 'A', 2, 61.1),
('2022-01-01 12:00:00'::timestamp, 'B', 4, 73.8),
('2022-02-01 12:00:00'::timestamp, 'B', 3, 71.3),
('2022-03-01 12:00:00'::timestamp, 'B', 2, 72.5)
),

table_b (student_id, exam_timestamp, category, score, student_status)
as (values
(1, '2022-01-10 09:30:00'::timestamp, 'A', 13.1, 'exempt'),
(2, '2022-01-12 12:00:00'::timestamp, 'A', 63.7, 'examined'),
(3, '2022-01-14 09:00:00'::timestamp, 'A', 59.2, 'examined'),
(4, '2022-01-16 14:30:00'::timestamp, 'A', 62.8, 'examined'),
(5, '2022-01-18 12:30:00'::timestamp, 'A', 41.2, 'exempt'),
(6, '2022-01-20 12:00:00'::timestamp, 'A', 61.5, 'examined')
)

CodePudding user response:

You can join by using date_trunc() and by casting to date:

with table_a (exam_month, category, "# exempt students", "mean score (excl. exempt students")
as (
 values 
  ('2022-01-01 12:00:00'::timestamp, 'A', 2, 61.8),
  ('2022-02-01 12:00:00'::timestamp, 'A', 1, 63.1),
  ('2022-03-01 12:00:00'::timestamp, 'A', 2, 61.1),
  ('2022-01-01 12:00:00'::timestamp, 'B', 4, 73.8),
  ('2022-02-01 12:00:00'::timestamp, 'B', 3, 71.3),
  ('2022-03-01 12:00:00'::timestamp, 'B', 2, 72.5)
), table_b (student_id, exam_timestamp, category, score, student_status)
as (
 values
  (1, '2022-01-10 09:30:00'::timestamp, 'A', 13.1, 'exempt'),
  (2, '2022-01-12 12:00:00'::timestamp, 'A', 63.7, 'examined'),
  (3, '2022-01-14 09:00:00'::timestamp, 'A', 59.2, 'examined'),
  (4, '2022-01-16 14:30:00'::timestamp, 'A', 62.8, 'examined'),
  (5, '2022-01-18 12:30:00'::timestamp, 'A', 41.2, 'exempt'),
  (6, '2022-01-20 12:00:00'::timestamp, 'A', 61.5, 'examined')
)
select b.student_id, b.exam_timestamp, b.category, 
       case 
         when b.student_status = 'exempt' then a."mean score (excl. exempt students"
         else b.score
       end as score,
       b.student_status
  from table_b b
       join table_a a
         on a.category = b.category
        and date_trunc('month', b.exam_timestamp)::date = a.exam_month::date;

All that said, you do not need table_a for this calculation:

with table_b (student_id, exam_timestamp, category, score, student_status)
as (
 values
  (1, '2022-01-10 09:30:00'::timestamp, 'A', 13.1, 'exempt'),
  (2, '2022-01-12 12:00:00'::timestamp, 'A', 63.7, 'examined'),
  (3, '2022-01-14 09:00:00'::timestamp, 'A', 59.2, 'examined'),
  (4, '2022-01-16 14:30:00'::timestamp, 'A', 62.8, 'examined'),
  (5, '2022-01-18 12:30:00'::timestamp, 'A', 41.2, 'exempt'),
  (6, '2022-01-20 12:00:00'::timestamp, 'A', 61.5, 'examined')
)
select b.student_id, b.exam_timestamp, b.category, 
       (case 
         when b.student_status = 'exempt' 
           then sum(case
                      when b.student_status = 'exempt' then 0
                      else b.score
                    end) over w
                / 
                sum((b.student_status = 'examined')::int) over w
         else b.score
       end)::numeric(4,1) as score,
       b.student_status
  from table_b b
window w as (partition by date_trunc('month', b.exam_timestamp), category)
;

db<>fiddle here

  • Related