SQL: From a group by get another column on max date-CodePudding

So I have a table more less like this name it tests_summary:

test_id	completed	remaining	score_date
200	6	1	2022-05-02
200	2	7	2022-05-01
200	5	10	2022-04-29
200	5	15	2022-04-28
199	10	0	2022-05-02
199	2	10	2022-05-01

First I tried to group by each test_id, SUM every completed by each test, but I want to get in the remaining column, the remaining value in the most recent score date. Example test_id 200, the most recent date is 2022-05-02, and 1 is remaining that is the value I want. In this table is already ordered but in my real table is not.
Expected result

test_id	completed	remaining
200	18	1
199	12	0

CodePudding user response：

You can use row number function to give you a 1 for every item you care about.

Note, I use * here as a shorthand but you should only select the columns you are interested in, in your final code.

SELECT * 
FROM (
  SELECT z.*, ROW_NUMBER(PARTITION BY test_id ORDER BY score_date DESC) AS RN
  FROM sometableyoudidnotname z
) X
WHERE X.RN = 1

CodePudding user response：

You can use

SELECT test_id, completed, remaining
  FROM (SELECT test_id,
               remaining,
               SUM(completed) OVER(PARTITION BY test_id) AS completed,               
               ROW_NUMBER() OVER(PARTITION BY test_id ORDER BY score_date DESC) AS rn
          FROM t) tt
 WHERE rn = 1

where you can get the latest date through descendingly sorted by the score_date while partitioning by test_id in order to group by each of them provided that your database and its version is suitable to use a window(or analytic) function

CodePudding user response：

Upd. That solution for PostgreSQL only. Thanks Hogan for correcting me

I'm not sure that my solution will be the most optimal for performance, but nevertheless it works. The main idea is to use the DISTINCT ON construct on the "remaining" column and separately summarize the "completed" column with a subquery:

WITH
    t AS (SELECT 200                AS test_id,
                 0                  AS completed,
                 1                  AS remaining,
                 '2022-05-02'::DATE AS score_date
          UNION
          SELECT 200,
                 2,
                 3,
                 '2022-05-01'
          UNION
          SELECT 199,
                 10,
                 0,
                 '2022-05-02'
          UNION
          SELECT 199,
                 2,
                 10,
                 '2022-05-01')
SELECT
    distinct on (t1.test_id)
        t1.remaining,
        t1.test_id,
        (select sum(completed) from t where t.test_id = t1.test_id)
    FROM t t1
ORDER BY t1.test_id,t1.score_date desc;

output result