From a table like this:
id | status | date | category |
---|---|---|---|
1 | PENDING | 2022-07-01 | XYZ |
2 | DONE | 2022-07-04 | XYZ |
3 | PENDING | 2022-07-03 | DEF |
4 | DONE | 2022-07-08 | DEF |
I would like to get the most recent records within each category (here 2 and 4). But there are at least two factors that complicate things.
First, there might be more than two records in the same category. (The records come in pairs.)
id | status | date | category |
---|---|---|---|
1 | PENDING | 2022-07-01 | XYZ |
2 | PENDING | 2022-07-02 | XYZ |
3 | FAILED | 2022-07-04 | XYZ |
4 | FAILED | 2022-07-05 | XYZ |
5 | PENDING | 2022-07-03 | DEF |
6 | DONE | 2022-07-08 | DEF |
In this case, I'd have to get 3, 4, and 6. Were there six records in the XYZ category, I'd have to get the most recent three.
And, secondly, the date could be the same for the most recent records within a category.
I tried something like this:
WITH temp AS (
SELECT *,
dense_rank() OVER (PARTITION BY category ORDER BY date DESC) rnk
FROM tbl
)
SELECT *
FROM temp
WHERE rnk = 1;
But this fails when there are more than 2 records in a category and I need to get the most recent two.
EDIT:
Eli Johnson has pointed out in a comment that there should be information about which messages are pairs. Of course! I digged around a bit, and after a join or two there is.
id | status | date | category | prev_id |
---|---|---|---|---|
1 | PENDING | 2022-07-01 | XYZ | {} |
2 | PENDING | 2022-07-02 | XYZ | {} |
3 | FAILED | 2022-07-04 | XYZ | {1} |
4 | FAILED | 2022-07-05 | XYZ | {2} |
5 | PENDING | 2022-07-03 | DEF | {} |
6 | DONE | 2022-07-08 | DEF | {5} |
CodePudding user response:
The requirements are more hard-coded here then following proper design. Based on what has been proposed in the question, I just tweaked it a little bit to get last records.
Assuming that records are always in pair, as mentioned in the question.
WITH temp AS (
SELECT *,
row_number() OVER (PARTITION BY category ORDER BY date1 DESC) rnk,
count(1) over (partition by category) cnt
FROM status
)
SELECT *
FROM temp
WHERE rnk*2 <= cnt;
Refer fiddle here.