I am working on partitioned tables(snapshots) of different time ranges:
1st input table:
Case | Stage | Date |
---|---|---|
A | 1 | 2022-08-01 |
B | 1 | 2022-08-01 |
C | 2 | 2022-08-01 |
2nd Input Table:
Case | Stage | Date |
---|---|---|
A | 1 | 2022-08-03 |
B | 2 | 2022-08-03 |
C | 2 | 2022-08-03 |
3rd Input Table:
Case | Stage | Date |
---|---|---|
A | 2 | 2022-08-04 |
B | 2 | 2022-08-04 |
C | 2 | 2022-08-04 |
Result Expected:
Case | Date |
---|---|
B | 2022-08-03 |
A | 2022-08-04 |
Explaination: I was hoping to fetch the first time a case moves from one stage to the other. So,
- B shifted from stage 1 to 2 from 1 Aug 2022 -> 3 Aug 2022 while the other 2 cases A and C didnt shift stage.
- A moved from Stage 1 to Stage 2 from Aug 1 -> Aug 4, hence its in the output as well.
Challenge faced: Every case can have a different date to move from Stage 1. So its not necessary that 2nd input table might apply for all cases. e.g - Case A moves out from Stage 1 -> Stage 2 on 4th august, that would be a part of the output too.
CodePudding user response:
Try the following:
with unioned as (
SELECT * FROM t1 UNION ALL
SELECT * FROM t2 UNION ALL
SELECT * FROM t3
)
SELECT `Case`
, min(date) as Date
from unioned
group by `Case`, Stage
qualify row_number() over (PARTITION BY `Case` ORDER BY Stage) = 2
It produces the following results: