I'd like to select all rows from a table until (and including) a certain value is reached per partition. In this case all rows per id that precede when status has the value 'b' for the last time. Note: the timestamp is in order per id
id | name | status | status | timestamp |
---|---|---|---|---|
1 | Sta | open | a | 10:50:09.000000 |
1 | Danny | open | c | 10:50:19.000000 |
1 | Elle | closed | b | 10:50:39.000000 |
2 | anton | closed | a | 16:00:09.000000 |
2 | jill | done | b | 16:00:19.000000 |
2 | tom | open | b | 16:05:09.000000 |
2 | bill | open | c | 16:07:09.000000 |
3 | ann | done | b | 08:00:13.000000 |
3 | stef | done | b | 08:12:13.000000 |
3 | martin | open | b | 08:25:13.000000 |
3 | jeff | open | a | 09:00:13.000000 |
3 | luke | open | c | 09:07:13.000000 |
3 | karen | open | c | 09:15:13.000000 |
3 | lucy | open | a | 10:00:13.000000 |
The output would look like this:
id | name | status | status | timestamp |
---|---|---|---|---|
1 | Sta | open | a | 10:50:09.000000 |
1 | Danny | open | c | 10:50:19.000000 |
1 | Elle | closed | b | 10:50:39.000000 |
2 | anton | closed | a | 16:00:09.000000 |
2 | jill | done | b | 16:00:19.000000 |
2 | tom | open | b | 16:05:09.000000 |
3 | ann | done | b | 08:00:13.000000 |
3 | stef | done | b | 08:12:13.000000 |
3 | martin | open | b | 08:25:13.000000 |
I've tried to solve this using qualify with rank etc. but unfortunately with no succes. would be appreciated if somebody would be able to help me!
CodePudding user response:
all rows per id that precede when status has the value 'b' for the last time is the same as no rows before value 'b' occurs the first time when you revert the sort order:
SELECT *
FROM tab
QUALIFY -- tag the last 'b'
Count(CASE WHEN status = 'b' THEN 1 end)
Over (PARTITION BY id
ORDER BY timestamp DESC
ROWS Unbounded Preceding) > 0
ORDER BY id, timestamp
;
This will not return ids where no 'b' exists.
If you want to return those, too, add another condition to QUALIFY:
OR -- no 'b' found
Count(CASE WHEN status = 'b' THEN 1 end)
Over (PARTITION BY id) = 0
As both counts share the same partition, it's still a single STAT step in Explain.