For example, I have this table:
seenID | personID | seenTime
------- ---------- ---------
108 3 13:34
109 2 13:56
110 3 14:22
111 3 14:31
112 4 15:04
113 2 15:52
114 3 15:55
I want to get the row where the PersonID occurs for the last but one time.
So the desired output needs to be (for one of the person ID) :
seenID | personID | seenTime
------- ---------- ---------
111 3 14:31
** I am using this query to get the nth occurrence:
SELECT seenID,personID,seenTime FROM
(
SELECT ROW_NUMBER() OVER(PARTITION BY personID ORDER BY personID) AS row_num,*
FROM "YourTableName"
)AS T
WHERE row_num = 2
but to get the last but one occurrence, I will not be knowing the value of n and n value will be different for different personID.
CodePudding user response:
You can try using row_number()
with cte as
(
select seenid,personid,seentime,
row_number() over(partition by personid order by seentime desc) as rn
from tablename
)
select * from cte where rn=1
CodePudding user response:
You can use min
/max
(or min_by
/max_by
if needed) overload which returns an array with n largest values and combine it with element_at
(maybe will need to wrap it in try()
, but for Athena flavour of presto works fine as is):
WITH dataset(seenID,personID,seenTime) AS (
VALUES
(108,3,'13:34'),
(109,2,'13:56'),
(110,3,'14:22'),
(111,3,'14:31'),
(112,4,'15:04'),
(113,2,'15:52'),
(114,3,'15:55')
)
SELECT personID, element_at(min(seenTime, 2), 2) second_from_start, element_at(max(seenTime, 2), 2) second_from_end
FROM dataset
group by personID
Output:
personID | second_from_start | second_from_end |
---|---|---|
3 | 14:22 | 14:31 |
2 | 15:52 | 13:56 |
4 |
CodePudding user response:
You are on the right track, but you want a descending order by
. That way, you always know you want the second one:
SELECT seenID, personID, seenTime
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY personID
ORDER BY personID DESC
--------------------------------------------------^
) AS row_num,
t.*
FROM "YourTableName" t
) AS T
WHERE row_num = 2