I'm handling a table like so:
Name | Status | Date |
---|---|---|
Alfred | 1 | Jan 1 2023 |
Alfred | 2 | Jan 2 2023 |
Alfred | 3 | Jan 2 2023 |
Alfred | 4 | Jan 3 2023 |
Bob | 1 | Jan 1 2023 |
Bob | 3 | Jan 2 2023 |
Carl | 1 | Jan 5 2023 |
Dan | 1 | Jan 8 2023 |
Dan | 2 | Jan 9 2023 |
I'm trying to setup a query so I can handle the following: I'd like to pull the most recent status per Name,
SELECT MAX(Date), Status, Name
FROM test_table
GROUP BY Status, Name
Additionally I'd like in the same query to be able to pull if the user has ever had a status of 2, regardless of if the most recent one is 2 or not
WITH has_2_table AS (
SELECT DISTINCT Name, TRUE as has_2
FROM test_table
WHERE Status = 2 )
And then maybe joining the above on a left join on Name?
But having these as two seperate queries and joining them feels clunky to me, especially since I'd like to add additional columns and other checks. Is there a better way to set this up in one singular query, or is this the most effecient way?
CodePudding user response:
Given that you have a different partitioning on the two aggregations, you could try going with window functions instead:
SELECT DISTINCT Name,
MAX(Date) OVER(
PARTITION BY Name, Status
) AS lastdate,
MAX(CASE WHEN Status = 2 THEN 1 ELSE 0 END) OVER(
PARTITION BY Name
) AS status2
FROM tab
CodePudding user response:
Snowflake has sophisticated aggregate functions.
Using group by
, we can get the latest status with arrays and check for a given status with boolean aggregation:
select name, max(date) max_date,
get(array_agg(status) within group (order by date desc), 0) last_status,
boolor_agg(status = 2) has_status2
from mytable
group by name
We could also use window functions and qualify
:
select name, date as max_date,
status as last_status,
boolor_agg(status = 2) over(partition by name) has_status2
from mytable
qualify rank() over(order by name order by date desc) = 1