Home > Net >  How to get latest Timestamp_NTZ per row in Snowflake
How to get latest Timestamp_NTZ per row in Snowflake

Time:11-22

Platform: Snowflake

I am building a report that requires the most up to date department information. Some departments have been updated and take on a new timestamp date showing they are more up to date. I've only been dabbling in SQL for a year, and so far I've had a great run with most other queries I use to automate dashboards, but this one is stumping me even after googling etc.

I tried writing this:

select MAX(EFFDT), DEPTID, EFF_STATUS, DESCR
from "DATA_LAKE"."PS_FS"."PS_DEPT_TBL";

I get this error:

SQL compilation error: [PS_DEPT_TBL.DEPTID] is not a valid group by expression 

Perhaps a partition is needed? If so, how would I structure it?

Example:

Row   Dept     EFFDT            
Row 1 DEPT78   1990-01-01 00:00:00.0000

Row 2 DEPT78   1990-01-02 00:00:00.0000   <  I want this record/row to populate and not row 1 above.

Row 3 DEPT79   1990-01-01 00:00:00.0000

Row 4 DEPT79   1990-01-02 00:00:00.0000   <  I want this record/row to populate and not row 3 above.

Any help is greatly appreciated.

I tried researching different ways to pull most up to date data but failed as I believe I need a more dense query to get desired results.

CodePudding user response:

The sample data does not line up with the SQL but it appears that you could group by the columns other than EFFDT:

select MAX(EFFDT), DEPTID, EFF_STATUS, DESCR
from "DATA_LAKE"."PS_FS"."PS_DEPT_TBL"
group by DEPTID, EFF_STATUS, DESCR
;

If that does not work it's because it's grouping on too many columns. If you just need to group by "DEPTID" for example but need the other columns:

select MAX(EFFDT), DEPTID, max(EFF_STATUS) as EFF_STATUS, max(DESCR) as DESCR
from "DATA_LAKE"."PS_FS"."PS_DEPT_TBL"
group by DEPTID
;

Someone may ask why use max when the max has no significance and there's an any_value function that would get the same result. The reason is that the any_value is currently marked as ineligible to use the result set cache. Using max or min will allow the result set cache to be used if the query is eligible for it.

  • Related