Retrieve SQL records where only the last unique entries match criteria in PostgreSQL

Time:06-16

I've got a long table that tracks the history of a numerical 'state' value (0=new, 1=setup mode, 2=retired, 3=active, 4=inactive) for a collection of 'devices'. These devices may be activated/deactivated throughout the year, so the table is a continuous collection of state changes, mostly states 3 and 4, ordered by id, with a timestamp at the end. For example:

    id     | device_id | new_state |            when            
-----------+-----------+-----------+----------------------------
 218010581 |      2505 |         0 | 2022-06-06 16:28:11.174084
 218010580 |      2505 |         1 | 2022-06-06 16:28:11.174084
 218010634 |      2505 |         3 | 2022-06-06 16:29:25.129019
 218087737 |       659 |         3 | 2022-06-07 22:55:48.705208
 218087744 |      1392 |         3 | 2022-06-07 22:55:59.016974
 218087757 |      1556 |         3 | 2022-06-07 22:56:09.811876
 218087758 |      2071 |         1 | 2022-06-07 22:56:20.850095
 218087765 |      2071 |         3 | 2022-06-07 22:56:29.122074

When I want to look for a list of devices and see their 'history', I know I can use something like:

select * from devstatechange where device_id=2345 order by "when";

    id     | device_id | new_state |            when            
-----------+-----------+-----------+----------------------------
 184682659 |      2345 |         0 | 2021-05-27 17:03:36.894429
 184682658 |      2345 |         1 | 2021-05-27 17:03:36.894429
 184684721 |      2345 |         3 | 2021-05-27 17:31:01.968314
 194933399 |      2345 |         4 | 2021-08-31 23:30:05.555407
 195213746 |      2345 |         3 | 2021-09-03 16:53:39.043005
 206278232 |      2345 |         4 | 2021-12-31 22:30:08.820068
 206515355 |      2345 |         3 | 2022-01-03 16:06:01.223759
 215709888 |      2345 |         4 | 2022-04-30 23:30:30.309389
 215846807 |      2345 |         3 | 2022-05-02 19:40:31.525514


select * from devstatechange where device_id=2351 order by "when";

    id     | device_id | new_state |            when            
-----------+-----------+-----------+----------------------------
 186091252 |      2351 |         0 | 2021-06-09 15:36:02.775035
 186091253 |      2351 |         1 | 2021-06-09 15:36:02.775035
 186091349 |      2351 |         3 | 2021-06-09 15:37:56.965599
 197880878 |      2351 |         4 | 2021-09-30 23:30:06.691835
 197945073 |      2351 |         3 | 2021-10-01 15:32:35.907913
 208981857 |      2351 |         4 | 2022-01-31 22:30:09.521694
 209722639 |      2351 |         3 | 2022-02-09 15:20:12.412816
 217666572 |      2351 |         4 | 2022-05-31 23:30:30.881928

What I am really looking for is a query that returns a unique list of devices whose latest dated entry has a state of '4' ('inactive'), excluding any device whose latest entry has a different state.

So, using the above data samples, even though both devices 2345 and 2351 have states of 3 and 4 throughout their history, only device 2351 has its last dated entry with a state of 4, meaning it is currently in an 'inactive' state. Device 2345 would not appear in the result set since its last dated entry has a state of 3: it's still active.

Stabbing in the dark, I've tried variants of:

SELECT DISTINCT * FROM devstatechange WHERE MAX("when") AND new_state=4 ORDER BY "when";
SELECT DISTINCT device_id, new_state, MAX("when") FROM devstatechange WHERE new_state=4  ORDER BY "when";

with obviously no success.

I'm thinking I might need to 'group' the entries together, but I don't know how to specify 'return last entry only if new_state = 4' in SQL, or rather PostgreSQL.

Any tidbits or pokes in the right direction would be appreciated.

CodePudding user response:

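SELECT * FROM (
  -- newest row per device: the first row of each device_id group in this ordering
  SELECT DISTINCT ON (device_id)
  *
  FROM devstatechange
  ORDER BY device_id, "when" DESC
) AS latest  -- a FROM subquery needs an alias before PostgreSQL 16
WHERE new_state = 4;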

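The DISTINCT ON clause together with the ORDER BY pulls the newest row for each device: rows are sorted by device_id and then by "when" descending, and only the first row of each device_id group is kept. The outer query then keeps only those rows whose new_state is 4.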
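To try this against the sample data from the question, here is a minimal sketch. The temporary table and its column types are assumptions (the question does not show the table definition), and only the last few rows of devices 2345 and 2351 are loaded.

```sql
-- Assumed schema: the column types are guesses, not taken from the question.
CREATE TEMP TABLE devstatechange (
  id        bigint PRIMARY KEY,
  device_id integer   NOT NULL,
  new_state smallint  NOT NULL,
  "when"    timestamp NOT NULL
);

-- A subset of the sample rows shown in the question.
INSERT INTO devstatechange (id, device_id, new_state, "when") VALUES
  (206515355, 2345, 3, '2022-01-03 16:06:01.223759'),
  (215709888, 2345, 4, '2022-04-30 23:30:30.309389'),
  (215846807, 2345, 3, '2022-05-02 19:40:31.525514'),
  (208981857, 2351, 4, '2022-01-31 22:30:09.521694'),
  (209722639, 2351, 3, '2022-02-09 15:20:12.412816'),
  (217666572, 2351, 4, '2022-05-31 23:30:30.881928');

-- Newest row per device, then keep only devices whose newest state is 4.
SELECT *
FROM (
  SELECT DISTINCT ON (device_id) *
  FROM devstatechange
  ORDER BY device_id, "when" DESC
) AS latest
WHERE new_state = 4;
-- Returns one row: id 217666572, device_id 2351, new_state 4.
```

Device 2345 is dropped because its newest row (2022-05-02) has state 3, even though it has older rows with state 4.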

CodePudding user response:

You may use the row_number() window function, partitioned by device_id and ordered by "when" descending.

Try the following CTE:

with cte as 
(
  select id, device_id, new_state, "when",
  -- rn = 1 marks the newest row of each device
  row_number() over (partition by device_id order by "when" desc) as rn
  from devstatechange
)
select * from cte where rn = 1 and new_state = 4;

See a demo from db-fiddle.
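One detail worth considering: the sample data contains rows for the same device with identical timestamps (the state 0 and state 1 rows). If two rows ever tie for the newest timestamp, which one gets rn = 1 is not defined. The sketch below adds id as a tie-breaker purely to make the result deterministic. That choice is an assumption, since the sample shows id order does not always match the order of state changes. The inline VALUES list is only a stand-in for the real table so the statement runs on its own.

```sql
-- Stand-in for the real table, using device 2351's rows from the question.
WITH devstatechange (id, device_id, new_state, "when") AS (
  VALUES
    (186091252, 2351, 0, timestamp '2021-06-09 15:36:02.775035'),
    (186091253, 2351, 1, timestamp '2021-06-09 15:36:02.775035'),
    (186091349, 2351, 3, timestamp '2021-06-09 15:37:56.965599'),
    (217666572, 2351, 4, timestamp '2022-05-31 23:30:30.881928')
),
ranked AS (
  SELECT id, device_id, new_state, "when",
         -- id DESC only breaks ties between rows with equal timestamps
         row_number() OVER (
           PARTITION BY device_id
           ORDER BY "when" DESC, id DESC
         ) AS rn
  FROM devstatechange
)
SELECT id, device_id, new_state, "when"
FROM ranked
WHERE rn = 1 AND new_state = 4;
-- Returns one row: id 217666572, device_id 2351, new_state 4.
```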
