Home > Mobile >  SQL selecting rows by most recent date with one unique column (selecting 3 columns)
SQL selecting rows by most recent date with one unique column (selecting 3 columns)

Time:06-30

In my postgres db I have a table my_table with columns ean, price, dtmod

ean price dtmod
155105 2.19 2022-06-22 03:03:25.43045 02
155105 -0.01 2022-06-28 02:27:15.478475 02
155105 1.45 2022-06-28 15:11:35.558692 02
114695 -0.01 2022-06-28 02:27:15.448782 02
114695 5.99 2022-06-28 15:11:27.689637 02
213786 -0.01 2022-06-28 02:27:15.468477 02
213786 2.39 2022-06-28 15:11:32.284314 02

How can I filter data in this table in order to get the latest dtmod for each unique ean (and also showing the price), ordered by ean?

Expected Output:

ean price dtmod
114695 5.99 2022-06-28 15:11:27.689637 02
155105 1.45 2022-06-28 15:11:35.558692 02
213786 2.39 2022-06-28 15:11:32.284314 02

I have found a possible solution, but I think it is not the most concise and efficient one.

step 1:
extract the latest date for all the ean

select ean, max(dtmod) 
from my_table 
group by ean;
  ean   |              max              
-------- -------------------------------
 155105 | 2022-06-28 15:11:35.558692 02
 114695 | 2022-06-28 15:11:27.689637 02
 213786 | 2022-06-28 15:11:32.284314 02

step 2:
extract ean, price and dtmod from my_table where ean and price match the results from the previous query

select ean, price, dtmod 
from my_table 
where (ean, dtmod) in 
     (
     select ean, max(dtmod) 
     from my_table 
     group by ean
     ) 
order by ean;
   ean  | price  |            dtmod       
-------- -------- ------------------------------
 114695 |   5.99 | 2022-06-28 15:11:27.689637 02
 155105 |   1.45 | 2022-06-28 15:11:35.558692 02
 213786 |   2.39 | 2022-06-28 15:11:32.284314 02

Any suggestions on how to improve the query?

CodePudding user response:

Your query would be like this

SELECT DISTINCT ON (ean)
    ean, 
    LAST_VALUE(price) OVER (PARTITION BY ean ORDER BY dtmod DESC) AS price, 
    MAX(dtmod) OVER (PARTITION BY ean) AS dtmod
FROM my_table  

Fiddle

CodePudding user response:

You can use the handy FETCH FIRST <n> ROWS WITH TIES clause, combined with a ranking, given by the biggest dtmod for each "ean".

SELECT * 
FROM tab
ORDER BY ROW_NUMBER() OVER(PARTITION BY ean ORDER BY dtmod DESC)
FETCH FIRST 1 ROWS WITH TIES

Check the demo here.

CodePudding user response:

The classic way to do this and still very useful in this case:

select 
  ean, 
  price,
  dtmod
from my_table t0
where
  t0.dtmod = (
    select max(dtmod) 
    from my_table t1 
    where t1.ean = t0.ean
  )
order by ean
;

Also, super helpful for everyone is to know how to use the window functions, any SQL programmer should spend some time to understand all of the window functions.

with
w_o as (
  select
  ean,
  price,
  dtmod,
  row_number() OVER (PARTITION BY ean ORDER BY dtmod DESC) rn
  from my_table
)
select
  ean,
  price,
  dtmod
from w_o
where rn = 1
order by ean
;

Note:

For testing, I added a duplicate record with same EAN, and same timestamp to the test data. The first query gives both the second gives only one. You can determine which one it returns by tuning the ordering.

Edit 1. I forgot the order by. :)

Edit 2. Added window function information.

http://sqlfiddle.com/#!17/8278a/1/1

CodePudding user response:

You can use a simple distinct on without window functions. (see demo )

select distinct on (ean) 
       ean
     , price
     , dtmod 
  from my_table
 order by ean, dtmod desc; 
  • Related