Home > Mobile >  How to return all records with the latest datetime value [Postgreql]
How to return all records with the latest datetime value [Postgreql]

Time:06-04

How can I return only the records with the latest upload_date(s) from the data below?

My data is as follows:

upload_date            |day_name |rows_added|row_count_delta|days_since_last_update|
----------------------- --------- ---------- --------------- ---------------------- 
2022-05-01 00:00:00.000|Sunday   |    526043|               |                      |
2022-05-02 00:00:00.000|Monday   |    467082|         -58961|                     1|
2022-05-02 15:58:54.094|Monday   |    421427|         -45655|                     0|
2022-05-02 18:19:22.894|Monday   |    421427|              0|                     0|
2022-05-03 16:54:04.136|Tuesday  |    496021|          74594|                     1|
2022-05-03 18:17:27.502|Tuesday  |    496021|              0|                     0|
2022-05-04 18:19:26.392|Wednesday|    487154|          -8867|                     1|
2022-05-05 18:18:15.277|Thursday |    489713|           2559|                     1|
2022-05-06 16:15:39.518|Friday   |    489713|              0|                     1|
2022-05-07 16:18:00.916|Saturday |    482955|          -6758|                     1|

My desired results should be:

upload_date            |day_name |rows_added|row_count_delta|days_since_last_update|
----------------------- --------- ---------- --------------- ---------------------- 
2022-05-01 00:00:00.000|Sunday   |    526043|               |                      |
2022-05-02 18:19:22.894|Monday   |    421427|              0|                     0|
2022-05-03 18:17:27.502|Tuesday  |    496021|              0|                     0|
2022-05-04 18:19:26.392|Wednesday|    487154|          -8867|                     1|
2022-05-05 18:18:15.277|Thursday |    489713|           2559|                     1|
2022-05-06 16:15:39.518|Friday   |    489713|              0|                     1|
2022-05-07 16:18:00.916|Saturday |    482955|          -6758|                     1|

NOTE only the latest upload_date for 2022-05-02 and 2022-05-03 should be in the result set.

CodePudding user response:

You can use a window function to PARTITION by day (casting the timestamp to a date) and sort the results by most recent first by ordering by upload_date descending. Using ROW_NUMBER() it will assign a 1 to the most recent record per date. Then just filter on that row number. Note that I am assuming the datatype for upload_date is TIMESTAMP in this case.

SELECT 
  *
FROM (
  SELECT 
    your_table.*, 
    ROW_NUMBER() OVER (PARTITION BY CAST(upload_date AS DATE) 
                       ORDER BY upload_date DESC) rownum 
  FROM your_table
)
WHERE rownum  = 1

CodePudding user response:

demo


WITH cte AS (
    SELECT
        max(upload_date) OVER (PARTITION BY upload_date::date),
        upload_date,
        day_name,
        rows_added,
        row_count_delta,
        days_since_last_update
    FROM test101 ORDER BY 1
)
SELECT
    upload_date,
    day_name,
    rows_added,
    row_count_delta,
    days_since_last_update
FROM
    cte
WHERE
    max = upload_date;

CodePudding user response:

This is more verbose but I find it easier to read and build:

SELECT *
FROM mytable t1
JOIN (
    SELECT CAST(upload_date AS DATE) day_date, MAX(upload_date) max_date 
    FROM mytable 
    GROUP BY day_date) t2 
ON t1.upload_date = t2.max_date AND
    CAST(upload_date AS DATE) = t2.day_date;

I don't know about perfomance right away, but I suspect the window function is worse because you will need to order by, which is usually a slow operation unless your table already have an index for doing so.

  • Related