Home > Mobile >  Get MIN aggregate field from table for the past X days from updated_at timestamp column
Get MIN aggregate field from table for the past X days from updated_at timestamp column

Time:06-14

Given the following table structure and sample data:

CREATE TABLE IF NOT EXISTS `records` (
  `id` int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `external_id` int unsigned NOT NULL,
  `sub_id` int unsigned DEFAULT 0,
  `amount` bigint unsigned NOT NULL,
  `updated_at` TIMESTAMP DEFAULT CURRENT_TIMESTAMP
) DEFAULT CHARSET=utf8;

INSERT INTO `records` (`external_id`, `sub_id`,`amount`, `updated_at`) VALUES 
(1,    0, 160, '2022-01-13 16:00:00'),
(1, 1001, 150, '2022-01-13 16:40:00'),
(1, 1002, 170, '2022-06-13 16:40:00'),
(1, 1003, 170, '2022-06-13 16:40:00');

I'm trying to get the MIN value for amount for the past X (assume 30 days), for a given external_id, using the timestamp field updated_at, with the following constraints:

  • If there are no records for the past 30 days (changes), the latest record is still a valid one,
  • Each new record for a given external_id would "cancel and replace" the previous record,
  • If there are both records with sub_id = 0 and sub_id <> 0, for the same given external_id the records with sub_id <> 0 would prevail.

So, the query for the above data should return 150.

A fiddle and what I have tried at: https://dbfiddle.uk/?rdbms=mariadb_10.6&fiddle=e4ddd6b55dbccf607633c1cf7d9cd4ef

Extra Information (Later edit)

To give you a better picture of the whole idea here: Each time an update is made on the amount field, there is a new record created in the records table (to create a history log).

My task is to query the MIN amount for the past 30 days. Some external_id records also have that sub_id. Whether there is or isn't a sub_id, the record for the external_id would be created. Amounts with sub_id are usually bigger (something extra gets added to the amount for product_id).

CodePudding user response:

At the moment it's not clear what should happen if there are multiple external_id values. You seem to want only one row returned? (If this is not the case, please improve the example to include the desired results when there are multiple different external_id values.)

If you do just want one value returned, you could simply ORDER BY <something> LIMIT 1

I, however, am going to assume you want just one value per external_id.

WITH
  filtered_sorted
AS
(
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY external_id
                           ORDER BY CASE WHEN sub_id <> 0 THEN 0 ELSE 1 END,
                                    amount
                      )
                        AS rn
  FROM
    records  AS r
  WHERE
    updated_at >= (SELECT COALESCE(MAX(updated_at), DATE(NOW()) - INTERVAL 30 DAY)
                     FROM records
                    WHERE updated_at  <= DATE(NOW()) - INTERVAL 30 DAY
                      AND external_id  = r.external_id
                  )
)
SELECT
  *
FROM
  filtered_sorted
WHERE
  rn = 1

This is based on the notion that the most recent row on or before the start of the day 30 days ago is still valid and should be included in consideration for the lowest amount.

  • For each external_id...
    • Ignore all records updated_at after the start of 30 days ago
    • Return the most recent remaining updated_at (which could be exactly the start of the day 30 days ago)
    • If no such row is found, return the start of the day 30 days ago
    • We will consider all rows for that updated_at onwards

Then the ROW_NUMBER() prefers rows with a non-zero sub_id and then rows with the lowest amount.

  • For the rows being considered, after the WHERE clause described above
    • Assign each row a row number
    • Each external_id should have it's own individual sequence of row numbers (achieved with PARTITION BY external_id`)
    • Rows with sub_id <> 0 should come before before any rows with sub_id = 0 (achieved by ORDER BY CASE WHEN sub_id <> 0 THEN 0 ELSE 1 END)
    • Rows with lower amount values should come first (achieved with `ORDER BY amount.)

Then, just return rows where the assigned row number is 1

  • Partitioned by external_id
  • Filtered by updated_at
  • Sorted by sub_id <> 0, amount

(One row per external_id)


Demo on dbfiddle.uk

  • Related