Home > OS >  SQL query to aggregate amount from product transaction table
SQL query to aggregate amount from product transaction table

Time:02-06

Need to aggregate latest product prices of all products from batchTransaction Table, relevant Columns:

  • id - Unique
  • productId - Not unique
  • transactionValue - Value of that transaction
  • transactionDate - date of that transaction

A product can have multiple transactions but only latest needs to be considered for aggregation. Need to aggregate total transactionValue across plant at a provided date, for all products.

SELECT SUM(transactionQuantity)
FROM batchTransaction
WHERE (id, dateCreated) IN (
    SELECT id, MAX(dateCreated)
    FROM batchTransaction
    WHERE AND transactionDate < 1675189800000
    GROUP BY productId
);

Above query would have worked, but it gives error - this is incompatible with sql_mode=only_full_group_by

CodePudding user response:

The only way this makes sense with your description of getting the latest transaction per product is to group by productId in the subquery, and use productId in the result. Then compare that to the productid in the outer query.

SELECT SUM(transactionQuantity)
FROM batchTransaction
WHERE (productId, dateCreated) IN (
    SELECT productId, MAX(dateCreated)
    FROM batchTransaction
    WHERE transactionDate < 1675189800000
    GROUP BY productId
);

I also removed a superfluous AND keyword from your subquery.

I assume from this that transactionDate is stored as a BIGINT representing the UNIX timestamp in milliseconds, not as a DATETIME type.

A more modern way to write this sort of query is to use a window function ROW_NUMBER() and select only those that are the first (latest) row in each partition by productId.

SELECT SUM(transactionQuantity)
FROM (
    SELECT transactionQuantity, 
      ROW_NUMBER() OVER (PARTITION BY productId ORDER BY dateCreated DESC) AS rownum
    FROM batchTransaction
    WHERE transactionDate < 1675189800000
) AS t
WHERE t.rownum = 1;

This syntax requires MySQL 8.0 for the window function.

CodePudding user response:

Avoid IN clause as it causes performance issue when run on a larger dataset. Join would be a better option in such scenarios.

The id in the IN clause, is not guaranteed that it is of latest transaction because when you do group by product id, records with same product id are grouped and the order of those records is not maintained as you are assuming.

Query to achieve the right results

select sum(t.transactionQuantity)
from
(select
  cast(substring_index(group_concat(
   transactionQuantity
   order by transactionDate desc separator ','
  ), ',', 1) as unsigned) as transactionQuantity
  from
  batchTransaction
  group by productId
) as t;
  • Related