Home > other >  How can I partition by group that falls within a time range?
How can I partition by group that falls within a time range?

Time:11-09

I have the following table showing when customers bought a certain product. The data I have is CustomerID, Amount, Dat. I am trying to create the column ProductsIn30Days, which represents how many products a customer bought in the range Dat-30 days inclusive the current day.

For example, ProductsIn30Days for CustomerID 1 on Dat 25.3.2020 is 7, since the customer bought 2 products on 25.3.2020 and 5 more products on 24.3.2020, which falls within 30 days before 25.3.2020.

CustomerID Amount Dat ProductsIn30Days
1 1 23.3.2018 1
1 2 24.3.2020 2
1 3 24.3.2020 5
1 2 25.3.2020 7
1 2 24.5.2020 2
1 1 15.6.2020 3
2 7 24.3.2017 7
2 2 24.3.2020 2

I tried something like this with no success, since the partition only works on a single date rather than on a range like I would need:

select CustomerID, Amount, Dat,
sum(Amount) over (partition by CustomerID, Dat-30)
from table

Thank you for help.

CodePudding user response:

You can use an analytic SUM function with a range window:

SELECT t.*,
       SUM(Amount) OVER (
         PARTITION BY CustomerID
         ORDER BY Dat
         RANGE BETWEEN INTERVAL '30' DAY PRECEDING AND CURRENT ROW
       ) AS ProductsIn30Days
FROM   table_name t;

Which, for the sample data:

CREATE TABLE table_name (CustomerID, Amount, Dat) AS
SELECT 1, 1, DATE '2018-03-23' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-03-24' FROM DUAL UNION ALL
SELECT 1, 3, DATE '2020-03-24' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-03-25' FROM DUAL UNION ALL
SELECT 1, 2, DATE '2020-05-24' FROM DUAL UNION ALL
SELECT 1, 1, DATE '2020-06-15' FROM DUAL UNION ALL
SELECT 2, 7, DATE '2017-03-24' FROM DUAL UNION ALL
SELECT 2, 2, DATE '2020-03-24' FROM DUAL;

Outputs:

CUSTOMERID AMOUNT DAT PRODUCTSIN30DAYS
1 1 2018-03-23 00:00:00 1
1 2 2020-03-24 00:00:00 5
1 3 2020-03-24 00:00:00 5
1 2 2020-03-25 00:00:00 7
1 2 2020-05-24 00:00:00 2
1 1 2020-06-15 00:00:00 3
2 7 2017-03-24 00:00:00 7
2 2 2020-03-24 00:00:00 2

Note: If you have values on the same date then they will be tied in the order and always aggregated together (i.e. rows 2 & 3). If you want them to be aggregated separately then you need to order by something else to break the ties but that would not work with a RANGE window.

db<>fiddle here

  • Related