I have the first three fields of the following table. I want to compute the number of consecutive days an amount was higher than 0 ("days" field).
key | date | amount | days |
---|---|---|---|
1 | 2023-01-23 | 0 | 0 |
1 | 2023-01-22 | 10 | 2 |
1 | 2023-01-21 | 20 | 1 |
1 | 2023-01-20 | 0 | 0 |
1 | 2023-01-19 | 0 | 0 |
1 | 2023-01-18 | 0 | 0 |
1 | 2023-01-17 | 3 | 1 |
1 | 2023-01-16 | 0 | 0 |
I have tried some window functions based on this link, but the result does not accumulate over consecutive days and does not reset to 1 when the previous amount is 0.
My code:
case when f.amount > 0
then SUM ( DATE_PART('day',
date::text::timestamp - previou_bus_date::text::timestamp )
) OVER (partition by f.key
ORDER BY f.date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
else 0
end as days
CodePudding user response:
Another option is the difference-between-two-row_numbers approach, as follows:
select key, date, amount,
sum(case when amount > 0 then 1 else 0 end) over
(partition by key, grp, case when amount > 0 then 1 else 0 end order by date) days
from
(
select *,
row_number() over (partition by key order by date) -
row_number() over (partition by key, case when amount > 0 then 1 else 0 end order by date) grp
from table_name
) T
order by date desc
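To see why the difference of the two row numbers identifies each streak, here is a minimal Python sketch that mimics the query on the sample rows from the question. The function name `consecutive_days` and the tuple layout are made up for illustration; it handles a single key and assumes the rows are already sorted by ascending date.

```python
# Sample rows from the question for key 1: (date, amount), ascending by date.
rows = [
    ("2023-01-16", 0),
    ("2023-01-17", 3),
    ("2023-01-18", 0),
    ("2023-01-19", 0),
    ("2023-01-20", 0),
    ("2023-01-21", 20),
    ("2023-01-22", 10),
    ("2023-01-23", 0),
]


def consecutive_days(rows):
    """Return (date, amount, grp, days) tuples, mirroring the SQL above."""
    seen_in_class = {True: 0, False: 0}  # row_number() within each (amount > 0) class
    running = {}                         # running sum per (grp, class) partition
    result = []
    for overall_rn, (date, amount) in enumerate(rows, start=1):
        positive = amount > 0
        seen_in_class[positive] += 1
        # Difference of the two row numbers: constant inside one unbroken streak.
        grp = overall_rn - seen_in_class[positive]
        partition = (grp, positive)
        running[partition] = running.get(partition, 0) + (1 if positive else 0)
        result.append((date, amount, grp, running[partition]))
    return result


for row in consecutive_days(rows):
    print(row)
```

On this sample the `grp` values come out as 0, 1, 1, 1, 1, 4, 4, 3. The zero rows from the 18th to the 20th share `grp` 1 with the positive row on the 17th, which is why the outer query also partitions by the `amount > 0` flag.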
CodePudding user response:
This is a gaps-and-islands problem, since you need to count consecutive rows with non-zero amounts.
You can reliably solve this problem in 3 steps:
- flag each change of partition, using 1 when the current amount > 0 and the previous amount = 0
- compute a running sum (with SUM) over the flags generated at step 1, to create the partitions within which consecutive values are counted
- compute a ranking (with ROW_NUMBER) to number the consecutive non-zero amounts in each partition generated at step 2
WITH cte AS (
SELECT *,
CASE WHEN amount > 0
AND LAG(amount) OVER(PARTITION BY key_ ORDER BY date_) = 0
THEN 1
END AS change_part
FROM tab
), cte2 AS (
SELECT *,
SUM(change_part) OVER(PARTITION BY key_ ORDER BY date_) AS parts
FROM cte
)
SELECT key_, date_, amount,
CASE WHEN amount > 0
THEN ROW_NUMBER() OVER(PARTITION BY key_, parts ORDER BY date_)
ELSE 0
END AS days
FROM cte2
ORDER BY date_ DESC
Check the demo here.
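The three steps can also be traced outside the database. The following is a rough Python sketch, not the query itself: `days_three_steps` is a hypothetical helper that takes one key's amounts in ascending date order, and `None` stands in for SQL NULL.

```python
def days_three_steps(amounts):
    """Trace flag -> running sum -> row number for one key's amounts."""
    days = []
    parts = None       # running SUM of the flags; None until the first flag
    rank_in_part = {}  # ROW_NUMBER() counter per partition
    prev = None        # LAG(amount); None for the first row
    for amount in amounts:
        # Step 1: flag a new partition when a positive amount follows a zero.
        flag = 1 if (amount > 0 and prev == 0) else None
        # Step 2: running sum of the flags (NULL flags are skipped).
        if flag is not None:
            parts = (parts or 0) + flag
        # Step 3: rank the row inside its partition; zero amounts report 0.
        rank_in_part[parts] = rank_in_part.get(parts, 0) + 1
        days.append(rank_in_part[parts] if amount > 0 else 0)
        prev = amount
    return days


# Amounts from the question, 2023-01-16 through 2023-01-23.
print(days_three_steps([0, 3, 0, 0, 0, 20, 10, 0]))  # [0, 1, 0, 0, 0, 1, 2, 0]
```

Zero rows stay in the partition of the streak before them and still consume a rank, which is harmless because the final CASE reports 0 for them and the next positive row opens a new partition.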
Note: This is not the most performant solution, but I'm leaving it as the basis for the next part (missing consecutive dates). @Ahmed's answer is likely to perform better in this case.
If your data ever has holes in the dates (missing records, so that consecutive rows are no longer consecutive days), you should add a further condition in Step 1, where you create the flag for changing partition.
The partition should change:
- either when the current amount > 0 and the previous amount = 0
- or when the current date is more than 1 day after the previous date (consecutive rows are not consecutive in time)
WITH cte AS (
SELECT *,
CASE WHEN (amount > 0
AND LAG(amount) OVER(PARTITION BY key_ ORDER BY date_) = 0)
OR date_ > LAG(date_) OVER(PARTITION BY key_ ORDER BY date_)
+ INTERVAL '1 day'
THEN 1
END AS change_part
FROM tab
), cte2 AS (
...
Check the demo here.
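As a quick sanity check of the extra date condition, here is a small Python sketch under the same assumptions as the query (one key, rows sorted by ascending date). The helper name `days_with_date_gaps` and the sample rows are invented for illustration; they are not taken from the question.

```python
from datetime import date, timedelta


def days_with_date_gaps(rows):
    """rows: (date, amount) pairs for one key, ascending by date."""
    days = []
    parts = 0          # running sum of the change flags
    rank_in_part = {}  # row number inside each partition
    prev_date = prev_amount = None
    for current_date, amount in rows:
        # Original condition: a positive amount right after a zero.
        after_zero = amount > 0 and prev_amount == 0
        # Added condition: more than one day since the previous row.
        after_gap = (prev_date is not None
                     and current_date > prev_date + timedelta(days=1))
        if after_zero or after_gap:
            parts += 1
        rank_in_part[parts] = rank_in_part.get(parts, 0) + 1
        days.append(rank_in_part[parts] if amount > 0 else 0)
        prev_date, prev_amount = current_date, amount
    return days


# Hypothetical data: 2023-01-23 is missing, so the streak restarts on the 24th.
rows = [
    (date(2023, 1, 21), 20),
    (date(2023, 1, 22), 10),
    (date(2023, 1, 24), 5),
    (date(2023, 1, 25), 7),
]
print(days_with_date_gaps(rows))  # [1, 2, 1, 2]
```

Without the date check the same rows would be numbered 1 through 4, because no zero amount ever separates them.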
CodePudding user response:
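with data as (
select *,
-- running count of zero-amount rows: each zero starts a new group
sum(case when amount = 0 then 1 end)
over (partition by key order by date) as grp
from T
)
select key, date, amount,
-- count only positive rows, so the zero row that opens each group is excluded
count(case when amount > 0 then 1 end)
over (partition by key, grp order by date) as days,
-- row_number() - 1 gives the same result when each key's first row has amount 0
row_number() over (partition by key, grp order by date) - 1 as days2
from data;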