run select *, return_date - pickup_date as total from order_history order by id;
return the following result:
id pickup_date return_date date_ranges total
1 2020-03-01 2020-03-12 [2020-03-01,2020-04-01) 11
2 2020-03-01 2020-03-22 [2020-03-01,2020-04-01) 21
3 2020-03-11 2020-03-22 [2020-03-01,2020-04-01) 11
4 2020-02-11 2020-03-22 [2020-02-01,2020-03-01) 40
5 2020-01-01 2020-01-22 [2020-01-01,2020-02-01) 21
6 2020-01-01 2020-04-22 [2020-01-01,2020-02-01) 112
for example:
--id=6. total = 112. 112 = 22 31 29 30
--therefore toal should split: jan2020: 30, feb2020:29, march2020: 31, 2020apr:22.
first split then aggregate. aggregate based over range min(pickup_date), max(return_date)
then tochar cast to 'YYYY-MM'; In this case the aggregate should group by 2020-01
, 2020-02
, 2020-03
,2020-04
.
but if pickup_date
in the same month with return_date
then compuate return_date - pickup_date
then aggregate/sum the result, group by to_char(pickup_date,'YYYY-MM')
CodePudding user response:
Not quite perfect, but a sketch:
SELECT
id,
ARRAY_AGG( -- 4
LEAST(return_date, gs interval '1 month - 1 day') -- 2
- GREATEST(pickup_date, gs) -- 3
interval '1 day'
)
FROM order_history,
generate_series( -- 1
date_trunc('month', pickup_date),
date_trunc('month', return_date),
interval '1 month'
) gs
GROUP BY id
- Generate a set of months that are included in the given date range
- a) Calculate the last day of the month (first of a month 1 month is first of the next month; minus 1 day is last of the current month). This is the max day for returning in this month. b) if it happened earlier, then take the earler day (
LEAST()
) - Same for pickup day. Afterwards calculate the difference of the days kept in one month.
- Aggregate the values for one month.
Open questions / Potential enhancements:
- You said:
jan2020: 30, feb2020:29, march2020: 31, 2020apr:22.
Why is JAN given with 30 days? On the other hand you count APR 22 days (1st - 22nd). Following the logic, JAN should be 31, shouldn't it?
If you don't want to count the very first day, then you can change (3.) to
GREATEST(pickup_date interval '1 day', gs)
- There's a problem with day saving time in March (30 days, 23 hours instead of 31 days). This can be faced by some rounding, for example.