I need to sum values in intersect of range dates.
sample of source data
person | item | start_date | end_date | value |
---|---|---|---|---|
a | apple | 08.03.2018 | 29.03.2018 | 3 |
a | apple | 01.01.2019 | 08.08.2021 | 2 |
a | apple | 01.01.2019 | 09.10.2021 | 5 |
a | pen | 10.10.2021 | 30.10.2021 | 2 |
a | cup | 08.03.2018 | 20.03.2018 | 8 |
a | cup | 15.03.2018 | 20.03.2019 | 2 |
b | pen | 10.10.2021 | 30.10.2021 | 2 |
b | pen | 10.10.2021 | 30.10.2021 | 6 |
b | orange | 10.11.2021 | 10.11.2022 | 3 |
b | orange | 20.11.2021 | 20.12.2021 | 2 |
expected result
person | item | start_date | end_date | value |
---|---|---|---|---|
a | apple | 08.03.2018 | 29.03.2018 | 3 |
a | apple | 01.01.2019 | 08.08.2021 | 7 |
a | apple | 09.08.2021 | 09.10.2021 | 5 |
a | pen | 10.10.2021 | 30.10.2021 | 2 |
a | cup | 08.03.2018 | 14.03.2018 | 8 |
a | cup | 15.03.2018 | 20.03.2018 | 10 |
a | cup | 21.03.2018 | 20.03.2019 | 2 |
b | pen | 10.10.2021 | 30.10.2021 | 8 |
b | orange | 10.11.2021 | 19.11.2021 | 3 |
b | orange | 20.11.2021 | 20.12.2021 | 5 |
b | orange | 21.12.2021 | 10.11.2022 | 3 |
I use something code like this, but it is to simple, and results are not good
select
person
,item
,Min([start_date]) as [start_date]
,Max([end_date]) as [end_date]
,Sum([value]) as [value]
FROM table
Group by person, item
I tried to use LAG() function, but i'm lost
CodePudding user response:
I have no access to Synapse , but assuming it's compatibile with SQL server...
Internal query build data ranges, creating additional dates for overlapping periods if needed. Main query just sum values.
select person, item, range_from, range_to,
(select sum(value) from test
where person = r.person
and item = r.item
and range_from between start_date and end_date) value
from (
select
be,
person,
item,
date range_from,
lead(date,1) over(partition by person, item order by date,be) range_to
from (
select 1 be, person, item, start_date date from test
union
select 2, person, item, end_date from test
union
select 2, person, item, dateadd(day,-1,start_date) from test a
where exists (select * from test where a.person = person and a.item = item and a.start_date > start_date and a.start_date < end_date)
union
select 1, person, item, dateadd(day,1,end_date) from test b
where exists (select * from test where b.person = person and b.item = item and b.end_date > start_date and b.end_date < end_date)
) k
) r where r.be = 1 order by r.person, r.item, r.range_from
column be
contains:
- 1 - for period start
- 2 - for period end