I have a table that is something like this:
amount | status |
---|---|
10 | A |
10 | B |
15 | B |
10 | C |
12 | D |
20 | A |
25 | B |
17 | C |
19 | D |
The amounts have no restriction (other than being a number). And status lines can have duplicates (the 'B' in the example).
What I want is to sum over everything between 'A' status. So the result should be
sum |
---|
57 |
81 |
I need this for ansi-sql (Spark)
CodePudding user response:
Once you will have decided about your "order" column, one possible solution:
with data(ord, amount, status) as (
select 1, 10, 'A' from dual union all
select 2, 10, 'B' from dual union all
select 3, 15, 'B' from dual union all
select 4, 10, 'C' from dual union all
select 5, 12, 'D' from dual union all
select 6, 20, 'A' from dual union all
select 7, 25, 'B' from dual union all
select 8, 17, 'C' from dual union all
select 9, 19, 'D' from dual
),
pdata as (
select d.*, case status when 'A' then lv else last_value(lv) ignore nulls over(order by ord) end as llv
from (
select d.*,
nvl(lag(ord) over(order by ord),ord-1) as p_ord,
nvl(last_value(case status when 'A' then ord end) over(partition by status order by ord
RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
), case status when 'A' then ord - 1
end) as lv
, nvl(lead(ord) over(order by ord),ord 1) as n_ord
from data d
) d
)
select sum(amount) from pdata
where llv is not null
group by llv
;
sum(amount)
57
81
Note that replacing "when 'A' then lv else" by "when 'A' then null else" will give you the sum of rows strictly between the 2 'A' (not including the first one).
CodePudding user response:
To sum elements between A
, there must have an order on managed datas. If there is, such as numeric ID, you can say it and I will use it to suggest a solution.