I'm doing SQL exercises at WindowFunctions.com. The task is to cumulatively sum cats' weights from heaviest to lightest. The table looks like:
name | weight | ... |
---|---|---|
Ashes | 4.5 | ... |
Tigger | 3.8 | ... |
Smokey | 6.1 | ... |
... |
My first guess was
SELECT
name,
SUM(weight) OVER (ORDER BY weight DESC) AS running_total_weight
FROM cats
However, it produces wrong result which sums first two cats into first cumulative sum.
name | running_total_weight |
---|---|
Smokey | 12.2 |
Oscar | 12.2 |
... | |
Tigger | 61.1 |
If I, however, run the same query ascending, the summation is done correctly, and Tiger has weight 3.8 and not 8.0.
The site suggests solution: rewrite OVER ()
clause as OVER (ORDER BY weight DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
. Why does it works? Why is it needed when summation descending and not needed when descending?
CodePudding user response:
The point is not the unbounded preceding
.
The point is the default framing behavior when over (order by...)
is used, which is range
(and not rows
).
select *
--rows
,sum(c) over(order by c rows unbounded preceding) rows1
,sum(c) over(order by c rows between unbounded preceding and current row) rows2
--range
,sum(c) over(order by c) range1
,sum(c) over(order by c range unbounded preceding) range2
,sum(c) over(order by c range between unbounded preceding and current row) range3
from (values (10),(20),(20),(20),(30)) t(c)
c | rows1 | rows2 | range1 | range2 | range3 |
---|---|---|---|---|---|
10 | 10 | 10 | 10 | 10 | 10 |
20 | 30 | 30 | 70 | 70 | 70 |
20 | 50 | 50 | 70 | 70 | 70 |
20 | 70 | 70 | 70 | 70 | 70 |
30 | 100 | 100 | 100 | 100 | 100 |