Consider the following table and data set.
CREATE TABLE MY_TABLE(
"Group" TEXT,
Callees INTEGER,
Callers INTEGER
);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 1, 505);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 2, 172);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 3, 33);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 4, 20);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 5, 5);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 6, 5);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 7, 3);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 8, 4);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 9, 3);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 10, 2);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 11, 1);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 13, 1);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 14, 1);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 16, 1);
INSERT INTO MY_TABLE("Group", Callees, Callers) VALUES ('Group1', 22, 2);
I want to create a cumulative distribution that shows cumulative % of total Callers based on the values in the Callees column, for the group. For example, for the value Callees = 1, I would like the first output row:
Group1 1 505 0.6662
Group1 2 172 0.8931
Group1 3 33 0.9366
...
Group1 22 1 1.0000
Where 0.6662 is 505 divided by total number of Callers, 758.Then the following rows should would contain the cumulative percentage as we go down the Callees column values.
I tried:
select "Group", Callers,
cume_dist() over ( order by Callees desc) as CumulativeP
from MY_TABLE
order by "Group", Callees
But it does not give me the output I want. How can I make this happen?
I am using SQLite 3.3x
CodePudding user response:
Solution for your problem:
select "Group", Callees,mt.Callers,
ROUND((sum(callers) over(Partition By "Group" order by Callees) * 1.00) / (sum(callers) over (Partition By "Group" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)),4) CumulativeP
from MY_TABLE mt
order by "Group", Callees;
Working Example: db fiddle link
Explanation:
Window
function contains Window Frame
to define the number of rows or the range that should be included by the function for calculation.
For example, to calculate cumulative sum, the sum should be applied from the first row till the current row for each row.
Now to calculate the sum over the whole table or for all rows (Within the defined partition), the range should be from first row to the last row of the table(or the partition if defined) and this can be defined with UNBOUNDED PRECEDING
(which tells to take the rows from the first row of the table or partition) and 'UNBOUNDED FOLLOWING' (which tell to take the rows till the last one in the table or partition).
Syntaxes of the Window Frame
are:
( GROUPS | RANGE | ROWS ) start
(GROUPS | RANGE | ROWS ) BETWEEN start AND end
Where start can be:
UNBOUNDED PRECEDING
expr PRECEDING
CURRENT ROW
expr FOLLOWING
and end can be:
expr PRECEDING
CURRENT ROW
UNBOUNDED FOLLOWING
expr FOLLOWING
For more info on Window Function
please see the below link:
Official Document
For better understanding about Window Frame
, follow below link:
Sqlite window frame