I have the following table in Teradata:
group1 | group2 |
---|---|
A | A |
A | B |
A | B |
A | C |
A | C |
A | C |
A | C |
B | C |
B | A |
B | A |
B | A |
B | D |
CREATE VOLATILE TABLE student
(
group1 varchar(10),
group2 varchar(10)
)
NO PRIMARY INDEX
ON COMMIT PRESERVE ROWS;
INSERT INTO student values ('A','A');
INSERT INTO student values ('A','B');
INSERT INTO student values ('A','B');
INSERT INTO student values ('A','C');
INSERT INTO student values ('A','C');
INSERT INTO student values ('A','C');
INSERT INTO student values ('A','C');
INSERT INTO student values ('B','A');
INSERT INTO student values ('B','A');
INSERT INTO student values ('B','A');
INSERT INTO student values ('B','A');
INSERT INTO student values ('B','D');
I am trying to get a table as follows, where the percentage column should be ordered ascending before cumulative sum:
group1 | group2 | count_both_groups | group1_count | percentage | cumsum |
---|---|---|---|---|---|
A | A | 1 | 7 | 0.143 | 0.143 |
A | B | 2 | 7 | 0.286 | 0.429 |
A | C | 4 | 7 | 0.571 | 1.000 |
B | D | 1 | 5 | 0.200 | 0.200 |
B | A | 4 | 5 | 0.800 | 1.000 |
I can get all columns right expect the last one with:
SELECT
group1,
group2,
COUNT (*) count_both_groups,
SUM (count_both_groups) OVER (PARTITION BY group1) AS group1_count,
(0.000 count_both_groups)/group1_count as percentage
FROM student
GROUP BY group1,group2
order BY group1,percentage;
To add last column as well, I try:
SELECT
group1,
group2,
COUNT (*) count_both_groups,
SUM (count_both_groups) OVER (PARTITION BY group1) AS group1_count,
(0.000 count_both_groups)/group1_count as percentage,
sum (percentage) over (partition by group1 order by count_both_groups asc ROWS UNBOUNDED PRECEDING) AS cumsum
FROM student
GROUP BY group1,group2;
I get the error:
Ordered Analytical Functions can not be nested.
CodePudding user response:
First group by group1
and group2
to get the column count_both_groups
and use SUM()
window function to get group1_count
and by dividing them get percentage
.
Finally use SUM()
window function again to get the cumulative sum of percentage
:
WITH cte AS (
SELECT *,
COUNT(*) count_both_groups,
SUM(COUNT(*)) OVER (PARTITION BY group1) group1_count,
1.0 * COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY group1) percentage
FROM student
GROUP BY group1, group2
)
SELECT *, SUM(percentage) OVER (PARTITION BY group1 ORDER BY percentage) cumsum
FROM cte;
See the demo (works in MySql and other databases).
CodePudding user response:
You can do this in a single pass like so, I think:
SELECT
group1,
group2,
COUNT (*) count_both_groups,
SUM (count_both_groups) OVER (PARTITION BY group1) AS group1_count,
(0.000 count_both_groups)/group1_count as percentage,
sum (0.000 count_both_groups) over (partition by group1 order by count_both_groups asc ROWS UNBOUNDED PRECEDING)/group1_count cum_sum
FROM student
GROUP BY group1,group2
order BY group1,cum_sum;