Home > Blockchain >  Running total percentage per record over total sum. Presto/Athena/SQL
Running total percentage per record over total sum. Presto/Athena/SQL

Time:10-20

I'm trying to calculate the cumulative percentage for each row Presto/Athena. Eg: If I have data like this

AccountID | UserID | HolidaysTaken
ABC       | A      | 4
ABC       | B      | 6
ABC       | B      | 3
ABC       | K      | 2
ABC       | K      | 3
ABC       | X      | 1

Now after running this query I get the below result.

SELECT AccountID, UserID, sum(HolidaysTaken) AS HolidaysTaken FROM table
WHERE AccountID = 'ABC'
GROUP BY AccountID, UserID
ORDER BY HolidaysTaken DESC

AccountID | UserID | HolidaysTaken 
ABC       | B      | 9             
ABC       | K      | 5             
ABC       | A      | 4             
ABC       | X      | 1 

Total holiday taken by all users = 19         

But I want to add 2 more columns. EachUserPercentage: The Percentage of holidays taken by each user from total holidays. CumulativePercentage: Cumulative sum of EachUserPercentage. This one I can do using this post

AccountID | UserID | HolidaysTaken | EachUserPercentage | CumulativePercentage
ABC       | B      | 9             | 47.36              | 47.36  
ABC       | K      | 5             | 26.31              | 73.67
ABC       | A      | 4             | 21.05              | 94.72
ABC       | X      | 1             | 5.26               | 100

I tried diff window function percent_rank(), cume_dist() and ntile() but can't get proper EachUserPercentage working.

CodePudding user response:

You can use window functions to find the percentage for an AccountID, then another window function to sum this over rows unbound preceding ordered by the total holidays taken per UserID. Something like as follows:

WITH totalUser
AS (SELECT   AccountID
            ,UserID
            ,SUM(HolidaysTaken) AS HolidaysTaken
            ,CAST(100.0 * SUM(HolidaysTaken) / SUM(SUM(HolidaysTaken)) OVER (PARTITION BY AccountID) AS NUMERIC(5, 2)) AS EachUserPercentage
    FROM     table
    WHERE    AccountID = 'ABC'
    GROUP BY AccountID
            ,UserID)
SELECT   totalUser.AccountID
        ,totalUser.UserID
        ,totalUser.HolidaysTaken
        ,totalUser.EachUserPercentage
        ,SUM(totalUser.EachUserPercentage) OVER (PARTITION BY totalUser.AccountID
                                                 ORDER BY totalUser.EachUserPercentage DESC
                                                 ROWS UNBOUNDED PRECEDING)
FROM     totalUser
ORDER BY totalUser.HolidaysTaken DESC;

CodePudding user response:

Hi you can get simply EachUserPercentage from below query if your group is on AccountID(considered).

SELECT table.AccountID, UserID, sum(table.HolidaysTaken) AS HolidaysTaken,
MAX(CAST(all_sum.HolidaysTaken AS NUMERIC(12,2))),
(SUM(CAST(table.HolidaysTaken AS NUMERIC(12,2)))/MAX(CAST(all_sum.HolidaysTaken AS NUMERIC(12,2))))*100 EachUserPercentage
 FROM table
LEFT OUTER JOIN (SELECT SUM(HolidaysTaken) AS HolidaysTaken,AccountID FROM table GROUP BY AccountID)all_sum ON all_sum.AccountID= table.AccountID
WHERE table.AccountID = 'ABC'
GROUP BY table.AccountID, UserID
ORDER BY HolidaysTaken DESC

It works at my end.

  • Related