Below you will find my table DDL, inserts and query. I'm simply trying to calculate a percentage based on my_budget
over total_budget
. For all intents and purpose total budget is a static amount so it will not differentiate from row to row. The issue however is that in certain situations it appears that the total percent calculation is greater than 100%.
If you look at my results I end up with 100.01. Can someone give me a pointer how to prevent that from happening. I feel like I'm missing something simple on this one. I really don't care where the -.01 gets reduced from my percent calculation to get an even 100%.
-------------------------------------------------------
-- DDL for Table PLAY_TABLE
--------------------------------------------------------
CREATE TABLE PLAY_TABLE
(MY_BUDGET NUMBER(11,2),
VCOAS VARCHAR2(1 CHAR),
VFUND VARCHAR2(6 CHAR),
VORGN VARCHAR2(6 CHAR),
VACCT VARCHAR2(6 CHAR),
TOTAL_BUDGET NUMBER(11,2)
)
REM INSERTING into PLAY_TABLE
SET DEFINE OFF;
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (535.5,'D','110001','3013','2041',101745);
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (4819.5,'D','110001','3304','2041',101745);
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (96390,'D','110001','3304','2211',101745);
SELECT round((my_budget / total_budget) * 100, 2) calculated_budget,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget
FROM play_table
--results--
0.53 D 110001 3013 2041 535.5 101745
4.74 D 110001 3304 2041 4819.5 101745
94.74 D 110001 3304 2211 96390 101745
CodePudding user response:
Rather than rounding, you can truncate the values and you will get a total that is less than 100 and the amount you are out from 100 can be considered the "excess" that you need to distribute. After that, you can find the "difference" between the truncated value and the actual percentage and then spread that excess, in units of the smallest scale you are using (which for your example when you are rounding to 2 decimal places would be 0.01 or 1e-2) amongst the rows with the highest "difference".
Which would be:
SELECT calculated_budget
CASE
WHEN ROW_NUMBER() OVER (ORDER BY difference DESC) <= excess * 1e2
THEN 1e-2
ELSE 0
END AS calculated_budget,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget
FROM (
SELECT TRUNC(my_budget / total_budget * 100 * 1e2) / 1e2 AS calculated_budget,
my_budget / total_budget * 100 - TRUNC(my_budget / total_budget * 100 * 1e2) / 1e2
AS difference,
100 - SUM(TRUNC(my_budget / total_budget * 100 * 1e2) / 1e2) OVER () AS excess,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget
FROM play_table
);
Note: 100
represents converting decimal to a percentage and scientific notation (i.e. 1e2
and 1e-2
) is used when the numbers are dealing with the precision you are rounding to (i.e. 2 decimal places).
Or, without the intermediate variables:
SELECT TRUNC(my_budget / total_budget * 100 * 1e2) / 1e2
CASE
WHEN ROW_NUMBER() OVER (
ORDER BY my_budget / total_budget * 100
- TRUNC(my_budget / total_budget * 100 * 1e2) / 1e2 DESC
) <= (100 - SUM(TRUNC(my_budget / total_budget * 100 * 1e2) / 1e2) OVER ()) * 1e2
THEN 1e-2
ELSE 0
END AS calculated_budget,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget
FROM play_table;
Which, for the sample data, outputs:
CALCULATED_BUDGET | TOTAL_BUDGET | VCOAS | VFUND | VORGN | VACCT | MY_BUDGET |
---|---|---|---|---|---|---|
4.74 | 101745 | D | 110001 | 3304 | 2041 | 4819.5 |
94.74 | 101745 | D | 110001 | 3304 | 2211 | 96390 |
.52 | 101745 | D | 110001 | 3013 | 2041 | 535.5 |
If instead, you have the sample data:
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (1000,'D','110001','3013','2041',3000);
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (1000,'D','110001','3304','2041',3000);
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (1000,'D','110001','3304','2211',3000);
Where the values are all equal and would have 33.33% each (for a total of 99.99%), then the output will be:
CALCULATED_BUDGET | TOTAL_BUDGET | VCOAS | VFUND | VORGN | VACCT | MY_BUDGET |
---|---|---|---|---|---|---|
33.34 | 3000 | D | 110001 | 3013 | 2041 | 1000 |
33.33 | 3000 | D | 110001 | 3304 | 2211 | 1000 |
33.33 | 3000 | D | 110001 | 3304 | 2041 | 1000 |
And an extra 0.01% has been added.
CodePudding user response:
You could use window functions to calculate the running total of the percentages, and also the previous running total, for which you need to apply some order; you don't have a unique ID here so I've used the my_budget
value which works for the example (but might have issues with duplicated values):
SELECT round((my_budget / total_budget) * 100, 2) calculated_budget,
sum(round((my_budget / total_budget) * 100, 2))
over (order by my_budget) as running_total,
sum(round((my_budget / total_budget) * 100, 2))
over (order by my_budget rows between unbounded preceding and 1 preceding)
as previous_total,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget
FROM play_table
CALCULATED_BUDGET | RUNNING_TOTAL | PREVIOUS_TOTAL | TOTAL_BUDGET | VCOAS | VFUND | VORGN | VACCT | MY_BUDGET |
---|---|---|---|---|---|---|---|---|
.53 | .53 | null | 101745 | D | 110001 | 3013 | 2041 | 535.5 |
4.74 | 5.27 | .53 | 101745 | D | 110001 | 3304 | 2041 | 4819.5 |
94.74 | 100.01 | 5.27 | 101745 | D | 110001 | 3304 | 2211 | 96390 |
Then with those in a subquery (CTE or inline view) you can use a case expression to decide whether to use the actual calculated_budget
, or to subtract the previous running total from 100 to make sure the final value no longer pushes you over 100:
SELECT case
when running_total > 100
then 100 - previous_total
else calculated_budget
end as calculated_budget,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget
FROM (
SELECT round((my_budget / total_budget) * 100, 2) calculated_budget,
sum(round((my_budget / total_budget) * 100, 2))
over (order by my_budget) as running_total,
sum(round((my_budget / total_budget) * 100, 2))
over (order by my_budget rows between unbounded preceding and 1 preceding)
as previous_total,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget
FROM play_table
)
CALCULATED_BUDGET | TOTAL_BUDGET | VCOAS | VFUND | VORGN | VACCT | MY_BUDGET |
---|---|---|---|---|---|---|
.53 | 101745 | D | 110001 | 3013 | 2041 | 535.5 |
4.74 | 101745 | D | 110001 | 3304 | 2041 | 4819.5 |
94.73 | 101745 | D | 110001 | 3304 | 2211 | 96390 |
CodePudding user response:
After adding an identity to your table, as you need an order for the following code
You can calculate the running sum, and exclude or select the values that are bigger than the total_budget
but wouldn't it better to now allow inserts that give a over stepping of the budget in a trigger?
-------------------------------------------------------
-- DDL for Table PLAY_TABLE
--------------------------------------------------------
CREATE TABLE PLAY_TABLE
(ID NUMBER GENERATED by default on null as IDENTITY,
MY_BUDGET NUMBER(11,2),
VCOAS VARCHAR2(1 CHAR),
VFUND VARCHAR2(6 CHAR),
VORGN VARCHAR2(6 CHAR),
VACCT VARCHAR2(6 CHAR),
TOTAL_BUDGET NUMBER(11,2)
)
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (535.5,'D','110001','3013','2041',101745);
1 rows affected
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (4819.5,'D','110001','3304','2041',101745);
1 rows affected
Insert into PLAY_TABLE (MY_BUDGET,VCOAS,VFUND,VORGN,VACCT,TOTAL_BUDGET) values (96390,'D','110001','3304','2211',101745);
1 rows affected
WITH CTE as
(SELECT round((my_budget / total_budget) * 100, 2) calculated_budget,
total_budget,
vcoas,
vfund,
vorgn,
vacct,
my_budget,
SUM(my_budget) OVER( PARTITION BY VFUND ORDER BY ID) sum_
FROM PLAY_TABLE)
SELECT
* FROM CTE WHERE total_budget > sum_
CALCULATED_BUDGET | TOTAL_BUDGET | VCOAS | VFUND | VORGN | VACCT | MY_BUDGET | SUM_ |
---|---|---|---|---|---|---|---|
.53 | 101745 | D | 110001 | 3013 | 2041 | 535.5 | 535.5 |
4.74 | 101745 | D | 110001 | 3304 | 2041 | 4819.5 | 5355 |