I was trying to write a simple program that calculates the percentage for each value in a list/array, but I ran into a problem: the sum of the calculated percentages is sometimes slightly more or less than 100.
double[] arrayDouble = new double[5];
double totalSumm = 0;
for (int i = 0; i < arrayDouble.Length; i++)
{
Random random = new Random();
arrayDouble[i] = random.Next(0, 500);
totalSumm += arrayDouble[i];
Console.Write(arrayDouble[i].ToString().PadLeft(8) + " ");
}
Console.WriteLine($"\nTotal summ is {totalSumm}");
double[] arrayPercentage = new double[5];
double totalPercentage = 0;
for (int i = 0; i < arrayPercentage.Length; i++)
{
arrayPercentage[i] = Math.Round(((arrayDouble[i] / totalSumm) * 100), 2);
totalPercentage += arrayPercentage[i];
Console.Write(arrayPercentage[i].ToString().PadLeft(7) + "% ");
}
Math.Round(totalPercentage, 2);
Console.WriteLine($"\nTotal percentage: {totalPercentage.ToString()}%");
Sometimes you can get the following result:
# | [0] | [1] | [2] | [3] | [4] | Total |
---|---|---|---|---|---|---|
Value | 327 | 103 | 383 | 225 | 269 | 1307 |
Percentage | 25,02% | 7,88% | 29,3% | 17,21% | 20,58% | 99,99% |
What is the correct way to calculate the percentages so that this does not happen?
CodePudding user response:
What you are seeing is a so-called rounding difference. When you calculate the percentage of each item and round it afterwards, some items are rounded down and others up. The sum of the rounded percentages may happen to be 100.00%, but in many cases it will not be exactly 100.00%.
How you handle those differences depends on your requirements and on how accurate the results should be:
- As @GSerg proposed, you can simply round the result too, if it is only for display and you do not have too many users that might sum up the percentages themselves. They might consider it a bug, if the displayed total differs from the "real" total they calculated.
- In other cases, there are rules on how to handle those differences. Financial accounting is one such environment. A way to handle rounding differences is to calculate the difference between the sum of the percentages and 100% and then correct the biggest entry by that rounding difference. In your case, you'd end up with the following values in your sample:
# | [0] | [1] | [2] | [3] | [4] | Total |
---|---|---|---|---|---|---|
Value | 327 | 103 | 383 | 225 | 269 | 1307 |
Percentage | 25,02% | 7,88% | 29,31% | 17,21% | 20,58% | 100,00% |
Please note that I've added 0.01 to the item with index 2, thus reaching the expected total of 100%. Of course, one might argue that changing the values is not the correct way to go, but - as you observed - the difference is very small and the problem cannot be avoided when rounding numbers, so it has to be solved in a deterministic way.
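The "correct the biggest entry" rule described above could be sketched roughly like this. The class and method names (`PercentageRounding`, `RoundWithLargestCorrection`) are made up for illustration, and the switch from `double` to `decimal` is my own assumption so that the sum of two-decimal values is exact; treat it as a sketch rather than the one right implementation.

```csharp
using System;
using System.Linq;

public static class PercentageRounding
{
    // Hypothetical helper: rounds each share to `decimals` places, then adds
    // the leftover (100 minus the rounded sum) to the entry with the largest value.
    public static decimal[] RoundWithLargestCorrection(decimal[] values, int decimals = 2)
    {
        decimal total = values.Sum();
        if (total == 0m)
            throw new ArgumentException("Sum of values must be non-zero.", nameof(values));

        // Round every percentage on its own, as in the question.
        decimal[] percentages = values
            .Select(v => Math.Round(v / total * 100m, decimals))
            .ToArray();

        // The rounding difference; decimal arithmetic keeps this subtraction exact.
        decimal difference = 100m - percentages.Sum();

        // Index of the largest value (the first one wins on ties).
        int largest = Array.IndexOf(values, values.Max());
        percentages[largest] += difference;

        return percentages;
    }
}
```

Called with the sample values `327, 103, 383, 225, 269`, this returns `25.02, 7.88, 29.31, 17.21, 20.58`, which matches the corrected table above.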
CodePudding user response:
First of all, to deal with the effect (rounding errors), let's get rid of randomness:
// predefined values; no randomness
double[] arrayDouble = new double[] {250, 250, 250, 250, 251};
double totalSumm = 0;
for (int i = 0; i < arrayDouble.Length; i++)
{
// no randomness here
totalSumm += arrayDouble[i];
Console.Write(arrayDouble[i].ToString().PadLeft(8) + " ");
}
Console.WriteLine($"\nTotal summ is {totalSumm}");
...
What we have:
totalSumm == 1251.0
250.0 / 1251.0 == 0.199840127... ~ 19.98% (note that we drop ~0.004% on rounding)
251.0 / 1251.0 == 0.200639488... ~ 20.06% (note that we drop ~0.004% on rounding)
total % will be 4 * 19.98 + 20.06 == 99.98 (note that 0.02% is lost)
# | [0] | [1] | [2] | [3] | [4] | Total |
---|---|---|---|---|---|---|
Value | 250 | 250 | 250 | 250 | 251 | 1251 |
Percentage | 19,98% | 19,98% | 19,98% | 19,98% | 20,06% | 99,98% |
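To reproduce the 0.02% loss without floating-point noise getting in the way, a small check like the one below may help. `RoundingLossDemo` and `SumOfRoundedPercentages` are hypothetical names, and using `decimal` instead of the `double` from the question is an assumption on my side.

```csharp
using System;

public static class RoundingLossDemo
{
    // Hypothetical helper: rounds every percentage separately and returns
    // the sum of the rounded values. Assumes the values do not sum to zero.
    public static decimal SumOfRoundedPercentages(decimal[] values, int decimals = 2)
    {
        decimal total = 0m;
        foreach (decimal v in values)
            total += v;

        decimal sum = 0m;
        foreach (decimal v in values)
            sum += Math.Round(v / total * 100m, decimals);

        return sum;
    }
}
```

For `250, 250, 250, 250, 251` this yields `99.98`, the same total as in the table above.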
What can we do:
- We can try to adjust the percent values, e.g. add 0.01 to each 250 column and subtract 0.02 from the 251 column:
# | [0] | [1] | [2] | [3] | [4] | Total |
---|---|---|---|---|---|---|
Value | 250 | 250 | 250 | 250 | 251 | 1251 |
Percentage | 19,99% | 19,99% | 19,99% | 19,99% | 20,04% | 100,00% |
Note that we have to cook the numbers.
Another possibility is to carry the rounding error forward (we keep a running total of the error and add it to the next column before rounding):
# | [0] | [1] | [2] | [3] | [4] | Total |
---|---|---|---|---|---|---|
Value | 250 | 250 | 250 | 250 | 251 | 1251 |
Percentage | 19,98% | 19,99% | 19,98% | 19,99% | 20,06% | 100,00% |
Note that columns with the same value (250) can have different percentages (19.98 and 19.99).
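One way to read "sum up the error and add it to the column" is to carry the rounding error of each column into the next one before rounding. The sketch below follows that reading; `CarriedRounding` and `RoundWithCarriedError` are hypothetical names, and `decimal` is used instead of `double` as an assumption to keep the carried error free of binary floating-point noise.

```csharp
using System;

public static class CarriedRounding
{
    // Hypothetical helper: before rounding each percentage, adds the rounding
    // error left over from the previous entries, so the errors cancel out
    // across the row instead of piling up in the total.
    public static decimal[] RoundWithCarriedError(decimal[] values, int decimals = 2)
    {
        decimal total = 0m;
        foreach (decimal v in values)
            total += v;
        if (total == 0m)
            throw new ArgumentException("Sum of values must be non-zero.", nameof(values));

        var result = new decimal[values.Length];
        decimal carried = 0m; // rounding error accumulated so far

        for (int i = 0; i < values.Length; i++)
        {
            // Unrounded percentage plus whatever earlier roundings dropped or added.
            decimal exact = values[i] / total * 100m + carried;
            result[i] = Math.Round(exact, decimals);
            carried = exact - result[i];
        }

        return result;
    }
}
```

For `250, 250, 250, 250, 251` this gives `19.98, 19.99, 19.98, 19.99, 20.06`, the same row as in the last table, with a total of 100.00. The price is the one noted above: equal values can end up with different percentages depending on their position.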