How to round a vector of values so that the sum of the rounded values will always add up to a certain...

Let's assume I have the following vector of values:

input <- c(5.170669324, 8.69978618, 8.386448608, 9.765707038, 9.725934478, 6.37837945, 1.770207313, 4.64951078, 7.69234579, 8.060673651, 9.656558295, 9.933008843, 7.141666039, 2.969104213)

I now want to round these values to zero digits, which would give me:

round(input, 0)

[1]  5  9  8 10 10  6  2  5  8  8 10 10  7  3

Problem is, these numbers now add up to 101, but I need to find a rounding that will give me exactly 100.

My thought was to compute the difference between each original value and its rounded value, find the value closest to x.5, and flip its rounding direction. In my case the sum is 101 and I need 100, so I would take the value whose fractional part is closest to, but above, .5 (here, the 4.64...) and round it down to 4 instead of up to 5. If the sum had been 99, I would do it the other way around.
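To make the idea concrete, here is a minimal sketch of how I would find that value. The names `frac`, `rounded_up` and `candidate` are just for illustration, and it assumes no value sits exactly on .5:

```r
input <- c(5.170669324, 8.69978618, 8.386448608, 9.765707038, 9.725934478,
           6.37837945, 1.770207313, 4.64951078, 7.69234579, 8.060673651,
           9.656558295, 9.933008843, 7.141666039, 2.969104213)

# Fractional part of every value
frac <- input %% 1

# Positions of the values that round() pushes upwards
rounded_up <- which(frac >= 0.5)

# Among those, the one whose fractional part is closest to .5
candidate <- rounded_up[which.min(frac[rounded_up])]

candidate
#> [1] 8
input[candidate]
#> [1] 4.649511
```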

Does that sound like a reasonable approach or is there a mathematically more appropriate method?

CodePudding user response:

I think this algorithm does what you need.

If the sum of the rounded values is more than the target, it rounds down the most appropriate number: the one that was furthest from the integer to which it was rounded up, i.e. the one whose fractional part is closest to .5 from above. It repeats this until the target is reached. If the sum of the rounded values is smaller than the target, it instead rounds up the most appropriate numbers (those whose fractional part is closest to .5 from below) until the target is reached:

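round_to_target <- function(x, target = round(sum(x))) {

  # Rounded sum too high: pick the value with the smallest fractional part
  # among those that round up (others get the sentinel 1) and shift it down.
  while(sum(round(x)) - target > 0) {
    i <- which.min(ifelse(x %% 1 < 0.5, 1, x %% 1))
    x[i] <- x[i] - 1
  }
  # Rounded sum too low: pick the value with the largest fractional part
  # among those that round down (others get the sentinel 0) and shift it up.
  while(sum(round(x)) - target < 0) {
    i <- which.max(ifelse(x %% 1 > 0.5, 0, x %% 1))
    x[i] <- x[i] + 1
  }
  round(x)
}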

Testing on your example, we have

round_to_target(input)
#> [1]  5  9  8 10 10  6  2  4  8  8 10 10  7  3

sum(round_to_target(input))
#> [1] 100
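
The `target` argument can also be set explicitly. As a small sketch of the opposite case (the target of 102 is an arbitrary value chosen for illustration, and the function is repeated so the snippet runs on its own), asking for a sum above 101 triggers the second loop:

```r
round_to_target <- function(x, target = round(sum(x))) {
  while(sum(round(x)) - target > 0) {
    i <- which.min(ifelse(x %% 1 < 0.5, 1, x %% 1))
    x[i] <- x[i] - 1
  }
  while(sum(round(x)) - target < 0) {
    i <- which.max(ifelse(x %% 1 > 0.5, 0, x %% 1))
    x[i] <- x[i] + 1
  }
  round(x)
}

input <- c(5.170669324, 8.69978618, 8.386448608, 9.765707038, 9.725934478,
           6.37837945, 1.770207313, 4.64951078, 7.69234579, 8.060673651,
           9.656558295, 9.933008843, 7.141666039, 2.969104213)

# Hypothetical target one above the plain rounded sum of 101
res <- round_to_target(input, target = 102)
res
#> [1]  5  9  9 10 10  6  2  5  8  8 10 10  7  3
sum(res)
#> [1] 102
```

Here the third value, 8.386448608, is the one rounded up to 9, since its fractional part is the closest to .5 among the values that normally round down.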

And to show it is a general solution, let's generate some random numbers that add up to 200:

set.seed(1)

x <- diff(sort(c(0, runif(14, 0, 200), 200)))

x
#>  [1] 12.3572541 22.9540964  5.0250357  0.8585288 11.9068176 21.3230473
#>  [7]  2.3959637 37.7499290 11.2521361  6.3367497  5.2450108 42.2733677
#> [13]  1.9636210  7.2934957 11.0649463
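
This construction works because the differences between consecutive sorted cut points telescope to the last point minus the first, here 200 - 0. A quick check, using `all.equal` because floating-point sums are rarely exact:

```r
set.seed(1)

# 14 random cut points between 0 and 200, plus the two end points
x <- diff(sort(c(0, runif(14, 0, 200), 200)))

# 16 sorted points give 15 gaps
length(x)
#> [1] 15

# The gaps add up to 200, up to floating-point error
isTRUE(all.equal(sum(x), 200))
#> [1] TRUE
```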

Note that summing the rounded values gives us less than 200:

sum(round(x))
#> [1] 198

But if we use round_to_target, we get

round_to_target(x)
#>  [1] 12 23  5  1 12 21  4 38 11  6  5 42  2  7 11

sum(round_to_target(x))
#> [1] 200
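
One thing worth noticing in the output above: the 7th value, 2.3959637, ends up as 4. Adding 1 to an element does not change its fractional part, so when the sum is off by more than one, the loop can select the same element again. If that is undesirable, one possible alternative is the largest remainder method: take the floor of every value, then give one extra unit to each of the values with the largest fractional parts until the target is reached. The sketch below is not part of the function above; the name `round_largest_remainder` is hypothetical, and it assumes the target lies between `sum(floor(x))` and `sum(floor(x)) + length(x)`, which holds for the default target:

```r
round_largest_remainder <- function(x, target = round(sum(x))) {
  base <- floor(x)
  # Units still to hand out after flooring everything
  k <- as.integer(round(target - sum(base)))
  # Assumption: the target is reachable by bumping each value at most once
  stopifnot(k >= 0, k <= length(x))
  # Positions of the k largest fractional parts
  bump <- order(x - base, decreasing = TRUE)[seq_len(k)]
  base[bump] <- base[bump] + 1
  base
}

# Values copied from the printed output of x above
x <- c(12.3572541, 22.9540964, 5.0250357, 0.8585288, 11.9068176,
       21.3230473, 2.3959637, 37.7499290, 11.2521361, 6.3367497,
       5.2450108, 42.2733677, 1.9636210, 7.2934957, 11.0649463)

res <- round_largest_remainder(x)
res
#>  [1] 13 23  5  1 12 21  3 38 11  6  5 42  2  7 11
sum(res)
#> [1] 200
```

With this variant every result is either the floor or the ceiling of its original value, so no element moves by more than one.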