Home > OS >  Selecting such vector elements so that the sum of elements is exactly equal to the specified value
Selecting such vector elements so that the sum of elements is exactly equal to the specified value

Time:10-19

I have a vector of random positive integers. I would like to select only those elements of the vector whose sum will be exactly equal to a certain predetermined value.

Let's take an example like this. x=1:5, I am looking for elements whose sum is equal to 14. The solution is of course the vector c(2, 3, 4, 5).

Of course, there may be several solutions. Example 2. x=1:5, I'm looking for elements whose sum is equal to 7. Here, of course, should be the following three solutions:
1.c(2, 5),
2.c(3, 4),
3.c(1, 2, 4).

There may also be a situation where there will be no solutions at all. Example 3. x=c(1, 2, 7), I'm looking for elements whose sum equals 5. Of course, there are no correct solutions here.

Everything seems trivially simple if we have vectors of several elements. Here, I even came up with a few alternative solutions. However, the problem becomes when the size of the vector increases.

My vector looks like this:

x= c(236L, 407L, 51L, 308L, 72L, 9787L, 458L, 5486L, 42L, 4290L,
  31L, 3533L, 1102L, 24L, 100L, 669L, 9352L, 4091L, 2751L, 3324L,
  3193L, 245L, 86L, 98932L, 77L, 13L, 9789L, 91L, 999L, 25L, 25379L,
  9626L, 9092L, 622L, 97L, 57L, 2911L, 6L, 405L, 894L, 1760L, 9634L,
  96L, 9765L, 223L, 765L, 743L, 5960L, 14L, 50L, 89L, 348L, 5875L,
  5L, 58602L, 397L, 1181L, 94L, 954L, 7901L, 836L, 8810L, 52L,
  15L, 48L, 26L, 4L, 66L, 5265L, 80L, 282L, 231L, 76L, 661L, 7604L,
  7406L, 58L, 10L, 903L, 49446L, 80921L, 1L, 876L, 334L, 63L, 796L,
  88L, 413L, 1214L, 2983L, 9518L, 595L, 708L, 53L, 321L, 12L, 634L,
  4910L, 8984L, 465L)

I have to find at least one subset of elements whose sum will be exactly 23745. Unfortunately, I had a complete failure here. Whatever I write is calculated in hours and I don't get any correct result anyway. Does anyone have any idea how this can be solved in R? I will be grateful for even a small hint.

CodePudding user response:

This task sounds like a 1 dimensional bin packing problem or knapsack problem, in which case there are many resources available online to help guide you.

One potential solution is to use the gbp package, e.g.

#install.packages("gbp")
library(gbp)
#> Loading required package: magrittr
#> Loading required package: data.table
x <- c(236L, 407L, 51L, 308L, 72L, 9787L, 458L, 5486L, 42L, 4290L,
     31L, 3533L, 1102L, 24L, 100L, 669L, 9352L, 4091L, 2751L, 3324L,
     3193L, 245L, 86L, 98932L, 77L, 13L, 9789L, 91L, 999L, 25L, 25379L,
     9626L, 9092L, 622L, 97L, 57L, 2911L, 6L, 405L, 894L, 1760L, 9634L,
     96L, 9765L, 223L, 765L, 743L, 5960L, 14L, 50L, 89L, 348L, 5875L,
     5L, 58602L, 397L, 1181L, 94L, 954L, 7901L, 836L, 8810L, 52L,
     15L, 48L, 26L, 4L, 66L, 5265L, 80L, 282L, 231L, 76L, 661L, 7604L,
     7406L, 58L, 10L, 903L, 49446L, 80921L, 1L, 876L, 334L, 63L, 796L,
     88L, 413L, 1214L, 2983L, 9518L, 595L, 708L, 53L, 321L, 12L, 634L,
     4910L, 8984L, 465L)

test <- gbp1d_solver_dpp(p = x, w = x, c = 23745L)

list_of_selected_items <- x[as.logical(test$k)]
list_of_selected_items
#> [1]  236   51  308  458 5486 4290   31 3533 9352
sum(list_of_selected_items)
#> [1] 23745

Created on 2021-10-18 by the reprex package (v2.0.1)

  • Related