I have a vector of random positive integers. I would like to select only those elements of the vector whose sum will be exactly equal to a certain predetermined value.
Let's take an example like this.
x=1:5
, I am looking for elements whose sum is equal to 14
. The solution is of course the vector c(2, 3, 4, 5)
.
Of course, there may be several solutions.
Example 2.
x=1:5
, I'm looking for elements whose sum is equal to 7
.
Here, of course, should be the following three solutions:
1.c(2, 5)
,
2.c(3, 4)
,
3.c(1, 2, 4)
.
There may also be a situation where there will be no solutions at all.
Example 3.
x=c(1, 2, 7)
, I'm looking for elements whose sum equals 5
.
Of course, there are no correct solutions here.
Everything seems trivially simple if we have vectors of several elements. Here, I even came up with a few alternative solutions. However, the problem becomes when the size of the vector increases.
My vector looks like this:
x= c(236L, 407L, 51L, 308L, 72L, 9787L, 458L, 5486L, 42L, 4290L,
31L, 3533L, 1102L, 24L, 100L, 669L, 9352L, 4091L, 2751L, 3324L,
3193L, 245L, 86L, 98932L, 77L, 13L, 9789L, 91L, 999L, 25L, 25379L,
9626L, 9092L, 622L, 97L, 57L, 2911L, 6L, 405L, 894L, 1760L, 9634L,
96L, 9765L, 223L, 765L, 743L, 5960L, 14L, 50L, 89L, 348L, 5875L,
5L, 58602L, 397L, 1181L, 94L, 954L, 7901L, 836L, 8810L, 52L,
15L, 48L, 26L, 4L, 66L, 5265L, 80L, 282L, 231L, 76L, 661L, 7604L,
7406L, 58L, 10L, 903L, 49446L, 80921L, 1L, 876L, 334L, 63L, 796L,
88L, 413L, 1214L, 2983L, 9518L, 595L, 708L, 53L, 321L, 12L, 634L,
4910L, 8984L, 465L)
I have to find at least one subset of elements whose sum will be exactly 23745
.
Unfortunately, I had a complete failure here. Whatever I write is calculated in hours and I don't get any correct result anyway. Does anyone have any idea how this can be solved in R? I will be grateful for even a small hint.
CodePudding user response:
This task sounds like a 1 dimensional bin packing problem or knapsack problem, in which case there are many resources available online to help guide you.
One potential solution is to use the gbp package, e.g.
#install.packages("gbp")
library(gbp)
#> Loading required package: magrittr
#> Loading required package: data.table
x <- c(236L, 407L, 51L, 308L, 72L, 9787L, 458L, 5486L, 42L, 4290L,
31L, 3533L, 1102L, 24L, 100L, 669L, 9352L, 4091L, 2751L, 3324L,
3193L, 245L, 86L, 98932L, 77L, 13L, 9789L, 91L, 999L, 25L, 25379L,
9626L, 9092L, 622L, 97L, 57L, 2911L, 6L, 405L, 894L, 1760L, 9634L,
96L, 9765L, 223L, 765L, 743L, 5960L, 14L, 50L, 89L, 348L, 5875L,
5L, 58602L, 397L, 1181L, 94L, 954L, 7901L, 836L, 8810L, 52L,
15L, 48L, 26L, 4L, 66L, 5265L, 80L, 282L, 231L, 76L, 661L, 7604L,
7406L, 58L, 10L, 903L, 49446L, 80921L, 1L, 876L, 334L, 63L, 796L,
88L, 413L, 1214L, 2983L, 9518L, 595L, 708L, 53L, 321L, 12L, 634L,
4910L, 8984L, 465L)
test <- gbp1d_solver_dpp(p = x, w = x, c = 23745L)
list_of_selected_items <- x[as.logical(test$k)]
list_of_selected_items
#> [1] 236 51 308 458 5486 4290 31 3533 9352
sum(list_of_selected_items)
#> [1] 23745
Created on 2021-10-18 by the reprex package (v2.0.1)