Home > database >  Knapsack problem: choose 5 elements among 100, weight <= 100, maximise total value
Knapsack problem: choose 5 elements among 100, weight <= 100, maximise total value

Time:01-04

I have a data table containing n=100 elements :

Element weight value
A 24 80
B 43 77
C 51 72
D 38 70
E 27 65
F 7 58
.. .. ..

And I would like to create an algorithm to get 5 elements where :

  • sum of the 5 weights is capped to 100
  • sum of the 5 values should be the maximum possible

I'm on google sheet but I don't know how to do it properly.

I tried to iterate on every element but it was not really effective...

CodePudding user response:

I'm not familiar enough with google sheets, but here is a simple recursive function with memoization in python, using the following recurrence formula:

if weight(n) <= w:
    T(n, k, w) = max(
                     T(n-1, k, w),
                     value(n)   T(n-1, k-1, w-weight(n))
                 )
else:
    T(n, k, w) = T(n-1, k, w)

where

  • T(n, k, w) is the maximum sum of k values whose weights sum up to at most w, using only items up to the nth item
  • T(100, 5, 100) is the overall solution to your problem
  • if the best solution for (n, k, w) doesn't use the nth item, then it's equal to the best solution for (n-1, k, w)
  • if the best solution for (n, k, w) uses the nth item, then it's equal to the the value of the nth item plus the value of the best solution for (n-1, k-1, w-weight(n))

In python:

import numpy as np

n_items_total, k_items_solution, max_weight = 100, 5, 100

data = np.random.randint(0, max_weight 1, (n_items_total,2)))

def knapsack(n, k, w):
    if (n, k, w) not in knapsack.t:
        if w < 0 or n < 0 or k < 0:
            knapsack.t[(n, k, w)] = -1000  # big negative number, large enough that solution is invalid
        elif n < k:
            knapsack.t[(n, k, w)] = -1000 # presuming you want exactly k items; remove this line if <= k is okay
        elif k == 0:
            knapsack.t[(n, k, w)] = 0
        else:
            knapsack.t[(n, k, w)] = max(knapsack(n-1, k, w), data[n-1, 0]   knapsack(n-1, k-1, w-data[n-1,1]))
    return knapsack.t[(n, k, w)]
knapsack.t = {}

def traceback_solution(t, n, k, w):
    if k <= 0:
        return
    s = t[(n, k, w)]
    a = t[(n-1, k, w)]
    b = data[n-1, 0]   t[(n-1, k-1, w-data[n-1, 1])]
    if s == a:
        yield from traceback_solution(t, n-1, k, w)
    elif s == b:
        yield (n-1, data[n-1])
        yield from traceback_solution(t, n-1, k-1, w-data[n-1, 1])
    else:
        raise Error

best_score = knapsack(n_items_total, k_items_solution, max_weight)
solution = list(traceback_solution(knapsack.t, n_items_total, k_items_solution, max_weight))

print(solution)
# [(87, array([97,  2])),
#  (84, array([97, 29])), 
#  (63, array([98, 11])), 
#  (36, array([98, 32])), 
#  (0, array([99,  6]))]

CodePudding user response:

Knapsack problems are tricky to tackle with a plain vanilla Google Sheets formula, because storing intermediate results gets complicated, and because recursion is not available unless you create a named function. You may want to implement Stef's solution as a custom function, or if the 30-second runtime limit is not sufficient to process your dataset, as a regular function you start manually through a button or menu item.

Nevertheless, here's a simple illustration of one approach with a plain vanilla formula that takes advantage of the characteristics of the sample data you show:

=lambda( 
  weightedData, 
  query( 
    { 
      weightedData, 
      scan( 
        0, index(weightedData, 0, 2), 
        lambda( 
          acc, curr, 
          acc   n(curr) 
        ) 
      ) 
    }, 
    "select Col1, Col2, Col3 
     where Col4 <= 100 
     order by Col3 desc", 0 
  ) 
)( 
  sort( 
    A2:C, 
    C2:C / B2:B, 
    false 
  )
)

You can cap the number of elements to 5 by adding limit 5 in the query() statement, but that will simply omit elements rather than recompute with n=5. Please understand that the results will not be optimal — this is for illustration only.

  • Related