Problem statement
The algorithm receives a list of plank lengths [ L0, ..., L_n-1 ]
and a list of desired cut pieces lengths [M0, ..., M_m-1]
. It returns a vector [P0, ..., P_m-1]
where p_j
is the index of the plank where the piece j
is cut from. P_j = -1
if that piece isn't cut.
The greedy algorithm is:
Choose the biggest piece
j
and the longest planki
;Remove the piece
j
from the list of desired cut pieces.If
M_j > L_i
, then piecej
isn't cut. So we setP_j = -1
and return to step 1.We cut the piece
j
from the planki
so its length becomesL_i - M_j
. We setP_j = i
;If there's pieces left to cut, go back to step 1.
The problem is to implement this greedy algorithm in O(m log n)
time using a heap.
It's given that m >= n
.
Question
So my question is, using a heap and seeing that we need the biggest piece j
and the longest plank i
, something tells me that we need to sort the vectors hence I don't see how we can achieve this algorithm in O(m log n)
.
Could someone enlighten me on how this can be done in O(m log n)
time?
I think the use of the heap can be 1 heap for the planks and 1 for the pieces or only 1 heap but there must be at least one heap.
CodePudding user response:
If you restrict yourself to a comparison based sort then I don't think you can do any better than O(m log m)
(which is worse than O(m log n)
). This is because the first step ("choosing biggest piece j") effectively produces a sorted sequence of pieces of length m
-- as you pointed out correctly.
However, you can do this in O(m log n)
if you use a linear-time sort (e.g. radix sort) for the pieces M_j
, and a heap for the planks L_i
. The sort of the pieces would take O(m)
. You will do m
push/pop operations on the heap, whose size is bound by the number of planks n
. Thus the operations on that heap will total to O(m log n)
. Together with the previous sorting that's still O(m log n)
.
CodePudding user response:
This straightforward implementation is O(m log m).
This is unavoidable since the problem requires to process cuts from large to small. This requires sorting m
elements, which is O(m log m). There is no faster sorting algorithm.
typedef std::pair<size_t,size_t> pss;
std::vector<int> greedy(const std::vector<size_t>& L, const std::vector<size_t>& M)
{
std::vector<int> result(M.size(), -1);
// heap of pairs that sorts on pair.first, top==largest element
std::priority_queue<pss> Lheap, Mheap;
// create heaps of value => index
for (size_t i = 0; i < L.size(); i)
Lheap.push(pss{L[i],i});
for (size_t i = 0; i < M.size(); i)
Mheap.push(pss{M[i],i});
// process cuts from largest to smallest
while (!Mheap.empty() && !Lheap.empty())
{
// take longest plank & cut
auto mi = Mheap.top();
auto li = Lheap.top();
Mheap.pop();
// check if cut is too big
if (mi.first > li.first)
continue;
// write plank index to cut index
result[mi.second] = li.second;
// update heap: remove old plank, reinsert with new length
Lheap.pop();
li.first -= mi.first;
if (li.first > 0)
Lheap.push(li);
}
return result;
}
ONLY IF at least the M input vector is presorted then of course the O(m log m) cost is moved to the precondition. In that case you only need the Lheap, removing Mheap and its O(m log m) cost, you simply iterate over M. Thus you end up with O(m log n): m loop iterations with Lheap operations of cost O(log n).
CodePudding user response:
Build max heaps out of the n plank lengths and m cut lengths in O(n m) time (you can build a heap of k elements in O(k) time if you build it all at once rather than inserting elements one at a time, and you have all elements up front as inputs to the algorithm, which is why building the heaps can be done in linear time).
Now, using standard heap operations, you can remove a cut length from the cut-length heap, MH, in O(logm) time, and you can remove a plank length from the plank-length heap, LH, in O(logn) time. Also, you can decrease a plank-length value by removing it and then re-inserting the lesser value in O(logn logn) = O(2logn) = O(logn) time.
One note: in these heaps, you'll want to store indexes into the original arrays as well as lengths. Your heaps will be structured by length (longer lengths on top), but the individual elements of the heap need to be structs/pairs/objects which store a cut/plank lengths alongside an index. If you use a cpp pair, you can put lengths first and indexes second and the heap will be ordered correctly (as a max-heap sorted according to the first pair element) by default.
Revisiting the steps of the algorithm, it looks like that the best complexity you can achieve with a heap-based approach is O(mlogm) time:
Peek the top element of LH = {L_j, j} and MH = {M_i, i}: O(1)
Pop {M_i, i} from MH: O(logm)
Check if the highest cut length, M_i is greater than the highest plank length, L_j. If M_i > L_j, we can't cut that length. Set P[j] = -1 and return to step 1: O(1)
Cut length M_i from plank L_j by popping LH (O(logn)) and inserting {L_j - M_i, j} back into it (also O(logn)). Set P[j] = i (O(1)): O(logn)
If there's pieces left to cut (if !MH.empty() && !LH.empty()), go back to step 1: O(1)
Every execution of the loop, we do log(m) log(n) O(1) = O(logm) work. The loop runs at most max(n,m) = m times. So the overall complexity is O(n m mlogm) = O(mlogm). You can't do better than this unless M is pre-sorted (in which case you wouldn't need MH, and could instead just keep an index into M and the O(logm) step 2 could be replaced by O(1) {M_i, i} = M[mIdx ], reducing the overall complexity to O(m*logn)).
It looks like another answer already posted an implementation this algorithm, but I'll still add my answer as complexity analysis.