I have been doing an "Algorithms and Data Structures" course recently and have come across an interesting exercise.
Consider a system which has multiple a
executors, b
queues and c
tasks. a
,b
,c
are all below 10^5. In the beginning of the program there is a user input of those three numbers. After that there are c
lines which describe tasks. Each task is represented by 3 numbers: the time it comes to the system, the queue at which end it stands to and the time it takes to complete.
As for the system itself it works the following way: every second (unit of a time) we must pick a free executor. If there are more than 1, we choose the one with the lower number (every executor has a number from 1 to a
). After that we must choose a task from the front of some non-empty queue. If at this time there are multiple non-empty queues waiting to get picked, then we pick from the queue which has not been taken from for the longest time by current time (for example, if there are tasks in queues 1 and 2 and before that we have taken from queue 1 at time 3 and from queue 2 at time 1, then we take a task from queue 2). After that the task is assigned to the executor we selected and the executor becomes occupied for the time it takes to complete the task. After that he is ready to take another task (he can do it at the very moment he finishes his current task).
So the final question of the problem is for every task to print which executor will take it for processing and at which moment in time.
Also, the solution must run undex 10 seconds for any input. That makes me think that its time complexity must be below O(N^2). So, I'm thinking about O(N*logN) and actually I'm quite close to it.
So far my solution if the following:
First we select an executor. We can do it in logarithmic time if we use a minimum heap to store executors. To compare executors we can for each of them track time when they get free after completing another task, then compare those times. If the times are equal, then we compare executors' numbers. In the beginning every executor is free at time moment 0 (tasks start to arrive at time moment 1). So then for every task we can pick an executor which will be ready first.
Also we can notice that at any given time moment we don't need all the tasks in all queues, only those at front of each queue. We can store them in an array, let's call it tasksArray
.
So, we can iterate over every queue and if the task in front is arriving before the picked executor gets free then we store this task in a separate array. This array will contain all the tasks which are able for the taking by the chosen executor as soon as it gets free from his previous task.
In case there are no tasks found that means that the executor has to wait until some tasks come. To determine which tasks will these be we can one again iterate over all queues and find the tasks which arrive soonest. There may be more than one task (2 tasks may arrive at the same time to the different queues), so we store them as an array.
Both described ways of picking tasks are O(N). So far it seems okay.
After choosing tasks using one of the ways above we can sort them by the time it was last taken a task from the queue they come from (task from queue which was waiting for the longest comes first and then comes the task from the second-longest-waiting queue etc)
After that we can go over the sorted tasksArray
and for every task we assign it to the executor (we picked one in the beginning), the executors gets occupied and we put it back into the executors heap with an updated time record, which indicates the moment it gets free. Then for each next element of the tasksArray
we get an executor from the heap and assign the task to it and so it goes until the tasksArray
is empty. Such iteration is O(NlogN). So far the solution is O(N NlogN).
After all tasks from tasksArray
are assigned to their executors we acquire the next batch of the tasks from the front of the queues and everything goes on again. That continues until all queues are empty.
This solution can be considered O(N*logN), however this solution is incorrect.
Lets go back to the case when the executor has gotten ready but there are no tasks yet. In this case it waits until the first tasks from one or multiple queues at the same time come and picks one of them. But while it waits, other executors may finish their tasks and become ready. Some of those executors may have a smaller number than the one we picked. So, here comes the problem, because in such case we must pick the one with the smallest number, however my solution doesn't do it.
And there is a reason for that. If we store executors in a heap, then we cannot know which executors have become ready while we were waiting. To determine this we will need to check every executor in the heap and compare its 'ready-time' with the current time. However, such check gives us O(N) time (we can iterate over the underlying array of the queue, not the queue itself since we need to check all the tasks anyway). In the worst case when there comes only 1 task every 2 seconds and it has a small processing time of 1 second, then we will fall into the situation of executors waiting for each task. And for each task we will iterate over all executors. Such algorithm is O(N*N) = O(N^2) which is unacceptable because of time limits.
And this is where I am stuck. I cannot find a way to determine whether more executors have become ready while we were waiting for the task in time under O(N). I have also been thinking about changing the data structure from heap to some other structure, but that gives me no luck either.
So is there any way this exercise can be solved in under O(N^2) time?
CodePudding user response:
You have not fully specified the problem for ties. For example if a task arrives to an empty queue that has not been picked from for a long time at the same moment that an executor comes free, does that task get assigned to the executor? Or does some other task get assigned to that executor and the task gets assigned to the next executor to come free?
I'll assume it gets assigned to the executor, and won't deal with other ties. But you should think through that carefully in your solution.
We need the following heaps.
free_executor
so we can pick the free executor with the lowest numbers.nonempty_queue
so we can pick from the queue that has been waiting longest.- Event. An event has a time, type, and information. The types are
TaskArrival
andExecutorFinished
. The order is time, then type (withTaskArrival
first).
And now with that, here is the logic.
while events has stuff:
event = next event
if event.type = TaskArrival:
if its queue was empty:
add its queue to nonempty_queue
add task to its queue
else:
add newly freed executor to free_executor
if free_executor and nonempty_queue both have something:
executor = free_executor.pop()
queue = nonempty_queue.pop()
task = queue.pop()
mark that executor did task
add new event to events queue for executor coming free after task is done
if queue is not empty:
put queue back into nonempty_queue with new priority
We will generate 2 events per task. Every operation inside of the loop takes O(1)
(eg mark that executor did task
) or O(log(n))
(eg queue = nonempty_queue.pop()
). There are a fixed number of operations per event. Therefore the algorithm runs in time O(n log(n))
.