I discovered I am getting different values when checking qsize() vs. the queue's unfinished_tasks attribute. In what cases would there be a large difference between these results?
from queue import Queue
dbjobs = Queue()
...
...
...
print("qsize() = " + str(dbjobs.qsize()))
print("unfinished_tasks = " + str(dbjobs.unfinished_tasks))
Example Result 1:
qsize() = 0
unfinished_tasks = 79
Example Result 2:
qsize() = 2
unfinished_tasks = 117
CodePudding user response:
I assume you mean the standard library's queue.Queue (not e.g. asyncio.Queue or something else).
As far as I know, Queue.unfinished_tasks is not documented, so I would advise against using it on principle.
Assuming it behaves as one would reasonably expect from reading the documentation for Queue.join, the attribute unfinished_tasks seems to be a counter that goes up by one whenever a new item is put into the queue and down by one whenever the Queue.task_done method is called.
As for Queue.qsize, that just returns the (approximate) number of items in the queue, meaning that number decreases when an item is taken out of the queue (using Queue.get, for example). Doing that should have no effect on unfinished_tasks.
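To make the difference concrete, here is a minimal single-threaded sketch. It assumes CPython's queue.Queue, where unfinished_tasks is an undocumented implementation detail, so treat it as an illustration rather than something to rely on. The variable names are made up for the example.

```python
from queue import Queue

q = Queue()
for item in range(3):
    q.put(item)              # each put() adds an item and bumps the counter

after_put = (q.qsize(), q.unfinished_tasks)

q.get()                      # get() removes an item; the counter is untouched
q.get()
after_get = (q.qsize(), q.unfinished_tasks)

q.task_done()                # only task_done() lowers the counter
after_done = (q.qsize(), q.unfinished_tasks)

print(after_put, after_get, after_done)   # (3, 3) (1, 3) (1, 2)
```

After the two get() calls the queue holds one item, but three tasks still count as unfinished.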
You can think of it as answering different questions. The qsize method answers the relatively straightforward question: "How many items are in the queue right now?"
The unfinished_tasks counter presumably answers the question: "How many items that were put into the queue at some point are still in it or are currently being worked on by consumers of the queue?" That answer is much less precise, though, since it is entirely possible that a consumer takes an item out of the queue but never calls task_done afterwards (because it crashed, for example). So a large gap like the one in your examples means that most items have already been taken out of the queue, but task_done has not (yet) been called for them.
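As a rough sketch of that last case, the hypothetical consumer below drains the queue but never calls task_done. This again assumes CPython's queue.Queue and its undocumented unfinished_tasks attribute. The names jobs and forgetful_worker are invented for the example.

```python
from queue import Queue
from threading import Thread

jobs = Queue()
for n in range(5):
    jobs.put(n)

def forgetful_worker():
    # Hypothetical consumer: takes every item but never calls task_done().
    # Checking empty() before get() is only safe here because this is the
    # sole consumer.
    while not jobs.empty():
        jobs.get()

worker = Thread(target=forgetful_worker)
worker.start()
worker.join()

drained = jobs.qsize()            # nothing left in the queue
pending = jobs.unfinished_tasks   # but every item still counts as unfinished
print(drained, pending)           # 0 5
```

In this state a call to jobs.join() would block forever, which is one more reason to make sure every get() is eventually paired with a task_done().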