TLDR: I can't pass stack traces through a multiprocessing.Queue, and when I pass errors through that queue, their stack traces are stripped by the time they are received in the other worker (either thread or process) that consumes the queue. How can I pass the stack trace so that it can be restored on the consumer's side of the queue?
Long question:
Ok so suppose I have two workers in Python. The two workers are of the same type. That means they could be either a Thread or a Process:
from multiprocessing import Process
from threading import Thread
Okay, suppose it's a Thread for now, to make things simpler, but I want it to be interchangeable with a Process and work equally well.
So I'm trying to get nice stack traces to report errors properly. I use a multiprocessing.Queue to pass data between workers, and the main thread polls the queue to collect everything, then joins the workers and closes everything at the end.
If the main thread receives an exception from the queue, then I want it to re-raise it.
HOWEVER, the exception has no stack trace left once it is consumed in the main thread, and the queue seems to break when I pass a stack trace. I want to re-raise everything in the main thread so that errors are debuggable and loggable with as much detail as possible.
What can I do?
CodePudding user response:
You cannot pass stack traces into queues because they are not picklable, see this issue.
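To see the problem in isolation, here is a minimal sketch that needs no queue or worker at all. The helper name `capture` is made up for illustration. It only shows that pickling a traceback object fails, while pickling the exception itself succeeds but silently drops the traceback:

```python
import pickle


def capture():
    """Raise and catch an error so that it carries a real traceback."""
    try:
        raise RuntimeError("oh noo")
    except RuntimeError as exc:
        return exc


exc = capture()
assert exc.__traceback__ is not None  # the live exception has a traceback

# Traceback objects themselves cannot be pickled.
try:
    pickle.dumps(exc.__traceback__)
    traceback_picklable = True
except TypeError:
    traceback_picklable = False

# The exception object can be pickled, but the round trip drops the traceback.
restored = pickle.loads(pickle.dumps(exc))
print(traceback_picklable, type(restored).__name__, restored.__traceback__)
```

Since a multiprocessing.Queue pickles whatever you put on it, this is the same loss you observe on the consumer side.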
However, if you only want to re-raise an exception with its original traceback, there are workarounds that rely on the fact that BaseException and its subclasses are picklable. What you want to do is pass the exception that was raised, along with the stringified traceback. You are then free to raise it with the traceback text on the other end:
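import traceback
from multiprocessing import Queue, Process


def producer(q):
    try:
        raise RuntimeError('oh noo')
    except Exception as e:
        # Tracebacks cannot be pickled, so send the traceback as text.
        msg = "{}\n\nOriginal {}".format(e, traceback.format_exc())
        q.put((e, msg))


if __name__ == "__main__":
    q = Queue()
    w = Process(target=producer, args=(q, ))
    w.start()
    # Read from the queue before joining the worker.
    exc = q.get()
    w.join()
    # Re-raise the same exception type with the original traceback in its message.
    raise type(exc[0])(exc[1])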
Note: There is a reason for the additional "Original Traceback" in the message. Exception tracebacks raised in child processes can get lost and only result in a multiprocessing.managers.RemoteError when dealing with managers and proxies. From the documentation of multiprocessing.managers.BaseProxy:
If an exception is raised by the call, then is re-raised by _callmethod(). If some other exception is raised in the manager’s process then this is converted into a RemoteError exception and is raised by _callmethod().
Therefore, sending the original traceback can be vital in debugging.
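If you would rather keep the original exception object (not only its type) and avoid stuffing the traceback into its message, one possible variation is to chain the worker's traceback text as the exception's cause. This is only a sketch: `RemoteTraceback` and `reraise_from_queue` are hypothetical names, not part of multiprocessing, and the demo uses a Thread. Because the payload is just a picklable exception plus a string, swapping in a Process should work the same way, assuming the exception class itself pickles cleanly (custom exceptions with unusual constructors may not).

```python
import traceback
from multiprocessing import Queue
from threading import Thread  # assumption: Process can be swapped in here


class RemoteTraceback(Exception):
    """Hypothetical helper: carries the worker's formatted traceback as text."""

    def __init__(self, tb_text):
        super().__init__(tb_text)
        self.tb_text = tb_text

    def __str__(self):
        # Leading newline so the text reads cleanly in the chained output.
        return "\n" + self.tb_text


def producer(q):
    try:
        raise RuntimeError("oh noo")
    except Exception as e:
        # Send the exception plus its traceback rendered as a plain string.
        q.put((e, traceback.format_exc()))


def reraise_from_queue(q):
    """Re-raise the worker's exception, chained to its original traceback text."""
    exc, tb_text = q.get()
    raise exc from RemoteTraceback(tb_text)


if __name__ == "__main__":
    q = Queue()
    w = Thread(target=producer, args=(q,))
    w.start()
    try:
        reraise_from_queue(q)
    except RuntimeError as err:
        # The exception type and message survive, and the worker's traceback
        # text is reachable through the chained cause.
        print(type(err).__name__, err)
        print(err.__cause__.tb_text)
    finally:
        w.join()
```

If the re-raised exception is left uncaught, Python prints the cause first, so the worker's traceback text appears above the main thread's own traceback.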