When using multiprocessing with spawn in Python, using self.a in __getattr__ causes an infinite loop

Time:09-10

The following code reproduces the bug:

from multiprocessing import Process, set_start_method


class TestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        return self.a

class TestProcess(Process):
    def __init__(self, testobject, **kwargs):
        super().__init__(**kwargs)
        # Stored on the Process, so it is pickled along with it under "spawn".
        self.testobject = testobject

    def run(self) -> None:
        print("heihei")
        print(self.testobject)


if __name__ == "__main__":
    set_start_method("spawn")

    testobject = TestObject()
    testprocess = TestProcess(testobject)
    testprocess.start()

Using 'spawn' causes an infinite loop in the method 'TestObject.__getattr__'. After deleting the line 'set_start_method('spawn')', everything works fine.

We would be very grateful to know why the infinite loop happens.

CodePudding user response:

If you head over to pickle's documentation, you will find a note that says:

At unpickling time, some methods like __getattr__(), __getattribute__(), or __setattr__() may be called upon the instance. In case those methods rely on some internal invariant being true, the type should implement __new__() to establish such an invariant, as __init__() is not called when unpickling an instance.
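
To make that note concrete for the class in the question, here is a minimal sketch that imitates unpickling without involving multiprocessing. It assumes two things: the instance is created with __new__ (so __init__ never runs and self.a does not exist yet), and the new instance is then probed for an optional __setstate__ hook. It illustrates the failure mode and is not pickle's actual code path.

```python
class TestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        # Only called when normal lookup fails. If "a" itself is missing,
        # self.a re-enters __getattr__("a"), which looks up self.a again...
        return self.a


# Like unpickling: allocate the instance without running __init__.
obj = TestObject.__new__(TestObject)

try:
    # Probe for an optional hook on the bare instance. The default of None
    # only covers AttributeError, so a RecursionError propagates.
    getattr(obj, "__setstate__", None)
    recursed = False
except RecursionError:
    recursed = True

print("RecursionError raised:", recursed)
```

The "infinite loop" is therefore unbounded recursion inside __getattr__, which Python stops with a RecursionError once the recursion limit is reached.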

I am unsure which exact conditions lead to a __getattr__ call during unpickling, but you can bypass the default behaviour by providing a __setstate__ method:

class TestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        return self.a

    def __setstate__(self, state):
        self.__dict__ = state

If it's present, pickle calls this method with the unpickled state and you are free to restore it however you wish.
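
If you are free to change __getattr__ as well, a complementary sketch is to make the fallback lookup terminate on its own. The class name SafeTestObject and the choice to reject dunder names are assumptions made for illustration, not part of the original code. The idea is to read the fallback from self.__dict__ directly and raise AttributeError when it is missing, so an instance whose __init__ never ran cannot recurse.

```python
class SafeTestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        # Leave special (dunder) names alone so protocol probes such as
        # __setstate__ simply see "not defined".
        if item.startswith("__") and item.endswith("__"):
            raise AttributeError(item)
        try:
            # Reading __dict__ directly never re-enters __getattr__.
            return self.__dict__["a"]
        except KeyError:
            raise AttributeError(item) from None

    def __setstate__(self, state):
        # Restore the instance attributes from the unpickled state.
        self.__dict__.update(state)


# Imitate unpickling: create the instance without calling __init__.
obj = SafeTestObject.__new__(SafeTestObject)
before = getattr(obj, "anything", None)  # no recursion, falls back to None

fallback = len  # stand-in for a picklable callable
obj.__setstate__({"a": fallback})
after = obj.anything  # now resolved through the restored "a"

print(before, after is fallback)
```

With this guard, a lookup on a half-built instance fails fast with AttributeError instead of recursing, and the fallback starts working once the state has been restored.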
