Python object returning different property values between threads

Time:01-13

Problem

I'm experiencing unexpected behavior when trying to update an object's properties across threads. During application start, I spawn a thread to do some I/O so that it doesn't block the main thread from starting up the rest of the application. I then want to signal back to the main thread that loading has finished, by setting a ready property to True on an object.

What happens instead is that the ready property read from the main thread is never updated, and the application never fully initializes. In debugging, I see that both threads see the same object reference using hex(id(object)), but they see different property values.

To ensure this is not caused by a race condition, I set up @property methods (getter and setter) to monitor all interactions on the ready property and output the context:

Thread: 274910491840 - obj id 0x4002203d60 ready property set to True
Thread: 274910491840 - obj id 0x4002203d60 ready property read as True
Thread: 275066861312 - obj id 0x4002203d60 ready property read as False
Thread: 274910491840 - obj id 0x4002203d60 ready property read as True
Thread: 274910491840 - obj id 0x4002203d60 ready property read as True
Thread: 275066861312 - obj id 0x4002203d60 ready property read as False
... etc

As we see here, the same object id persistently reads as different values across threads. I have not experienced this in other languages; could it be a Python threading idiosyncrasy I'm not aware of?

Context

The main application is initialized as an asyncio loop, with the secondary thread spun up before initialization. I subclass an open source package called kserve, with the main thread startup logic looking like this: https://github.com/kserve/kserve/blob/master/python/kserve/kserve/model_server.py#L257-L280.

My secondary thread code is very simple and just looks like:

        from threading import Thread

        def loader():
            # load things...
            obj.ready = True  # obj is the shared object described above

        load_thread = Thread(target=loader)
        load_thread.start()

How is it possible that given object obj with the same reference id, one of its properties can return different values across threads?

Thanks for any help you can provide!

CodePudding user response:

Two things turned out to be the cause of this misunderstanding/unexpected behavior:

  1. The underlying ASGI server code linked above (UvicornCustomServer) was forking new processes to start up in.
  2. When a process forks, the child gets its own copy of every object at the same virtual address, so id() reports the same identity in both processes. The copies live in different physical memory, so their property values can diverge after the fork.
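The effect described in point 2 can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX system where os.fork is available; the Model class and the observe_after_fork helper are hypothetical names made up for illustration and are not part of kserve or uvicorn. The parent sets ready from a loader thread after the fork, and the child reports what it sees through a pipe.

```python
import os
import threading


class Model:
    """Hypothetical stand-in for the object that carries the ready flag."""

    def __init__(self):
        self.ready = False


def observe_after_fork():
    """Fork, set ready in the parent only, then report what each process sees.

    Returns (parent_id, parent_ready, child_id, child_ready).
    """
    obj = Model()
    go_r, go_w = os.pipe()    # parent -> child: "the flag has been set"
    out_r, out_w = os.pipe()  # child -> parent: what the child observed

    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child process: owns a copy of obj at the same virtual address.
        os.close(go_w)
        os.close(out_r)
        os.read(go_r, 1)  # block until the parent has set its flag
        os.write(out_w, f"{id(obj)} {obj.ready}".encode())
        os._exit(0)

    # Parent process: set the flag from a loader thread, as in the question.
    os.close(go_r)
    os.close(out_w)
    loader = threading.Thread(target=lambda: setattr(obj, "ready", True))
    loader.start()
    loader.join()
    os.write(go_w, b"x")  # tell the child it can look now
    os.close(go_w)

    with os.fdopen(out_r) as reader:
        child_id, child_ready = reader.read().split()
    os.waitpid(pid, 0)
    return id(obj), obj.ready, int(child_id), child_ready == "True"


if __name__ == "__main__":
    parent_id, parent_ready, child_id, child_ready = observe_after_fork()
    print(f"parent: obj id {hex(parent_id)} ready={parent_ready}")
    print(f"child:  obj id {hex(child_id)} ready={child_ready}")
```

Both processes report the same object id, but only the parent sees ready as True, because the write happened after the fork and only touched the parent's copy.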

Thank you to @ahmed-aek for sharing their understanding here that pointed me in the right direction:

as fork just duplicates the virtual address space, you end up with 2 objects with the same virtual address but different physical address in different processes, this is just the operating system virtualizing addresses for each process, but they are two separate objects in two separate parts of the memory that can have different attributes.
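One way to spot this kind of situation earlier is to log the process id alongside the thread id in the property getter and setter. This is only a sketch of that idea: the Model class and the describe_context helper are hypothetical, and the log format merely imitates the one in the question.

```python
import os
import threading


def describe_context(obj):
    """Return a log prefix that identifies process, thread, and object."""
    return (
        f"Process: {os.getpid()} - Thread: {threading.get_ident()} "
        f"- obj id {hex(id(obj))}"
    )


class Model:
    """Hypothetical model object with a logged ready property."""

    def __init__(self):
        self._ready = False

    @property
    def ready(self):
        print(f"{describe_context(self)} ready property read as {self._ready}")
        return self._ready

    @ready.setter
    def ready(self, value):
        self._ready = value
        print(f"{describe_context(self)} ready property set to {value}")


if __name__ == "__main__":
    model = Model()
    loader = threading.Thread(target=lambda: setattr(model, "ready", True))
    loader.start()
    loader.join()
    model.ready  # same process as the loader thread, so this read sees True
```

If the reads and writes come from different processes, the Process field differs between log lines even though the obj id matches, which points at a fork rather than a threading problem.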
