Trying to clarify my understanding of threading.
To my understanding, when the GIL isn't released explicitly (e.g. by time.sleep), the OS assigns it to threads randomly, right? If so, how come no race condition ever occurs here? I've re-run the code many times and the ending value logged (last line) is always 5.
I would've thought that in some of the runs, some threads would have read their local_copy before another thread updated self.value, leading to a race condition. Furthermore, doesn't this mean that the threads run synchronously, since each thread waits for the previous thread to write self.value = local_copy?
My guess is that the OS has some way of detecting a read-then-write on a shared attribute, and so hands the GIL to threads in a way that prevents any race condition from happening.
import logging
from concurrent import futures

class FakeDatabase:
    def __init__(self):
        self.value = 0

    def update(self, name):
        logging.info("Thread %s: starting update", name)
        # Unsynchronized read-modify-write on shared state
        local_copy = self.value
        local_copy += 1
        self.value = local_copy
        logging.info("Thread %s: finishing update", name)

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    database = FakeDatabase()
    with futures.ThreadPoolExecutor() as executor:
        for i in range(5):
            executor.submit(database.update, i)
    # The with block has waited for all five updates by this point
    logging.info(f'Ending value is {database.value}')
CodePudding user response:
Each thread does only one increment, and that increment is a very short part of the thread's work, so the increments are unlikely to overlap. If I make each thread increment a million times with
    for _ in range(10**6):
        local_copy = self.value
        local_copy += 1
        self.value = local_copy
then I do see it happen:
INFO:root:Thread 0: starting update
INFO:root:Thread 1: starting update
INFO:root:Thread 2: starting update
INFO:root:Thread 3: starting update
INFO:root:Thread 4: starting update
INFO:root:Thread 4: finishing update
INFO:root:Thread 0: finishing update
INFO:root:Thread 3: finishing update
INFO:root:Thread 2: finishing update
INFO:root:Thread 1: finishing update
INFO:root:Ending value is 1745053
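If you want the total to be reliable no matter how the threads interleave, one common approach is to guard the read-modify-write with a threading.Lock. Below is a minimal sketch of that idea, not the code from the question: LockedDatabase and run are hypothetical names I made up, and I dropped the logging and used a smaller loop count to keep it short. Whether the unlocked version actually loses updates depends on the interpreter version and on timing, so this only demonstrates that the locked version counts correctly.

```python
import threading
from concurrent import futures


class LockedDatabase:
    """Hypothetical variant of FakeDatabase that guards the update with a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def update(self, count):
        for _ in range(count):
            # Only one thread at a time may run this read-modify-write.
            with self._lock:
                local_copy = self.value
                local_copy += 1
                self.value = local_copy


def run(workers=5, count=100_000):
    database = LockedDatabase()
    with futures.ThreadPoolExecutor(max_workers=workers) as executor:
        for _ in range(workers):
            executor.submit(database.update, count)
    # Leaving the with block waits for all submitted tasks to finish.
    return database.value


if __name__ == "__main__":
    print(run())
```

With the lock held around all three statements, no thread can read self.value between another thread's read and write, so the result is always workers * count (500000 with the defaults).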
CodePudding user response:
With CPU-bound tasks like the one in your example, Python threads effectively run one at a time under the hood, because only one thread can hold the GIL and execute Python bytecode at any moment. The real benefit of threads in Python is with I/O-bound tasks, such as waiting for something from the network or disk.
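To illustrate the I/O-bound case, here is a rough sketch that uses time.sleep as a stand-in for a network or disk wait (sleep releases the GIL while it blocks). The names fake_io and timed_run are hypothetical and only exist for this example.

```python
import time
from concurrent import futures


def fake_io(delay):
    # time.sleep releases the GIL, standing in for a network or disk wait.
    time.sleep(delay)
    return delay


def timed_run(tasks=5, delay=0.2):
    start = time.perf_counter()
    with futures.ThreadPoolExecutor(max_workers=tasks) as executor:
        # One worker per task, so all the waits can overlap.
        results = list(executor.map(fake_io, [delay] * tasks))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = timed_run()
    print(f"{len(results)} waits of 0.2 s took {elapsed:.2f} s in total")
```

Run one after another, five 0.2 s waits would take about 1 s. With threads the waits overlap, so the total should be much closer to a single wait. The exact figure depends on the machine.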