Understanding multi-threading and locks in Python (concept and example)

Time:05-28

I did some research on multi-threading for a programming project that uses it (first-timer here...). I would appreciate it if you could confirm the statements below or comment on the ones that are wrong or need correction.

  1. A lock is an object which can be passed to functions, methods, ... by reference. A function (in this example) can then use that lock object reference to safely operate on data (a variable in this example). It does this by acquiring the lock, modifying the variable and then releasing the lock.
  2. A thread can be created to target a function, which may obtain a reference to a lock (to then achieve what is stated above).
  3. A lock does not protect a specific variable, object etc.
  4. A lock does not protect or do anything unless it is acquired (and released).
  5. Thus, it is the responsibility of the programmer to use the lock in order to achieve the desired protection.
  6. If a lock is acquired inside a function executed by thread A, this has no immediate influence on any other running thread B. Not even if the functions targeted by threads A and B have a reference to the same lock object.
  7. Only if the function targeted by thread B tries to acquire the same lock (i.e. via the same referenced lock object) while the function targeted by thread A is holding it does the lock affect both threads: thread B pauses further execution until the function targeted by thread A releases the lock again.
  8. Thus, a locked lock only ever pauses execution of a thread, if its targeted function wants (and waits) to acquire the very same lock itself. Thus, by thread A acquiring the lock, it can only prevent thread B from acquiring the same lock, nothing more, nothing less.
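
Statements 6 to 8 can be checked with a small experiment. The following is only a sketch: the function names `holder`, `bystander` and `contender` and the two `threading.Event` objects are made up for illustration, and the events exist solely to make the order of execution deterministic.

```python
import threading

lock = threading.Lock()
a_holds_lock = threading.Event()  # set once thread A has acquired the lock
release_a = threading.Event()     # tells thread A that it may release the lock
events = []                       # log of what happened, in order


def holder():
    """Thread A: acquires the lock and holds it until told to release."""
    with lock:
        a_holds_lock.set()
        release_a.wait()
        events.append("A releases the lock")


def bystander():
    """Thread B: can see the same lock object but never acquires it."""
    a_holds_lock.wait()
    events.append("B ran while A held the lock")
    release_a.set()


def contender():
    """Thread C: tries to acquire the same lock, so it has to wait for A."""
    a_holds_lock.wait()
    with lock:
        events.append("C acquired the lock")


threads = [threading.Thread(target=f) for f in (holder, contender, bystander)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(events)
```

Thread B is not slowed down at all by the held lock, while thread C cannot log its line until A has left its `with` block. That matches statement 8: holding a lock only blocks threads that try to acquire the same lock.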

If I want to use a lock to prevent race conditions when setting a variable, I (as the programmer) need to:

  1. pass a lock to all functions targeted by threads that will want to set the variable and
  2. acquire the lock in every function and every time before I set the variable (and release it afterwards). (*)
  3. If I create even only one thread targeting a function without providing it a reference to the lock object and let it set the variable or
  4. if I set the variable via a thread whose targeted function has the lock object, but doesn't acquire it prior to the operation, I will have failed to implement thread-safe setting of the variable.

(*) The lock should be acquired as long as the variable must not be accessed by other threads. Right now, I like to compare that to a database transaction... I lock the database (~ acquire a lock) until my set of instructions is completed, then I commit (~ release the lock).
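
The transaction comparison can be sketched as a read-modify-write sequence that holds the lock from the first read until the final write. The names `balance`, `balance_lock` and `deposit` are hypothetical and only serve the illustration.

```python
import threading

balance = 0
balance_lock = threading.Lock()


def deposit(amount: int, times: int) -> None:
    """Add `amount` to the shared balance `times` times."""
    global balance
    for _ in range(times):
        # Hold the lock for the whole read-modify-write sequence,
        # similar to keeping a transaction open until the commit.
        with balance_lock:
            current = balance           # read
            balance = current + amount  # modify and write


threads = [threading.Thread(target=deposit, args=(1, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # 40000
```

If the lock were released between the read and the write, another thread could update `balance` in between and that update would be overwritten.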


Example: If I wanted to create a class whose member _value should be set in a thread-safe fashion, I would implement one of these two versions:

    import threading


    class Version1:
        def __init__(self):
            self._value:int = 0
            self._lock:threading.Lock = threading.Lock()
        
        def getValue(self) -> int:
            """Getting won't be protected in this example."""
            return self._value
        
        def setValue(self, val:int) -> None:
            """This will be made thread-safe by member lock."""
            with self._lock:
                self._value = val
            
    v1 = Version1()
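    # args must be a tuple, hence the trailing comma; start() returns None,
    # so the Thread object is stored before it is started.
    t1_1 = threading.Thread(target=v1.setValue, args=(1,))
    t1_2 = threading.Thread(target=v1.setValue, args=(2,))
    t1_1.start()
    t1_2.start()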
    
    
    class Version2:
        def __init__(self):
            self._value:int = 0
        
        def getValue(self) -> int:
            """Getting won't be protected in this example."""
            return self._value
        
        def setValue(self, val:int, lock:threading.Lock) -> None:
            """This will be made thread-safe by injected lock."""
            with lock:
                self._value = val
            
    v2 = Version2()
    l = threading.Lock()
    # start() returns None, so the Thread object is stored before it is started.
    t2_1 = threading.Thread(target=v2.setValue, args=(1, l))
    t2_2 = threading.Thread(target=v2.setValue, args=(2, l))
    t2_1.start()
    t2_2.start()

  1. In Version1, I, as the class provider, can guarantee that setting _value is always thread-safe...
  2. ...because in Version2, the user of my class might pass two different lock objects to the two spawned threads and thus render the lock protection useless.
  3. If I want to give the user of my class the freedom to include the setting of _value in a larger collection of steps that should be executed in a thread-safe manner, I could inject a Lock reference into Version1's __init__ function and assign that to the _lock member. Thus, the thread-safe operation of the class would be guaranteed while still allowing the user of the class to use "her own" lock for that purpose.

A score from 0-15 will now rate how well I have (mis)understood locks... :-D

CodePudding user response:

  1. It's also quite common to use global variables for locks. It depends on what the lock is protecting.
  2. True, although somewhat meaningless. Any function can use a lock, not just the function that's the target of a thread.
  3. If you mean there's no direct link between a lock and the data it protects, that's true. But you can define a data structure that contains a value that needs protecting and a reference to its lock.
  4. True. Although as I say in 3, you can define a data structure that packages the data and lock. You could make this a class and have the class methods automatically acquire the lock as needed.
  5. Correct. But see 4 for how you can automate this.
  6. Correct.
  7. Correct.
  8. Correct.
  9. Correct if it's not a global lock.
  10. Partially correct. You should also often acquire the lock if you're merely reading the variable. If reading the object is not atomic (e.g. it's a list and you're reading multiple elements, or you read the same scalar variable multiple times and expect it to be stable), you need to prevent another thread from modifying it while you're reading.
  11. Correct.
  12. Correct.
  13. Correct. This is an example of what I described above in 3 and 4.
  14. Correct. Which is why the design in 13 is often better.
  15. This is tricky, because the granularity of the locking needs to reflect all the objects that need to be protected. Your class only protects the assignment of that one variable -- it will release the lock before all the other steps associated with the caller-provided lock have been completed.
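
One possible way to address points 10 and 15 together is sketched below. `Version3`, `shared_lock` and `add_ten` are hypothetical names, not part of the question's code. The sketch assumes the injected lock is a `threading.RLock`: a plain `threading.Lock` is not reentrant, so a caller that already holds it would block forever when `setValue` tries to acquire it again.

```python
import threading


class Version3:
    """Hypothetical variant of Version1 that accepts a caller-provided lock."""

    def __init__(self, lock=None):
        self._value = 0
        # RLock is reentrant: the thread that already holds it may acquire it again.
        self._lock = lock if lock is not None else threading.RLock()

    def getValue(self) -> int:
        """Reading is protected as well (point 10)."""
        with self._lock:
            return self._value

    def setValue(self, val: int) -> None:
        with self._lock:
            self._value = val


shared_lock = threading.RLock()
v3 = Version3(shared_lock)


def add_ten() -> None:
    # The caller holds the lock across the whole multi-step operation (point 15),
    # so no other thread can change the value between the read and the write.
    with shared_lock:
        v3.setValue(v3.getValue() + 10)


threads = [threading.Thread(target=add_ten) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(v3.getValue())  # 500
```

The class still only guards its own single read or write. The atomicity of the larger sequence comes from the caller acquiring `shared_lock` around it, which remains the caller's responsibility.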