I have an assignment where I'm given 2 numbers (a and b) and a count (n). I first need to generate random numbers that will be the time each thread sleeps for. Then four threads calculate the sum, difference, product and division (respectively) of a and b, and the whole sequence repeats n times. For example:
a = 10
b = 2
n = 2
SUM: 12
DIFFERENCE: 8
PRODUCT: 20
DIVISION: 5
SUM: 12
DIFFERENCE: 8
PRODUCT: 20
DIVISION: 5
and in between each line, the program sleeps for some seconds. The order must be sum, difference, product and division. I can't use queues or kill/spawn threads repeatedly. My first thought was to use 4 condition variables to dictate the order in which the threads run. I came up with this:
mutex = Mutex.new
resources = Array.new(4) { ConditionVariable.new }

t1 = Thread.new do
  mutex.synchronize do
    puts "Thread 1"
    sleep(3)
    resources[0].signal
  end
end

t2 = Thread.new do
  mutex.synchronize do
    resources[0].wait(mutex)
    puts "Thread 2"
    sleep(3)
    resources[1].signal
  end
end

t3 = Thread.new do
  mutex.synchronize do
    resources[1].wait(mutex)
    puts "Thread 3"
    sleep(3)
    resources[2].signal
  end
end

t4 = Thread.new do
  mutex.synchronize do
    resources[2].wait(mutex)
    puts "Thread 4"
    sleep(3)
  end
end

t1.join
t2.join
t3.join
t4.join
but I'm getting this deadlock error: main.rb:39:in 'join': No live threads left. Deadlock? (fatal)
Could this approach work? What should I do to fix it? Are there any other better approaches that could work?
CodePudding user response:
I'm not a Ruby expert, but in every other language I have used, the name "condition variable" is a misnomer. For anything else that's called a "variable," we expect that if one thread changes it, some other thread can come along later and see that it was changed. That is not how condition variables work.
When thread A "notifies/signals" a condition variable, it will "wake up" some other thread that was already waiting, but if no other thread happened to be waiting at that moment, then the signal/notification does absolutely nothing at all.
Condition variables do not remember notifications.
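A minimal sketch of that behaviour in Ruby, for illustration only. The helper name signal_then_wait is hypothetical, and the timeout passed to wait is there only so the demonstration terminates instead of hanging. The signal is sent while nobody is waiting, so the later wait is not satisfied by it and runs until the timeout expires.

```ruby
# Hypothetical helper: signal first, wait afterwards, and report how long
# the wait lasted. Nothing here is taken from the question's code.
def signal_then_wait(timeout_seconds)
  mutex = Mutex.new
  cond = ConditionVariable.new

  # No thread is waiting yet, so this signal wakes nobody and is not stored.
  cond.signal

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # The earlier signal cannot satisfy this wait; the timeout is only here
  # so that the demonstration does not block forever.
  mutex.synchronize { cond.wait(mutex, timeout_seconds) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = signal_then_wait(0.2)
puts format("waited %.2f s for a signal that had already been sent", elapsed)
```

Without the timeout argument, the same wait would never return, which is the situation t2, t3 and t4 end up in.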
Here's what I think could happen:
The t1 thread locks the mutex, and then sleeps.
The other three threads all start up, and all get blocked while awaiting the mutex.
The t1 thread returns from sleep(3), and it signals the condition variable. But condition variables do not remember notifications. None of the other threads has been able to get to its wait(mutex) call, because they're all still trying to get past mutex.synchronize. The notification is lost.
The t1 thread leaves the synchronized block, and the other threads get into their synchronized blocks, one by one, until all of them are awaiting signals.
Meanwhile, the main thread has been hanging in t1.join(). That call returns when the t1 thread ends, but then the main thread calls t2.join(). Now t2 is awaiting a signal, t3 is awaiting a signal, t4 is awaiting a signal, and the main thread is waiting for t2 to die.
No more live threads.
Again, I'm not a Ruby expert, but in every other language, a thread that uses a condition variable to await some "condition" must do something like this:
# The mutex prevents other threads from modifying the "condition"
# (i.e., prevents them from modifying the `sharedData`.)
mutex.lock()
while ( sharedData.doesNotSatisfyTheCondition() ) {
    # The `wait()` call _temporarily_ unlocks the mutex so that other
    # threads may make the condition become true, but it's _guaranteed_
    # to re-lock the mutex before it returns.
    conditionVar.wait(mutex)
}
# At this point, the condition is _guaranteed_ to be true.
sharedData.doSomethingThatRequiresTheConditionToBeTrue()
mutex.unlock()
The most important thing going on here is that the caller does not wait if the condition is already true. If the condition is already true, then the notification has probably already happened. We missed it, and if we wait for it now, we may end up waiting forever.
The other important thing is that, after we have awaited and received a notification, we check the condition again. Depending on the rules of the programming language, on the operating system, and on the architecture of the program, it may be possible for wait() to return prematurely (a so-called spurious wakeup).
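In Ruby, the waiting side of that pattern might look like the sketch below. The helper name wait_for_flag and the ready flag are hypothetical and exist only for illustration. The flag is deliberately set and signaled before the waiter starts, to show that a waiter which checks the condition first does not depend on catching the notification.

```ruby
# Hypothetical helper: the condition is made true BEFORE the waiter runs.
def wait_for_flag
  mutex = Mutex.new
  cond = ConditionVariable.new
  ready = false # the shared "condition", only touched while holding mutex

  # The notifying side runs first; its signal reaches no one.
  mutex.synchronize do
    ready = true
    cond.signal
  end

  waiter = Thread.new do
    mutex.synchronize do
      # `until` tests the flag before every wait, so the waiter never
      # blocks when the condition already holds, and it re-checks the
      # flag after any wakeup.
      cond.wait(mutex) until ready
      :saw_ready
    end
  end

  waiter.value # joins the thread and returns the block's result
end

puts wait_for_flag
```

Replacing the `until` loop with a bare cond.wait(mutex) would bring back the lost-notification problem, because the signal was sent before the waiter existed.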
Making the condition become true is simple:
mutex.lock()
sharedData.doSomethingThatMakesTheConditionTrue()
conditionVar.notify()
mutex.unlock()
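Applied to the assignment in the question, one possible shape is a single shared turn counter guarded by the mutex, with each thread waiting until the counter equals its own index. This is a sketch under assumptions, not a definitive solution: the method name run_in_turns, the turn variable and the delays argument (standing in for the random sleep times) are all hypothetical. It uses one condition variable with broadcast, since a plain signal could wake a thread whose turn it is not.

```ruby
# Hypothetical helper: four long-lived threads take turns n times each.
def run_in_turns(a, b, n, delays)
  operations = [
    ["SUM",        ->(x, y) { x + y }],
    ["DIFFERENCE", ->(x, y) { x - y }],
    ["PRODUCT",    ->(x, y) { x * y }],
    ["DIVISION",   ->(x, y) { x / y }]
  ]
  mutex = Mutex.new
  turn_changed = ConditionVariable.new
  turn = 0    # index of the thread allowed to run; guarded by mutex
  output = []

  threads = operations.each_with_index.map do |(label, op), index|
    Thread.new do
      n.times do
        mutex.synchronize do
          # Check the condition first and re-check it after every wakeup.
          turn_changed.wait(mutex) until turn == index
          sleep(delays[index])
          output << "#{label}: #{op.call(a, b)}"
          # Make the next thread's condition true, then wake all waiters;
          # only the one whose index matches will leave its loop.
          turn = (turn + 1) % operations.size
          turn_changed.broadcast
        end
      end
    end
  end

  threads.each(&:join)
  output
end

puts run_in_turns(10, 2, 2, Array.new(4) { rand * 0.2 })
```

Each thread is created once and loops n times, so nothing is killed or respawned, and the order does not depend on which thread happens to grab the mutex first.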