Ruby work distribution fails if threads are generated too fast

I ran into a problem the other day and spent two hours looking for the answer in the wrong place.

In the process I stripped the code down to the version below. The threading works as long as I keep the sleep(0.1) in the loop that creates the threads.

If the line is omitted, all threads are created - but only thread 7 will actually consume data from the queue.

With this "hack" I do have a working solution but not one I'm happy with. I'm really curious why this happens.

I am using a fairly old version of Ruby under Windows (2.4.1p111). However, I was able to reproduce the same behavior with a new Ruby 3.0.2p107 installation.

#!/usr/bin/env ruby

@q = Queue.new
      
# Get all projects (would be a list of directories)
projects = [*0..100]
projects.each do |project|
  @q.push project
end

def worker(num)
  while not @q.empty?
    puts "Thread: #{num} Project: #{@q.pop}"
    sleep(0.5)
  end
end 


threads=[]
for i in 1..7 do
  threads << Thread.new { worker(i) }
  sleep(0.1) # Threading does not work without this line - but why?
end

threads.each {|thread| puts thread.join }

puts "done"

Answer:

Fun bug! This is a race condition.

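It's not that only thread 7 is doing the work. All threads reference the same variable i (a for loop does not create a new scope, so there is only one copy), and each block only reads i when its thread actually starts running. Since 7 is written last, presumably before any of the threads have started, they all read i == 7. The sleep(0.1) hides the problem by giving each thread time to start and read i before the loop moves on.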
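The shared variable can be seen without any threads at all. This is a minimal sketch using plain procs in place of threads (the names for_procs, each_procs, for_values and each_values are made up for the illustration): blocks created in a for loop all see one i, while blocks created inside each get a fresh block parameter per iteration.

```ruby
# Closures created in a for loop share the single loop variable i.
for_procs = []
for i in 1..3 do
  for_procs << proc { i }   # i is read later, when the proc is called
end
for_values = for_procs.map(&:call)
p for_values                # → [3, 3, 3]

# With each, j is a block parameter, so every iteration has its own j.
each_procs = []
(1..3).each do |j|
  each_procs << proc { j }
end
each_values = each_procs.map(&:call)
p each_values               # → [1, 2, 3]
```

A thread block behaves like the procs in the first loop: it reads i whenever the thread gets to run, not when Thread.new is called.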

Try this worker function and see if it clears things up:

def worker(num)
  my_thread_id = Thread.current.object_id

  while not @q.empty?
    puts "Thread: #{num} NumObjId: #{num.object_id} ThreadId: #{my_thread_id} Project: #{@q.pop}"
    sleep(0.5)
  end
end

Notice that NumObjId is the same in all threads: they are all pointing to the same number. But the actual ThreadId we get is different for each thread.
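One possible way to fix this without the sleep is to hand i to Thread.new as an argument, so that it is evaluated in the main thread and each block receives its own copy. The following is only a sketch under a few assumptions: the queue is shortened to 21 projects, next_project is a hypothetical helper that is not in the original code, and it swaps the empty? check for a non-blocking pop so that a thread cannot block on a queue that another thread has just emptied.

```ruby
queue = Queue.new
(0..20).each { |project| queue.push(project) }

# Hypothetical helper: returns the next project, or nil once the queue is empty.
def next_project(queue)
  queue.pop(true)            # non-blocking pop raises ThreadError when empty
rescue ThreadError
  nil
end

handled = Queue.new          # thread-safe record of [worker number, project]

threads = []
for i in 1..7 do
  # i is evaluated here, in the main thread, and passed to the block as num.
  threads << Thread.new(i) do |num|
    while (project = next_project(queue))
      handled << [num, project]
      sleep(0.01)
    end
    num                      # the thread's value: the number it was given
  end
end

worker_numbers = threads.map(&:value)   # value waits for each thread to finish
p worker_numbers                        # → [1, 2, 3, 4, 5, 6, 7]
```

Replacing the for loop with (1..7).each do |i| should work as well, since i is then a block parameter and each iteration gets its own.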