Is access to ruby Array thread-safe?

Time:12-07

Say that I have N threads accessing an array with N elements. The array has been prepared before the threads start and is not resized afterwards. Each thread accesses a different element (thread i accesses element i, both for reading and writing).
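A minimal sketch of this access pattern, with illustrative names (n, slots) that are assumptions and not taken from any real code base:

```ruby
# Hypothetical sketch: N threads, each touching only its own slot.
n = 8
slots = Array.new(n, 0) # prepared before any thread starts, never resized afterwards

threads = n.times.map do |i|
  Thread.new(i) do |idx|
    # Thread idx reads and writes only slots[idx].
    100.times { slots[idx] += 1 }
  end
end
threads.each(&:join)

p slots
```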

In theory, I'd expect such an access pattern not to cause any race conditions, but will Ruby actually guarantee thread safety in this case?

CodePudding user response:

"but will Ruby actually guarantee thread safety in this case?"

Ruby does not have a defined memory model, so there are no guarantees of any kind.

YARV has a Giant VM Lock which prevents multiple Ruby threads from running at the same time, which gives some implicit guarantees, but this is a private, internal implementation detail of YARV. For example, TruffleRuby, JRuby, and Rubinius can run multiple Ruby threads in parallel.
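Because the behavior depends on the implementation, it can help to check which one a program is running on. A small sketch using the built-in RUBY_ENGINE constant; the mapping to "parallel threads or not" simply restates the paragraph above and is an assumption, not something Ruby reports itself:

```ruby
# RUBY_ENGINE is a built-in constant naming the running implementation:
# "ruby" for YARV (CRuby), "jruby" for JRuby, "truffleruby" for TruffleRuby.
engine = RUBY_ENGINE

# Assumption taken from the text above, not queried from the VM.
parallel_threads =
  case engine
  when "ruby" then false                 # YARV: the VM lock runs one Ruby thread at a time
  when "jruby", "truffleruby" then true  # Ruby threads can run in parallel
  else nil                               # unknown engine: make no assumption
  end

puts "engine=#{engine} version=#{RUBY_VERSION} parallel_threads=#{parallel_threads.inspect}"
```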

Since there is no specification of what the behavior should be, any Ruby implementation is free to do whatever it wants. Most commonly, Ruby implementors try to mimic the behavior of YARV, but even that is not well-defined. In YARV, data structures are generally not thread-safe, so to mimic YARV, do you make all your data structures not thread-safe? But in YARV, multiple threads also cannot run at the same time, which makes many operations implicitly thread-safe, so to mimic YARV, should you make your data structures thread-safe?

Or, in order to mimic YARV, should you prevent multiple threads from running at the same time? But being able to run multiple threads in parallel is actually one of the reasons why people choose, for example, JRuby over YARV.

As you can see, this is very much not a trivial question.

The best solution is to verify the behavior of each Ruby implementation separately. Actually, that is the second best solution.

The best solution is to use something like the concurrent-ruby gem, where someone else has already done the work of verifying the behavior of each Ruby implementation for you. The concurrent-ruby maintainers have a close relationship with several Ruby implementations (for example, Chris Seaton, one of the two lead maintainers of concurrent-ruby, is also the lead developer of TruffleRuby, a JRuby core developer, and a member of ruby-core), so you can generally be certain that everything in concurrent-ruby is safe on all supported Ruby implementations (currently YARV, JRuby, and TruffleRuby).

Concurrent Ruby has a Concurrent::Array class which is thread-safe. You can see how it is implemented here: https://github.com/ruby-concurrency/concurrent-ruby/blob/master/lib/concurrent-ruby/concurrent/array.rb As you can see, for YARV, Concurrent::Array is actually the same as ::Array, but for other implementations, more work is required.

The concurrent-ruby developers are also working on specifying Ruby's memory model, so that in the future, both programmers know what to expect and what not to expect, and implementors know what they are allowed to optimize and what they aren't.

CodePudding user response:

Alternatives to Mutable Arrays

In standard Ruby implementations, an Array is not thread-safe. However, a Queue is. On the other hand, a Queue is not quite an Array, so you don't have all the methods on Queue that you may be looking for.
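As a rough illustration of using a Queue in place of a shared Array, here is a minimal sketch in which several threads push values and the main thread drains them. The thread count and the values are arbitrary assumptions:

```ruby
queue = Queue.new # Thread::Queue, part of core Ruby

producers = 4.times.map do |i|
  Thread.new(i) do |idx|
    # Pushing is safe from any thread without extra locking.
    5.times { |k| queue << (idx * 5 + k) }
  end
end
producers.each(&:join)

queue.close # no more pushes; pop returns nil once the queue is drained

results = []
while (item = queue.pop)
  results << item
end

p results.sort
```

The arrival order depends on thread scheduling, so the sketch sorts the results before printing them.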

The Concurrent Ruby gem provides a thread-safe Array class, but as a rule thread-safe classes will be slower than those that aren't. Depending on your data this may not matter, but it's certainly a design consideration.
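To show where that overhead can come from, here is a minimal stdlib-only sketch of a lock-guarded array. LockedArray is a hypothetical class written for this illustration; it is not how Concurrent::Array is implemented, and it only wraps a handful of methods:

```ruby
# Hypothetical wrapper: every operation takes a Mutex, which is the extra cost.
class LockedArray
  def initialize(size, default = nil)
    @items = Array.new(size, default)
    @lock  = Mutex.new
  end

  def [](index)
    @lock.synchronize { @items[index] }
  end

  def []=(index, value)
    @lock.synchronize { @items[index] = value }
  end

  # Read-modify-write as one step under the lock.
  # Calling [] and then []= separately would not be atomic.
  def update(index)
    @lock.synchronize { @items[index] = yield(@items[index]) }
  end

  # Returns a copy so callers cannot mutate the internal array without the lock.
  def to_a
    @lock.synchronize { @items.dup }
  end
end

counters = LockedArray.new(1, 0)
threads = 4.times.map do
  Thread.new { 1_000.times { counters.update(0) { |v| v + 1 } } }
end
threads.each(&:join)

p counters.to_a
```

Here all four threads deliberately share one element, which is the case where a lock is needed; every access pays for acquiring the Mutex, even when no other thread is competing for it.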

If you know from the beginning that you're going to be heavily reliant on threading, you should build your application on a Ruby implementation that offers parallel threads to begin with (e.g. consider JRuby or TruffleRuby), and design your application to take advantage of Ractors (where the implementation provides them; they were introduced in CRuby 3.0) or other concurrency models that treat data as immutable rather than sharing objects between threads.

Immutable data is a better pattern for threading than shared objects. With enough care you may or may not run into problems with any given mutable object, but Ractors and fiber-local variables should be faster and safer than trying to make mutable objects thread-safe. YMMV, though.
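A minimal sketch of the immutable-data idea, deliberately using plain threads rather than Ractors so that it runs on any implementation. The input values are arbitrary assumptions; each thread receives a value, returns a new one, and nothing shared is mutated:

```ruby
inputs = [1, 2, 3, 4].freeze # frozen: any attempt to mutate it raises FrozenError

threads = inputs.map do |x|
  # Each thread works on its own argument and returns a fresh result.
  Thread.new(x) { |value| value * value }
end

# Thread#value waits for the thread to finish and returns the block's result.
squares = threads.map(&:value)

p squares
```

Collecting results through Thread#value keeps the threads from writing into a shared collection at all.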
