How to solve race condition when installing ruby gems and running scripts that use these gems?

Time:11-03

We found an interesting problem. Our environment is configured using Ansible, which in turn installs gems.

For some of the gems, we want a version at or above a given minimum. For example, aws-sdk-core version >= 3.104.

The Ansible task runs:

gem install -v '>= 3.104' aws-sdk-core

Then we have a cron job that, every 5 minutes (across a couple of thousand servers), runs a script that does require 'aws-sdk-core'.

And, every so often, it breaks with:

/var/lib/gems/2.5.0/gems/aws-sdk-core-3.166.0/lib/seahorse.rb:3:in `require_relative': cannot load such file -- /var/lib/gems/2.5.0/gems/aws-sdk-core-3.166.0/lib/seahorse/util (LoadError)
...

I made a trivial script that shows the problem with another, much smaller gem:

#!/usr/bin/env ruby
# frozen_string_literal: true
require 'progressbar'
puts 1

If you save it as z.rb, then run in one shell: while true; do ./z.rb; done, and in another shell: while true; do gem install -v '>= 1.0.0' progressbar; done, you will eventually (after a minute or two) get this in the shell that runs z.rb:

1
1
<internal:/usr/lib/ruby/vendor_ruby/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- progressbar (LoadError)
        from <internal:/usr/lib/ruby/vendor_ruby/rubygems/core_ext/kernel_require.rb>:85:in `require'
        from ./z.rb:3:in `<main>'
1
1
1

Is there any way to avoid this problem, other than begin/rescue and retry after 1 second sleep (which I can do, but it's OH SO UGLY)?
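
For reference, the retry workaround mentioned above could look roughly like the sketch below. The helper name require_with_retry and its attempts/delay parameters are made up for illustration. Calling Gem::Specification.reset between attempts is an assumption about what helps: it drops the RubyGems specification cache so that the next attempt re-reads the gem directory.

```ruby
# Hypothetical helper: retries `require` a few times before giving up.
def require_with_retry(feature, attempts: 3, delay: 1)
  tries = 0
  begin
    tries += 1
    require feature
    true
  rescue LoadError
    # Give up and re-raise the original error once the attempts are used up.
    raise if tries >= attempts
    sleep delay
    # Drop cached gem specifications so the next attempt looks at the disk again.
    Gem::Specification.reset
    retry
  end
end

# Usage in the cron script (gem name taken from the question):
# require_with_retry('aws-sdk-core')
```

This only narrows the window. If the install takes longer than attempts * delay seconds, the script still fails.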

The problem for us is that we need to install at least some specific version (if we provided version = SOMETHING, Ansible would skip calling gem install altogether, but we want new releases installed too). While the window for the race condition is small, with many thousands of servers and a cron job that runs every 5 minutes (Ansible runs every 4 hours), we get about a dozen mails per day about cron job failures.

CodePudding user response:

Not an answer maybe, but still... :) I think there is no better way than to synchronize the tasks somehow.

The RubyGems installer seems to remove the existing gem files before installing the new ones (if the version being installed already exists, I guess).

This is easy to confirm. For example, with two pry versions installed (0.14.0 and 0.14.1), I do the following:

  1. Run while true; do gem install --no-document -f -v '0.14.1' pry; done
  2. In another shell run while true; do gem which pry; done

The result of 2 looks like:

...
/path/to/ruby/lib/ruby/gems/2.7.0/gems/pry-0.14.1/lib/pry.rb
/path/to/ruby/lib/ruby/gems/2.7.0/gems/pry-0.14.1/lib/pry.rb
/path/to/ruby/lib/ruby/gems/2.7.0/gems/pry-0.14.0/lib/pry.rb <=== Notice this one
/path/to/ruby/lib/ruby/gems/2.7.0/gems/pry-0.14.1/lib/pry.rb
/path/to/ruby/lib/ruby/gems/2.7.0/gems/pry-0.14.1/lib/pry.rb
...
Traceback (most recent call last):
        9: from /path/to/ruby/bin/gem:9:in `<main>'
        8: from /path/to/ruby/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:92:in `require'
        7: from /path/to/ruby/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:92:in `require'
        6: from /path/to/ruby/lib/ruby/2.7.0/rubygems/gem_runner.rb:86:in `<top (required)>'
        5: from /path/to/ruby/lib/ruby/2.7.0/rubygems.rb:1131:in `load_plugins'
        4: from /path/to/ruby/lib/ruby/2.7.0/rubygems.rb:538:in `find_latest_files'
        3: from /path/to/ruby/lib/ruby/2.7.0/rubygems/specification.rb:1086:in `latest_specs'
        2: from /path/to/ruby/lib/ruby/2.7.0/rubygems/specification.rb:1093:in `_latest_specs'
        1: from /path/to/ruby/lib/ruby/2.7.0/rubygems/specification.rb:1093:in `reverse_each'
/path/to/ruby/lib/ruby/2.7.0/rubygems/specification.rb:1096:in `block in _latest_specs': undefined method `platform' for nil:NilClass (NoMethodError)
...
/path/to/ruby/lib/ruby/gems/2.7.0/gems/pry-0.14.1/lib/pry.rb
/path/to/ruby/lib/ruby/gems/2.7.0/gems/pry-0.14.1/lib/pry.rb
...
ERROR:  While executing gem ... (Errno::ENOENT)
    No such file or directory @ rb_sysopen - /path/to/ruby/lib/ruby/gems/2.7.0/specifications/pry-0.14.1.gemspec
...

So, pretty much as one would expect, there are all sorts of surprises: the gem resolving to the previous version, and several exceptions raised because we caught the gem installation in an intermediate, incomplete state.

Besides the retrying that you mentioned, one could also do something like:

  • maintain some external lock that is acquired when gem installation starts and released when it finishes; the cron jobs check it and shut down gracefully if an installation is in progress (seems unnecessarily complicated for this case, but still)
  • check the gem state from the cron job using Gem API, and shut down if the required gem doesn't exist
  • rescue the LoadError and shut down gracefully
  • etc etc etc.
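
The external-lock idea from the first bullet could be sketched with File#flock, as below. This is only an illustration under assumptions: the lock file path and both helper names are hypothetical, and the installer side has to take the same lock for it to mean anything. The cron script takes a shared, non-blocking lock and skips the run if an install holds the exclusive one.

```ruby
require 'tmpdir'

# Hypothetical lock file; the installer and the cron script must agree on the path.
GEM_LOCK_PATH = File.join(Dir.tmpdir, 'gem-install.lock')

# Cron side: runs the block under a shared lock.
# Returns false without running the block if an install holds the exclusive lock.
def with_shared_gem_lock(path = GEM_LOCK_PATH)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |lock|
    # LOCK_NB makes flock return false instead of waiting.
    return false unless lock.flock(File::LOCK_SH | File::LOCK_NB)

    yield
    true
  end
end

# Installer side: waits until all readers are done, then runs the block exclusively.
def with_exclusive_gem_lock(path = GEM_LOCK_PATH)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |lock|
    lock.flock(File::LOCK_EX)
    yield
  end
end

# Usage in the cron script:
# exit 0 unless with_shared_gem_lock { require 'aws-sdk-core'; do_the_work }
```

Several cron runs can hold the shared lock at once, so they do not block each other. On the Ansible side, the gem install command would need to run under the same exclusive lock, for example by wrapping it in the flock(1) utility pointed at the same file (again an assumption about the setup, not something tested here).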

CodePudding user response:

While talking with others, we figured out a solution. It seems that adding these two options to gem install solves the problem: --conservative --minimal-deps.
