What could be causing ruby NoMethodError backtrace to be so slow?-CodePudding

I have a pretty large ruby (non-rails) application that I'm developing. It's reasonably fast considering how large and complex it is (go ruby!), but sometimes I fat finger a method name and get the NoMethodError.

And usually when this happens, the application hangs for like 20 to 30 seconds to just print out the backtrace.

Specifically, if I do something like this:

puts "about to crash!"
Array.new().inspekt    # NoMethodError here

I see the "about to crash!" right away, and then 20s or so nothing seems to happen before I finally get the NoMethodError and backtrace.

At first I thought it might be the "did you mean" gem, so I turned that off with --disable-did_you_mean on the command line, and that turned off the "did you mean" suggestions, but nothing sped up the backtrace.

What's interesting is that this is only for NoMethodError.

If I cause some other exception, such as:

puts "about to crash!"
a = 3/0

Then I see the backtrace immediately.

And to make things even weirder, if I interrupt the process right after the "about to crash!" (such as with a ctrl-c on unix) then I immediately get the NoMethodError and it's backtrace. So it has the information - but ruby is stuck on trying to clean something up perhaps, something that only gets cleaned up on NoMethodError?

Info: ruby 2.7.0

OS: CentOS Linux release 7.5.1804

UPDATE - to responses so far: Everyone seems to be concerned about the backtrace and profiling the ruby code.

Except the slowdown is NOT happening there. There are NO LINES OF RUBY CODE that are executed during the slowdown. All of the lines prior to this, "in the backtrace" are already executed and in a matter of a second or so. Then the system hangs, between the puts and the NoMethodError. There is no ruby code in between to profile, so any profiler that is looking at code written in my ruby script isn't going to help. The slowdown is something internal to ruby and is not in my code, unless I'm terribly confused about what's happening.

To be very clear:

Line 10042:    puts "HERE"                  # Happens at ~1s
Line 10043:    Array.new().inspekt          # Happens at ~20-30s

There is no code between those lines to profile. The 20-30s is not happening in any code before line 10042 executes, so profiling that will not help.

I do have other Fibers that are paused. But there is no code here that yields to them. Is it possible that there's some strange built-in yield code that attempts to run other (paused) fibers when an exception is hit? I can't think of a reason you'd ever want this behavior, and many reasons why it would be catastrophic, but I can't think of anything else that would cause this problem (that is also killable with a ctrl-c!)

CodePudding user response：

I would try to debug the full backtrace in there to see what is actually happening

begin
  puts "about to crash!"
  Array.new().inspekt
rescue => e
  puts e.backtrace
  raise # raise anyway
end

In my case I get 20 lines of backtrace with ruby 2.6.3 and irb, if that doesn't really tell you anything interesting I would then do the tedious work of measuring each runtime by modifying each file of the backtrace and printing the times at each step, debugging yay!

CodePudding user response：

Hard to give a direct answer. You have to run some profiling on the entire process. These are obviously simplistic examples, I sure hope that's all you need here.

rbspy

If you have a slow running process it is easy to identify

# somewhere I have an `inspekt` method with `sleep 5`
#   def inspekt
#     sleep 5
#     raise NoMethodError
#   end

$ rbspy record -- bin/rails runner "User.all.inspekt"

Time since start: 7s. Press Ctrl C to stop.
Summary of profiling data so far:
% self  % total  name
 77.27    77.27  sleep [c function] - (unknown)            # <= looks slow
  3.31    98.42  require [c function] - (unknown)
  2.45     2.45  load_from_binary [c function] - (unknown)
  1.73     7.05  require_relative [c function] - (unknown)
  1.73     3.74  class_eval [c function] - (unknown)
  1.44     1.44  pop [c function] - (unknown)
  1.01     1.29  module_eval [c function] - (unknown)
  0.58     3.17  fetch [c function] - (unknown)
  0.43    77.84  eval [c function] - (unknown)
  0.43     0.58  wait_next_action_ - /home/alex/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/debug-1.5.0/lib/debug/thread_client.rb
  0.29     0.43  event! - /home/alex/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/debug-1.5.0/lib/debug/thread_client.rb
  0.29     0.29  push [c function] - (unknown)
  0.29     0.29  hash [c function] - (unknown)
  0.29     0.29  enable - <internal:trace_point>
  0.29     0.29  attr_reader [c function] - (unknown)
  0.29     0.29  alias_method [c function] - (unknown)
  0.14   NoMethodError

          raise NoMethodError
          ^^^^^
 16.55  each [c function] - (unknown)
  0.14     6.62  require - /home/alex/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/zeitwerk-2.5.4/lib/zeitwerk/kernel.rb
  0.14     5.90  block in run_initializers - /home/alex/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/railties-7.0.2.3/lib/rails/initializable.rb
  0.14     1.15  block in initialize - /home/alex/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/debug-1.5.0/lib/debug/session.rb
Wrote raw data to /home/alex/.cache/rbspy/2022-04-25-vrm2gDQEhj.raw.gz
Wrote formatted output to /home/alex/.cache/rbspy/2022-04-25-XzqwdFfr7C.flamegraph.svg

Look at generated flamegraph.svg to see the where the slow method is located.

ruby-prof

If you don't have a slow method and have something that gets called a lot

?> result = RubyProf.profile do
?>   10000.times { User.all }
>> end
>> RubyProf::GraphPrinter.new(result).print(STDOUT, {min_percent: 90})

Measure Mode: wall_time
Thread ID: 3360                               
Fiber ID: 28380                               
Total Time: 0.3134934608824551
Sort by: total_time

  %total   %self      total       self       wait      child            calls     name                          location
------------------------------------------------------------------------------------------------------------------------------------------------------
 100.00%   0.02%      0.313      0.000      0.000      0.313                1     [global]#                      (irb):2
                      0.313      0.006      0.000      0.307              1/1     Integer#times
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.313      0.006      0.000      0.307              1/1     [global]#
  99.98%   1.95%      0.313      0.006      0.000      0.307                1     Integer#times                  
                      0.307      0.016      0.000      0.291      10000/10000     ActiveRecord::Scoping::Named::ClassMethods#all
------------------------------------------------------------------------------------------------------------------------------------------------------
                                                                  # ^ that's a lot of calls

memory_profiler

ruby-memory-profiler bin/rails runner "User.all.inspekt" 

# you get a lot of output, just see if any gems or files or classes stick out.

If nothing obvious shows up, you could call inspekt multiple times until something shows up.

There are also lot of different measurements you could specify with all these profilers and different reports.

Take a look at where the interrupt happens when you press Ctrl-C

$ bin/rails runner "User.all.inspekt"                                                                                                                                              1 ↵
^C/home/alex/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/activerecord-7.0.2.3/lib/active_record/scoping/named.rb:23:in `sleep': Interrupt
    from /home/alex/.rbenv/versions/3.1.1/lib/ruby/gems/3.1.0/gems/activerecord-7.0.2.3/lib/active_record/scoping/named.rb:23:in `inspekt'

scoping/named.rb:23: <= yeah, that's where I put the sleep method.