How can I keep the Tempfile contents from being empty in a separate (Ruby) thread?

Time:10-14

In a Rails 6.x app, I have a controller method which backgrounds queries that take longer than 2 minutes (to avoid a browser timeout), advises the user, stores the results, and sends a link that can retrieve them to generate a live page (with Highcharts charts). This works fine.

Now, I'm trying to implement the same logic with a method that backgrounds the creation of a report, via a Tempfile, and attaches the contents to an email, if the query runs too long. This code works just fine if the 2-minute timeout is NOT reached, but the Tempfile is empty at the commented line if the timeout IS reached.

I've tried wrapping the second part in another thread, and wrapping the internals of each thread with a mutex, but this is all getting above my head. I haven't done a lot of multithreading, and every time I do, I feel like I stumble around till I get it. This time, I can't even seem to stumble into it.

I don't know if the problem is with my thread(s), or a race condition with the Tempfile object. I've had trouble using Tempfiles before, because they seem to disappear quicker than I can close them. Is this one getting cleaned up before it can be sent? The file handle actually still exists on the file system at the commented point, even though it's empty, so I'm not clear on what's happening.

def report
  
  queue = Queue.new
  file = Tempfile.new('report')

  thr = Thread.new do
    query = %Q(blah blah blah)
    @calibrations = ActiveRecord::Base.connection.exec_query query
    query = %Q(blah blah blah)
    @tunings = ActiveRecord::Base.connection.exec_query query
    if queue.empty?
      unless @tunings.empty?
        CSV.open(file.path, 'wb') do |csv|
          csv << ["headers...", @parameters].flatten
          @calibrations.each do |c|
            line = [c["h1"], c["h2"], c["h3"], c["h4"], c["h5"], c["h6"], c["h7"], c["h8"]]
            t = @tunings.select { |t| t["code"] == c["code"] }.first
            @parameters.each do |parameter|
              line << t[parameter.downcase]
            end
            csv << line
          end
        end
        send_data file.read, :type => 'text/csv; charset=iso-8859-1; header=present', :disposition => "attachment; filename=\"report.csv\""
      end
    else
      # When "timed out", `file` is empty here
      NotificationMailer.report_ready(current_user, file.read).deliver_later
    end
  end

  give_up_at = Time.now + 120.seconds
  while Time.now < give_up_at do
    if !thr.alive?
      break
    end
    sleep 1
  end
  if thr.alive?
    queue << "Timeout"
    render html: "Your report is taking longer than 2 minutes to generate. To avoid a browser timeout, it will finish in the background, and the report will be sent to you in email."
  end

end

CodePudding user response:

The file is empty because nothing is ever written to it on the timeout path. You give the query 120 seconds to complete, and if it has not finished by then you add "Timeout" to the queue. At that point the query is still running inside the thread and has not yet reached the check for whether the queue is empty. When the query does complete, the queue is no longer empty, so the thread skips the branch that writes the CSV file and goes straight to the NotificationMailer.report_ready line, where file.read returns an empty string.
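To make that sequence concrete, here is a minimal sketch that reproduces it outside Rails. The run_report method, its keyword arguments, and the returned symbols are hypothetical stand-ins for the controller action, the queries, and the send_data / mailer calls; a sleep takes the place of the slow queries.

```ruby
require "tempfile"

# Hypothetical stand-in for the controller flow in the question.
# Returns [which branch ran, size of the tempfile in bytes].
def run_report(query_seconds:, limit_seconds:)
  queue = Queue.new
  file = Tempfile.new("report")

  worker = Thread.new do
    sleep query_seconds                      # stands in for the slow queries
    if queue.empty?
      File.write(file.path, "h1,h2\n1,2\n")  # only this branch writes the CSV
      :sent_inline
    else
      :emailed                               # this branch never writes anything
    end
  end

  # Thread#join(limit) returns nil if the worker is still running at the limit.
  queue << "Timeout" unless worker.join(limit_seconds)

  outcome = worker.value                     # waits for the worker to finish
  [outcome, File.size(file.path)]
ensure
  file&.close!                               # close and delete the tempfile
end

p run_report(query_seconds: 0.05, limit_seconds: 1)    # [:sent_inline, 10]
p run_report(query_seconds: 0.3, limit_seconds: 0.05)  # [:emailed, 0]
```

The second call shows the symptom from the question: the tempfile still exists on disk, but its size is zero because the write happens only in the branch that was skipped.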

In the end I think you need to rethink the overall logic of what you are trying to accomplish and there needs to be more communication between the threads and the top level.

Each thread needs to tell the top level whether it has already sent the result, and the top level needs to tell the thread when it's past time to send the result directly, so that the thread emails it instead.

Here is some code that I think / hope will give some insight into how to approach this problem.

timeout_limit = 10
query_times = [5, 15, 1, 15]
timeout = []
sent_response = []
send_via_email = []

puts "time out is set to #{timeout_limit} seconds"

query_times.each_with_index do |query_time, query_id|
  puts "starting query #{query_id} that will take #{query_time} seconds"
  timeout[query_id] = false
  sent_response[query_id] = false
  send_via_email[query_id] = false

  Thread.new do
    ## do query
    sleep query_time
    if timeout[query_id]
      puts "query #{query_id} has completed, emailing result now"
      send_via_email[query_id] = true
    else
      puts "query #{query_id} has completed, displaying results now"
      sent_response[query_id] = true
    end
  end

  give_up_at = Time.now + timeout_limit
  while Time.now < give_up_at
    break if sent_response[query_id]
    sleep 1
  end
  unless sent_response[query_id]
    puts "query #{query_id} timed out, we will email the result of your query when it is completed"
    timeout[query_id] = true
  end
end

# simulate server environment: keep the process alive (without busy-waiting)
# so the background threads can finish
sleep

=>

time out is set to 10 seconds
starting query 0 that will take 5 seconds
query 0 has completed, displaying results now
starting query 1 that will take 15 seconds
query 1 timed out, we will email the result of your query when it is completed
starting query 2 that will take 1 seconds
query 2 has completed, displaying results now
starting query 3 that will take 15 seconds
query 1 has completed, emailing result now
query 3 timed out, we will email the result of your query when it is completed
query 3 has completed, emailing result now
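Applied to the original report, one way to use this idea is to build the CSV unconditionally and only then decide how to deliver it. The following is a sketch under assumptions, not a drop-in controller action: deliver_report, rows, and the :inline / :email symbols are hypothetical names, and a sleep stands in for the queries. It uses Thread#join with a limit in place of the polling loop, and a Mutex so the worker and the waiting thread always agree on the delivery mode.

```ruby
require "csv"

# Hypothetical helper: `rows` stands in for the query results, and the returned
# mode stands in for choosing between send_data and the mailer.
def deliver_report(rows, query_seconds:, limit_seconds:)
  lock = Mutex.new
  mode = nil
  # Whichever side calls `claim` first fixes the mode; the other side sees it.
  claim = ->(wanted) { lock.synchronize { mode ||= wanted } }
  csv_text = nil

  worker = Thread.new do
    sleep query_seconds                # stands in for the slow queries
    # Build the CSV before deciding how it will be delivered.
    csv_text = CSV.generate { |csv| rows.each { |row| csv << row } }
    claim.call(:inline)                # yields :email if the limit already passed
  end

  # Thread#join(limit) returns nil if the worker is still running at the limit.
  claim.call(:email) unless worker.join(limit_seconds)

  # For this demo only, wait for the worker so both values can be inspected.
  [worker.value, csv_text]
end

rows = [["h1", "h2"], [1, 2]]
p deliver_report(rows, query_seconds: 0.05, limit_seconds: 1)
p deliver_report(rows, query_seconds: 0.3, limit_seconds: 0.05)
```

In a real controller the request thread would not wait after a timeout. It would render the "will be emailed" message right away, and the worker would hand csv_text to the mailer when it sees :email. When the mode is :inline, the request thread itself would call send_data with csv_text, since the response belongs to that thread.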