Rails batch convert one object to another object

Time:10-15

I am looking for the most efficient way to convert a huge number of objects (1M instances) to another object type. Unfortunately, I have no control over what I get as input (the million objects).

So far I've tried each_slice, but it doesn't show much improvement.

It looks like this:

expected_objects_of_type_2 = []
huge_array.each_slice(3000) do |batch|
  batch.each do |object_type_1|
    expected_objects_of_type_2 << NewType2.new(object_type_1)
  end  
end

Any idea?

Thanks!

CodePudding user response:

I did a quick test with a few different methods of looping the array and measured the timings:

huge_array = Array.new(10000000){rand(1..1000)}
a = Time.now
string_array = huge_array.map{|x| x.to_s}
b = Time.now
puts b-a

Same with:

sa = []
huge_array.each do |x|
    sa << x.to_s
end

and

sa = []
huge_array.each_slice(3000) do |batch|
  batch.each do |x|
    sa << x.to_s
  end  
end 

I don't know what you are converting, so I used a simple integer-to-string conversion.

Timings (in seconds):

Map: 1.7
Each: 2.3
Slice: 3.2

So the slicing overhead apparently makes things slower. Map seems to be the fastest: internally it is just a loop, but it knows the length of the output array up front instead of growing it dynamically. The repeated << seems to slow things down a bit.
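To reproduce the comparison in one script, here is a minimal sketch. It assumes a smaller array than the 10M-element one above so it finishes quickly, and it uses a hypothetical measure helper built on a monotonic clock rather than Time.now. Absolute numbers will differ by machine and Ruby version.

```ruby
# Hypothetical helper: returns the elapsed wall-clock seconds for the given block.
def measure
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Smaller than the 10M-element array above so the sketch runs quickly.
huge_array = Array.new(100_000) { rand(1..1000) }

timings = {}

timings[:map] = measure do
  huge_array.map { |x| x.to_s }
end

timings[:each] = measure do
  sa = []
  huge_array.each { |x| sa << x.to_s }
end

timings[:slice] = measure do
  sa = []
  huge_array.each_slice(3000) do |batch|
    batch.each { |x| sa << x.to_s }
  end
end

timings.each { |name, seconds| puts format('%-6s %.4f s', name, seconds) }
```

Running it several times and comparing the spread is more informative than a single run.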

If each object needs its own conversion, you are stuck with O(n) complexity and can't speed things up by much. Just avoid overhead.

Depending on your data, sorting to exploit caching effects might help, as might skipping duplicates if a lot of your data is identical. We can't tell without knowing your actual conversion.
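As a rough illustration of the duplicate-avoidance idea, the sketch below caches each conversion in a Hash so that identical inputs are converted only once. The convert method is a hypothetical stand-in (the real conversion is not shown in the question), and the approach only applies if equal inputs may share the same output object.

```ruby
# Hypothetical stand-in for the real conversion, which the question does not show.
def convert(x)
  x.to_s
end

huge_array = Array.new(100_000) { rand(1..1000) }

# Hash with a default block: the conversion runs on the first lookup of a key
# and the stored result is reused for every later lookup of the same key.
cache = Hash.new { |hash, key| hash[key] = convert(key) }

converted = huge_array.map { |x| cache[x] }

puts "#{huge_array.size} inputs, #{cache.size} conversions performed"
```

With only 1000 distinct values here, at most 1000 conversions run. Whether this pays off in practice depends on how expensive the conversion is compared with a hash lookup.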

CodePudding user response:

I would process each slice in its own thread:

huge_array.each_slice(3000) do |batch|
  Thread.new do 
    batch.each do |object_type_1|
      expected_objects_of_type_2 << NewType2.new(object_type_1)
    end  
  end
end

Then you have to wait for the threads to finish: collect them in an array and call join on each one.
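One way to do the collecting and joining is sketched below. NewType2 here is a hypothetical Struct standing in for the asker's class, and the array is kept small. Each thread returns its own converted batch instead of appending to a shared array, and Thread#value waits for the thread before returning that result, so the output order matches the input. Note that on CRuby only one thread runs Ruby code at a time, so this may not beat a plain map for CPU-bound conversions. It is more likely to help when the conversion waits on I/O.

```ruby
# Hypothetical stand-in for the asker's NewType2 class.
NewType2 = Struct.new(:source)

huge_array = Array.new(10_000) { |i| i }

# One thread per slice; keep the Thread objects so they can be waited on.
threads = huge_array.each_slice(3000).map do |batch|
  Thread.new do
    # Each thread builds its own local array, so nothing is shared while converting.
    batch.map { |object_type_1| NewType2.new(object_type_1) }
  end
end

# Thread#value joins the thread and returns the result of its block.
expected_objects_of_type_2 = threads.flat_map(&:value)

puts expected_objects_of_type_2.size
```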
