I know that find_each is designed to consume less memory than each.
I found some code that someone else wrote long ago, and I think it's wrong.
Consider this code.
users = User.where(:active => false) # What does this line actually do? Nothing?
users.find_each do |user|
  # update or do something..
  user.update(:do_something => "yes")
end
In this case, it will store all user objects in the users variable, so we have already used the full amount of memory. There is no point in using find_each later on.
Am I correct?
In other words, if you want to use find_each, you always need to call it directly on an ActiveRecord::Relation object, like this.
User.where(:active => false).find_each do |user|
  # do something...
end
What do you think guys?
Update
Regarding the users = User.where(:active => false) line, one developer insists that Rails never executes the query until we actually do something with that variable. What if we have a class whose initialize method contains a query?
class Test
  def initialize
    @users = User.where(:active => true)
  end

  def do_something
    @users.find_each do |user|
      # do something really..
    end
  end
end
If we call Test.new, what would happen? Will nothing happen?
CodePudding user response:
users = User.where(:active => false) doesn't run a query against the database, and it doesn't return an array with all inactive users. Instead, where returns an ActiveRecord::Relation. Such a relation basically describes a database query that hasn't run yet. The defined query is only run against the database when the actual records are needed. This happens, for example, when you call one of the following methods on that relation: find, to_a, count, each, and many others.
That means the change you made isn't a huge improvement, because it doesn't change when and how the database is queried.
But IMHO your code is still slightly better: if you don't plan to reuse the relation, why assign it to a variable in the first place?
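To make the "describes a query that hasn't run yet" idea concrete, here is a minimal plain-Ruby sketch. It is only a toy model and not how ActiveRecord is implemented: the ToyRelation class, its in-memory ROWS table, and the queries_run counter are all made up for illustration.

```ruby
# Toy stand-in for a lazy query object. This is NOT ActiveRecord; the class
# name, the in-memory ROWS table, and the query counter are all hypothetical.
class ToyRelation
  include Enumerable

  ROWS = [
    { id: 1, name: "Ann", active: false },
    { id: 2, name: "Bob", active: true },
    { id: 3, name: "Cy",  active: false }
  ].freeze

  class << self
    attr_accessor :queries_run
  end
  self.queries_run = 0

  def initialize(conditions = {})
    @conditions = conditions
  end

  # Returns a new relation with the extra conditions merged in; reads no data.
  def where(extra)
    ToyRelation.new(@conditions.merge(extra))
  end

  # The "query" only runs here, when the records are actually needed.
  def each(&block)
    self.class.queries_run += 1
    ROWS.select { |row| @conditions.all? { |key, value| row[key] == value } }
        .each(&block)
  end
end

users = ToyRelation.new.where(active: false) # only builds a description
puts ToyRelation.queries_run                 # prints 0
names = users.map { |user| user[:name] }     # iterating triggers the "query"
puts ToyRelation.queries_run                 # prints 1
```

Assigning the relation to users costs nothing here; the counter only moves once something iterates over it, which mirrors the behaviour described above.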
CodePudding user response:
users = User.where(:active => false)
users.find_each do |user|

User.where(:active => false).find_each do |user|
Those do the same thing.
The only difference is that the first one stores the ActiveRecord::Relation object in users before calling #find_each on it.
This isn't specific to Rails; it applies to all of Ruby. It's method chaining, which is common to most object-oriented languages.
array = Call.some_method
array.map { |item| do_something(item) }

Call.some_method.map { |item| do_something(item) }
Again, same thing. The only difference is that in the first version the intermediate array stays referenced by the array variable, whereas in the second the array is built and then eventually deallocated.
You asked: "If we call Test.new, what would happen? Nothing will happen?"
Exactly. Rails will build an ActiveRecord::Relation and defer contacting the database until the records are actually needed.
This lets you chain queries together.
@users = User.where(active: false).order(name: :desc)
Later you can add more constraints.
@users.where(favorite_color: :green).find_each do |user|
  ...
end
No query is made until find_each is called.
find_each is special in that it works in batches to avoid consuming too much memory on large tables.
A common mistake is to write this:
User.where(:active => false).each do |user|
Or worse:
User.all.each do |user|
Calling each on an ActiveRecord::Relation will pull all the results into memory before iterating. This is bad for large tables.
find_each will load the results in batches (1000 records by default) to avoid using too much memory. It hides this batching from you.
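As a rough illustration of what "hiding the batching" means, here is a plain-Ruby sketch of the general pattern: fetch a limited batch ordered by id, yield its rows one by one, then continue after the last id seen. This is not the Rails source; TABLE, fetch_batch, and toy_find_each are hypothetical names, and the toy table lives in memory anyway, so it only shows the shape of the loop rather than any real memory saving.

```ruby
# Hypothetical in-memory "table"; a real table would live in the database.
TABLE = (1..2500).map { |id| { id: id } }.freeze

# Returns at most `limit` rows whose id is greater than `after_id`, in id order.
# Stands in for one "WHERE id > ? ORDER BY id LIMIT ?" style query.
def fetch_batch(after_id, limit)
  TABLE.select { |row| row[:id] > after_id }.first(limit)
end

# Yields rows one at a time while asking for only one batch per "query".
# Returns the number of batches fetched, purely for demonstration.
def toy_find_each(batch_size: 1000)
  last_id = 0
  batches = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    break if batch.empty?

    batches += 1
    batch.each { |row| yield row }
    break if batch.size < batch_size # a short batch means nothing is left

    last_id = batch.last[:id]
  end
  batches
end

seen = 0
batch_count = toy_find_each { |_row| seen += 1 }
puts seen        # prints 2500
puts batch_count # prints 3
```

The caller's block looks just like one passed to each, but the data arrives in three chunks (1000, 1000, and 500 rows) instead of all at once.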
There are other methods that work in batches; see ActiveRecord::Batches.
For more, see the Rails Style Guide, and use rubocop-rails to scan your code for issues and suggest corrections.