I'm stuck with a classic greatest-n-per-group problem, where a cat can have many kittens, but I'm usually just interested in the youngest.
I already know how to build a scope and a has_one relation for the Cat.
My question: Is there a way to...
- list all cats' names together with their youngest kittens' names...
- while at the same time ordering them by their respective youngest kitten's name...
...using just a single SELECT under the hood?
What I have so far:
class Cat < ApplicationRecord
  has_many :kittens
  has_one :youngest_kitten, -> { merge(Kitten.youngest) }, foreign_key: :cat_id, class_name: :Kitten

  # Cats joined to their kittens, keeping only rows for which no younger kitten exists.
  scope :with_youngest_kittens, lambda {
    joins(:kittens)
      .joins(Kitten.younger_kittens_sql("cats.id"))
      .where(younger_kittens: { id: nil })
  }
end

class Kitten < ApplicationRecord
  belongs_to :cat

  scope :youngest, lambda {
    joins(Kitten.younger_kittens_sql("kittens.cat_id"))
      .where(younger_kittens: { id: nil })
  }

  # Anti-join fragment: matches every kitten of the same cat that was created later.
  def self.younger_kittens_sql(cat_field_name)
    %{
      LEFT OUTER JOIN kittens AS younger_kittens
      ON younger_kittens.cat_id = #{cat_field_name}
      AND younger_kittens.created_at > kittens.created_at
    }
  end
end
When I run Cat.with_youngest_kittens.order('kittens.name').map(&:name), everything looks fine: I get all the cats' names with just a single SELECT.
But when I run Cat.with_youngest_kittens.order('kittens.name').map { |cat| cat.youngest_kitten.name }, I get the right result too, but a superfluous additional SELECT is executed per cat. That is only logical, because the with_youngest_kittens scope doesn't know it should populate youngest_kitten. Is there a way to tell it, or am I going about this all wrong?
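For reference, the anti-join that both scopes rely on (LEFT OUTER JOIN plus WHERE younger_kittens.id IS NULL) can be sketched in plain Ruby. This is only an illustration on made-up data: KittenRow, KITTEN_ROWS and youngest_per_cat are hypothetical names and not part of the app.

```ruby
# Hypothetical in-memory rows standing in for the kittens table.
KittenRow = Struct.new(:id, :cat_id, :name, :created_at)

KITTEN_ROWS = [
  KittenRow.new(1, 1, "Ash",   Time.utc(2020, 1, 1)),
  KittenRow.new(2, 1, "Birch", Time.utc(2021, 1, 1)),
  KittenRow.new(3, 2, "Cedar", Time.utc(2019, 6, 1))
].freeze

# Mirrors the LEFT OUTER JOIN ... WHERE younger_kittens.id IS NULL pattern:
# a kitten is kept only if no kitten of the same cat has a later created_at.
def youngest_per_cat(rows)
  rows.select do |kitten|
    rows.none? do |other|
      other.cat_id == kitten.cat_id && other.created_at > kitten.created_at
    end
  end
end

# Order by the youngest kitten's name, as in the question.
puts youngest_per_cat(KITTEN_ROWS).sort_by(&:name).map(&:name).inspect
```

One consequence of this pattern is that two kittens of the same cat with an identical created_at would both survive the filter, so the cat would show up twice.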
CodePudding user response:
I think adding an includes to your :with_youngest_kittens scope will fix the problem. Try changing the scope to
scope :with_youngest_kittens, lambda {
  includes(:youngest_kitten)
    .joins(:kittens)
    .joins(Kitten.younger_kittens_sql("cats.id"))
    .where(younger_kittens: { id: nil })
}
This should prevent Rails from issuing a separate database query for each cat's youngest kitten.
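As a rough illustration of why this helps: includes typically loads the association for the whole result set at once (usually with one additional query) instead of one query per record. The plain-Ruby sketch below only simulates that difference by counting lookups. FakeKittenStore, YOUNGEST_BY_CAT_ID and the helper methods are hypothetical and do not use Active Record.

```ruby
# Hypothetical stand-in for a database that counts the lookups it serves.
class FakeKittenStore
  attr_reader :query_count

  def initialize(youngest_by_cat_id)
    @youngest_by_cat_id = youngest_by_cat_id
    @query_count = 0
  end

  # Lazy pattern: one lookup per cat.
  def youngest_for(cat_id)
    @query_count += 1
    @youngest_by_cat_id[cat_id]
  end

  # Preloading pattern: one lookup for all cats at once.
  def youngest_for_all(cat_ids)
    @query_count += 1
    @youngest_by_cat_id.slice(*cat_ids)
  end
end

YOUNGEST_BY_CAT_ID = { 1 => "Birch", 2 => "Cedar", 3 => "Dawn" }.freeze

# Returns [names, number_of_lookups] when each cat triggers its own lookup.
def lazy_names(store, cat_ids)
  names = cat_ids.map { |id| store.youngest_for(id) }
  [names, store.query_count]
end

# Returns [names, number_of_lookups] when all kittens are fetched up front.
def preloaded_names(store, cat_ids)
  lookup = store.youngest_for_all(cat_ids)
  names = cat_ids.map { |id| lookup[id] }
  [names, store.query_count]
end

cat_ids = YOUNGEST_BY_CAT_ID.keys
_, lazy_count = lazy_names(FakeKittenStore.new(YOUNGEST_BY_CAT_ID), cat_ids)
_, preloaded_count = preloaded_names(FakeKittenStore.new(YOUNGEST_BY_CAT_ID), cat_ids)
puts "lazy lookups: #{lazy_count}, preloaded lookups: #{preloaded_count}"
```

In a real app, the query log (or counting sql.active_record notifications) would be the way to confirm how many SELECTs the scope actually issues.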
CodePudding user response:
I found a solution that produces no extra SELECT. However, it is quite ugly, so I'll actually go with localarrow's solution, as it's more readable.
I thought I'd still post it for the sake of completeness (in case someone needs the few ms of extra performance):
First, I add custom-tailored select fields for each kitten column to the Cat.with_youngest_kittens scope:
scope :with_youngest_kittens, lambda {
  # Alias every kitten column as `youngest_kittens.<column>` (backtick quoting is MySQL-style).
  kitten_columns = Kitten
    .column_names
    .map { |column_name| "kittens.#{column_name} AS `youngest_kittens.#{column_name}`" }
    .join(', ')

  joins(:kittens)
    .joins(Kitten.younger_kittens_sql("cats.id"))
    .where(younger_kittens: { id: nil })
    .select("cats.*, #{kitten_columns}")
}
Then I override the has_one youngest_kitten relation with a method that reads those custom select fields and calls super if no data has been retrieved:
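def youngest_kitten
  # Fall back to the regular association when the aliased columns were not selected.
  return super if self[:'youngest_kittens.id'].nil?

  # Collect the aliased values back into a plain column => value hash.
  kitten_hash = Kitten.column_names.map { |column_name| [column_name, self[:"youngest_kittens.#{column_name}"]] }.to_h
  kitten_hash[:cat] = self
  # Note: Kitten.new builds an in-memory object, so the result reports new_record? as true.
  Kitten.new(kitten_hash)
end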
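The core of that override, pulling the prefixed columns out of a row and stripping the prefix again, can be sketched without Active Record. This is a minimal illustration: the row hash and the extract_prefixed helper are hypothetical, and a plain Hash stands in for the attributes of a loaded Cat.

```ruby
KITTEN_PREFIX = "youngest_kittens."

# Hypothetical row as the custom SELECT might return it:
# cat columns unprefixed, kitten columns under the "youngest_kittens." alias.
SAMPLE_ROW = {
  "id" => 1,
  "name" => "Tom",
  "youngest_kittens.id" => 7,
  "youngest_kittens.name" => "Birch",
  "youngest_kittens.cat_id" => 1
}.freeze

# Returns only the prefixed entries, with the prefix removed from each key.
def extract_prefixed(row, prefix)
  row.each_with_object({}) do |(key, value), attrs|
    attrs[key.delete_prefix(prefix)] = value if key.start_with?(prefix)
  end
end

kitten_attrs = extract_prefixed(SAMPLE_ROW, KITTEN_PREFIX)
puts kitten_attrs.inspect
```

Selecting by prefix rather than iterating over Kitten.column_names keeps the sketch independent of the model. The override above iterates over the known column names instead, which avoids picking up unrelated aliases.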