Home > front end >  Trying to speed up finding an index in active record query
Trying to speed up finding an index in active record query

Time:10-08

I have a page where I start with a list of check_ins or client visits and then I need to loop through each visit (i) to find all that client's other visits to determine the visits neighbor (i - 1). I need to pull information from that visits neighbor as well. This is taking a very long time and I thought there might be a speedier way to perform the query.

Assuming I start with the list of visits or check_ins:

check_ins.each_with_index do |ci, i|

  # access the client
  client = ci.client 

  # pull other relevent client visits
  client_cis = client.check_ins
    .visible
    .includes(:weigh_in)
    .select(:id, :week, :created_at, :type_of_weighin)
    .where.not(check_ins: {type_of_weighin: nil})
    .where.not(weigh_ins: {id: nil})
    .where("weigh_ins.date >= ?", ci.created_at - 3.weeks) 

  # access the visit's index
  i = client_cis.index(ci)

  # access the visit's neighbor
  neighbor = client_cis[ i - 1 ]

  # perform downstream calculations using visit and it's neighbor
  ...

end

Is there a faster way to perform this type of analysis?

CodePudding user response:

You could try this although I make no performance improvement guarantees the following should result in the same functional mechanism

grouped_check_ins = check_ins.group_by(&:client_id)
weigh_ins_table = WeighIn.arel_table
check_ins_table = CheckIn.arel_table 
weigh_ins_join = Arel::Nodes::InnerJoin.new(
  weigh_ins_table,weigh_ins_table.create_on(
   weigh_ins_table[:check_in_id].eq(check_ins_table[:id]).and(
     weigh_ins_table[:date].gteq(
       check_ins_table[:created_at] - Arel.sql("interval '3 week'")
     ) 
    )
  )
)
Client
  .joins(:check_ins)
  .joins(weigh_ins_join)
  .eager_load(check_ins: :weigh_in)
  .where(id: grouped_check_ins.keys) # alternately .where(id: check_ins.select(:client_id))
  .where.not(check_ins: {type_of_weighin: nil})
  .each do |client|
    client_check_ins_with_neighbor = client.check_ins.each_cons(2)
       .select do |_,b| 
         grouped_check_ins[client.id].any? {|c| c.id == b.id }
       end 
    client_check_ins_with_neighbor.each do |neighbor,check_in| 
       # perform downstream calculations using visit and it's neighbor
    end  
  end 

UPDATE to explain what this code does per the OP's request:

  • grouped_check_ins = check_ins.group_by(&:client_id) - group your CheckIn objects by the client_id for faster lookup. This results in a Hash of {client_id => [CheckIns]}

  • Arel work - This is so we can build a join with multiple conditions. In this case the conditions are "weigh_ins.check_in_id = check_ins.id AND weigh_ins.date >= check_ins.created_at - interval '3 week'"

weigh_ins_table = WeighIn.arel_table
check_ins_table = CheckIn.arel_table 
weigh_ins_join = Arel::Nodes::InnerJoin.new(
  weigh_ins_table,weigh_ins_table.create_on(
   weigh_ins_table[:check_in_id].eq(check_ins_table[:id]).and(
     weigh_ins_table[:date].gteq(
       check_ins_table[:created_at] - Arel.sql("interval '3 week'")
     ) 
    )
  )
)
  • Build the Query
Client
  .joins(:check_ins)
  .joins(weigh_ins_join)
  .eager_load(check_ins: :weigh_in)
  .where(id: grouped_check_ins.keys) # alternately .where(id: check_ins.select(:client_id))
  .where.not(check_ins: {type_of_weighin: nil})

This results in SQL akin to

SELECT 
/* every column from each table listed 
with names generated by rails so that it can 
perform eager loading appropriately */ 
FROM 
  clients
  INNER JOIN check_ins ON check_ins.client_id = clients.id 
  INNER JOIN weigh_ins ON weigh_ins.check_in_id = check_ins.id 
    AND weigh_ins.date >= check_ins.created_at - interval '3 week'
WHERE 
  clients.id IN (/* list of client_ids from the group_by */) 
AND check_ins.type_of_weighin IS NOT NULL 
  • Iterate over the clients check ins in consecutive order, 2 at a time. Select each pair where the second item is in the list of client check ins originally provided.
client_check_ins_with_neighbor = client.check_ins.each_cons(2)
  .select do |_,b| 
    grouped_check_ins[client.id].any? {|c| c.id == b.id }
end
  • Then we just iterate over those selected pairs to perform whatever work needs to be done
client_check_ins_with_neighbor.each do |neighbor,check_in| 
  # perform downstream calculations using visit and it's neighbor
end
  • Related