Home > Software design >  SQL: how to speed up a conditional join?
SQL: how to speed up a conditional join?

Time:12-16

I'm doing a conditional join where I'm joining to one table if the condition is met, but to another table if not. This is painfully slow; is there a way to optimize the query to speed it up?

The tables look like this (simplified):

Subscriptions:
user_ID, subscription_ID, status
96, 254, Active
77, 777, Expired

Users:
user_ID
18
77
99

Seats:
user_ID, subscription_ID
18, 254
18, 222
22, 333
22, 444

The task is: given a user_ID from Users, try to get the subscription status from Subscriptions; but, if that user_ID isn’t in Subscriptions, look up all its possible subscription numbers in Seats and try finding those in Subscriptions instead. If no subscription is found, keep the user_ID in the results with Nulls for the other columns. For example, user_ID 18 isn't in Subscriptions, but I can still get its status by seeing that it has subscription 254 in Seats and looking that up instead.

Desired result:

user_ID, subscription_ID, status
18, 254, Active
77, 777, Expired
99, Null, Null

I can't just look up all the userIDs in Seats and get around the conditional join that way, because most of them aren't in that table. I can't change the table schemas.

I think this query does what I want:

select * from users
left join seats on users.user_ID=seats.user_ID
left join subscriptions on 
(case when seats.subscription_ID is null then users.user_ID=facts.user_ID
else seats.subscription_ID=subscriptions.subscription_ID end)

But these tables are large and it's taking unfeasibly long to complete. What would be a better way of doing this?

CodePudding user response:

This is the method suggested by ysth with the two joins to Subscriptions and the use of coalesce in the select list -

select
    u.user_ID,
    coalesce(subs1.subscription_ID, subs2.subscription_ID) subscription_ID,
    coalesce(subs1.status, subs2.status) status
from Users u
left join Subscriptions subs1
    on u.user_ID = subs1.user_ID
left join seats
    on subs1.user_ID is null and u.user_ID = seats.user_ID
left join Subscriptions subs2
    on seats.subscription_ID = subs2.subscription_ID
where not(seats.user_ID is not null and subs2.subscription_ID is null);

If this still takes "unfeasibly long to complete", please add your CREATE TABLE statements and the EXPLAIN output for the query to your question.

CodePudding user response:

join twice and union all then together, if you know they will not pull the rows in twice (on both sides) and if they would, add the anti match.

select * from jag
left join seats 
    on jag.userID = seats.userID
left join facts 
    on seats.sub is null and  jag.userID = facts.userID

union all

select * from jag
left join seats 
    on jag.userID=seats.userID
left join facts 
    on seats.sub is not null and seats.sub = facts.sub
  • Related