Home > front end >  NOT IN query is too slow with large number of rows
NOT IN query is too slow with large number of rows

Time:03-06

My SQL query below has been fine until thirdparty_token_airdrops ended up with 200k rows and thirdparty_token_holders had 900k rows. At this scale, what would be the best way to optimise it?

select id, owner
from thirdparty_token_holders
where id not in (
    select holder_id
    from thirdparty_token_airdrops
    where status=1
)
and amount > 10000000000
limit 300

In terms of indexes, I've created the following:

CREATE UNIQUE INDEX holder_id on thirdparty_token_airdrops (id, holder_id);

CREATE UNIQUE INDEX owner_address on thirdparty_token_holders (owner, address);

CREATE UNIQUE INDEX owner_id on thirdparty_token_holders (owner, id);

CodePudding user response:

I fixed it by changing the query to this:

SELECT a.id, a.owner
FROM thirdparty_token_holders a
LEFT JOIN thirdparty_token_airdrops b
ON a.id = b.holder_id
AND b.status IS NULL
and amount > 10000000000
limit 300

CodePudding user response:

You can try out with NOT EXISTS. It's better than LEFT jOIN too. Because left join fetches all the possible records for the left table whereas the not exists fetch only matched rows first.

SELECT a.id, a.owner
FROM thirdparty_token_holders AS a
WHERE NOT EXISTS 
(
    SELECT * FROM thirdparty_token_airdrops AS b
    WHERE a.id = b.holder_id
)
AND b.status IS NULL
and a.amount > 10000000000
limit 300
;
  • Related