Home > OS >  Avoid full table scan on a simple join using OR in MySQL
Avoid full table scan on a simple join using OR in MySQL

Time:10-22

I have this schema

create table table1
(
    id        int          auto_increment primary key,
    name      varchar(2)   null,
    position  int          null,
);

create index table1_position
    on table1 (position);


create table table_2
(
    id              int auto_increment primary key,
    table1_id       int          null,
    position        int          null,
    constraint table_2_ibfk_1
        foreign key (table1_id) references table1 (id)
);

create index ix_table_2_position
    on table_2 (position);

create index table1_id
    on table_2 (table1_id);

So I added two index on column position of each tables. Now I need to look for a range of position in BOTH table (by joining then and apply an OR query)

so I have this query

SELECT *
FROM table_1
    INNER JOIN table_2 ON table_1.id = table_2.table1_id
WHERE table_1.position BETWEEN 5000 AND 5500
   OR table_2.position BETWEEN 5000 AND 5500

But the Explain query output give me ALL (Full table scan)

id             1
select_type    SIMPLE
table          table_1
partitions
type           ALL
possible_keys  PRIMARY,table1_position
key
key_len
ref
rows           9929
filtered       100.0
Extra

If I change to an AND if give me the expected Range index scan

SELECT *
FROM table_1
    INNER JOIN table_2 ON table_1.id = table_2.table1_id
WHERE table_1.position BETWEEN 5000 AND 5500
   AND table_2.position BETWEEN 5000 AND 5500
id             1
select_type    SIMPLE
table          table_1
partitions
type           range
possible_keys  PRIMARY,pos_idx2
key            pos_idx2
key_len        5
ref
rows           1
filtered       100.0
Extra          Using index condition

But I need the OR statement here...How could I have mysql use a range scan index for an OR statement ? Could I improve my indexes here (I thought about a multi-values index on both position and table1_id -the foreign key, but it did'nt help and it performed a full table scan).

CodePudding user response:

SELECT *
FROM table_1
INNER JOIN table_2 ON table_1.id = table_2.table1_id
WHERE table_1.position BETWEEN 5000 AND 5500

UNION ALL

SELECT *
FROM table_1
INNER JOIN table_2 ON table_1.id = table_2.table1_id
WHERE table_1.position NOT BETWEEN 5000 AND 5500
  AND table_2.position BETWEEN 5000 AND 5500

Also test

SELECT *
FROM table_1
INNER JOIN table_2 ON table_1.id = table_2.table1_id
WHERE NOT (    table_1.position NOT BETWEEN 5000 AND 5500
           AND table_2.position NOT BETWEEN 5000 AND 5500 )

CodePudding user response:

Presumably you want to avoid the table scan for reasons of performance.

So try swapping your OR for a UNION operation.

First, get the set of table1.id values you need with a subquery, like so.

    SELECT id FROM table_1 WHERE position BETWEEN 5000 AND 5500
    UNION
    SELECT table1_id FROM table_2 WHERE position BETWEEN 5000 AND 5500

The second part of the UNION retrieves the table_1.id values you need from table_2, by SELECTing the FK column.

Next, use that subquery to get your rows from table1.

SELECT * FROM table_1
 WHERE id IN (
    SELECT id FROM table_1 WHERE position BETWEEN 5000 AND 5500
    UNION
    SELECT table1_id FROM table_2 WHERE position BETWEEN 5000 AND 5500           
)

To make this quicker, add this compound index on table_2.

CREATE INDEX ix_table_2_position_table1id 
          ON table_2 (position, table1_id);

Notice that neither of your two single-column indexes on table2 are useful for this query.

CodePudding user response:

Performance problems with OR can often be solved with UNION:

SELECT *
FROM table_1
INNER JOIN table_2 ON table_1.id = table_2.table1_id
WHERE table_1.position BETWEEN 5000 AND 5500
UNION
SELECT *
FROM table_1
INNER JOIN table_2 ON table_1.id = table_2.table1_id
WHERE table_2.position BETWEEN 5000 AND 5500;
  • Related