I need to develop a search engine. All data is saved in the Oracle database which contains approx 300GB of data. I need to join multiple tables to gather all the info I need and search from there. It went super slow without indexing(always time out) when I joined 4 tables together. Here is an example of my query:
select distinct t1.a, t1.b, t2.c , t1.d, t3.e, t5.f, t1.g, t4.k
from table1 t1
inner join table2 t2 on t2.a = t1.a
inner join table3 t3 on t3.l = t2.l
left outer join table4 t4 on t4.d = t1.d
left outer join table5 t5 on t5.a = t1.a
where t1.a > 0 and t1.b < 2 and t2.c > 3 and t1.d > 4
and t3.e > 5 and t5.f > 6;
I know there is a thing called index that could speed up the runtime. The question is how could I make the index?
Should I make indexes for each join key in each table?
Also for the where statement, how could I make a cross table index? For example, column a and column c is from different tables, how could I make an index to have them both?
Additional question: Is there any way to make the select distinct faster? I tried the group by solution and it did not help too much. examples: A query will take 0.07s:
select distinct t1.a, t1.b, t2.c , t1.d, t1.g
from table1 t1
inner join table2 t2 on t2.a = t1.a
where rownum < 100;
but with distinct it will be timeout.
CodePudding user response:
You definitely want indexes on all columns used for join conditions in the table being joined to, but you should add the range condition after the key (where used):
create index table2_ac on table2(a, c);
create index table3_le on table3(l, e);
create index table4_d on table4(d);
create index table5_af on table5(a, f);
You can't make "cross table" indexes.
Creating an index over the columns used in the where
clause may help:
create index table1_abd on table1(a, b, d);
Finally, move conditions on joined tables from the where
clause into the join condition:
select distinct t1.a, t1.b, t2.c , t1.d, t3.e, t5.f, t1.g, t4.k
from table1 t1
inner join table2 t2 on t2.a = t1.a and t2.c > 3
inner join table3 t3 on t3.l = t2.l and t3.e > 5
left outer join table4 t4 on t4.d = t1.d
left outer join table5 t5 on t5.a = t1.a and t5.f > 6
where t1.a > 0 and t1.b < 2 and t1.d > 4
Which also fixes the subtle problem with your query that by putting conditions on left joined tables in the where clause changes that join type from outer to inner.