SQL table joins: hash join, merge join, and nested loop join, the three methods the optimizer chooses from (a personal understanding)
Take two tables as an example, table A and table B (both have id and name columns): select * from A, B where A.id = B.id;
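To make the example query concrete, here is a minimal sketch that runs it with Python's built-in sqlite3 module. The id values are the ones used in the sample tables of this article; the name values (a3, b7, and so on) are made up for illustration. It only shows what the join returns, not which join method a particular database would pick.

```python
import sqlite3

# In-memory database; table and column names follow the example query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE B (id INTEGER, name TEXT)")

# The name values are invented purely for illustration.
conn.executemany("INSERT INTO A VALUES (?, ?)", [(i, f"a{i}") for i in (3, 4, 2, 5, 1)])
conn.executemany("INSERT INTO B VALUES (?, ?)", [(i, f"b{i}") for i in (7, 3, 5, 4, 6)])

# Same join as in the text, with explicit columns and an ORDER BY for stable output.
rows = conn.execute(
    "SELECT A.id, A.name, B.name FROM A, B WHERE A.id = B.id ORDER BY A.id"
).fetchall()
print(rows)  # [(3, 'a3', 'b3'), (4, 'a4', 'b4'), (5, 'a5', 'b5')]
```

Only ids 3, 4, and 5 appear in both tables, so the join returns three rows.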
1. Hash join: used to join a large table with a large table, or a small table with a large table
It is divided into two phases: (1) the build phase, which constructs the hash table, and (2) the probe phase, which checks for matches
Table A id    Table B id
3             7
4             3
2             5
5             4
1             6
The following is a simplified explanation; the real internals are more involved
Phase 1: build the hash table
The ids of table A are read in batches, their hash values are computed, and a hash table is formed in memory. If table A is large, it can be split into partitions; Oracle uses the concept of buckets to store the different hash partitions (also called shards) and reads them part by part
Phase 2: probe. For each row in table B, the same hash function is applied to its id, and the resulting hash value is checked against the hash table built from table A. If the table is large, the matching is done against the partitions or buckets of A's hash table one at a time
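The two phases described above can be sketched in Python as follows. This is a simplified illustration, not how Oracle or any other database implements it: a Python dict stands in for the in-memory hash table, the function names and the n_partitions parameter are hypothetical, and the name values are invented.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows):
    """Join two lists of (id, name) rows on id using a hash table."""
    # Phase 1 (build): hash every id of the build table into an in-memory table.
    table = defaultdict(list)
    for row_id, name in build_rows:
        table[row_id].append(name)

    # Phase 2 (probe): hash each id of the probe table and look for a match.
    result = []
    for row_id, name in probe_rows:
        for build_name in table.get(row_id, []):
            result.append((row_id, build_name, name))
    return result

def partitioned_hash_join(build_rows, probe_rows, n_partitions=2):
    """Variant for large inputs: split both sides by hash value, join per partition."""
    build_parts = defaultdict(list)
    probe_parts = defaultdict(list)
    for row in build_rows:
        build_parts[hash(row[0]) % n_partitions].append(row)
    for row in probe_rows:
        probe_parts[hash(row[0]) % n_partitions].append(row)

    # Rows with equal ids always land in the same partition number,
    # so each partition pair can be joined independently.
    result = []
    for p in range(n_partitions):
        result.extend(hash_join(build_parts[p], probe_parts[p]))
    return result

a_rows = [(3, "a3"), (4, "a4"), (2, "a2"), (5, "a5"), (1, "a1")]
b_rows = [(7, "b7"), (3, "b3"), (5, "b5"), (4, "b4"), (6, "b6")]

joined = hash_join(a_rows, b_rows)
print(joined)  # [(3, 'a3', 'b3'), (5, 'a5', 'b5'), (4, 'a4', 'b4')]
partitioned = partitioned_hash_join(a_rows, b_rows)
```

The result follows the order of the probe table (B), because B is scanned once from top to bottom. The partitioned variant returns the same rows, possibly in a different order.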
2. Sort merge join: involves a sort operation, so it fits queries whose results have to be sorted anyway, such as those with order by, group by, or distinct
Commonly used for joins such as the natural join
Table A id    Table B id
3             7
4             3
2             5
5             4
1             6
First, the data in both tables is sorted:
Table A id    Table B id
1             3
2             4
3             5
4             6
5             7
The ids of table A and table B are then matched in order: 1 vs 3 no match, 2 vs 3 no match, 3 = 3 match, 4 = 4 match, 5 = 5 match
The cost of the merge step is generally the sum of the record counts of A and B
If materialized views or indexes exist, the join becomes much more efficient. If the amount of data to sort is large, the sort can spill to temporary storage: PostgreSQL uses temporary files (backed by the system file cache), while Oracle uses the temporary tablespace
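The sort and the in-order matching can be sketched in Python like this. It is a minimal illustration that assumes the ids are unique within each table, as in the sample data; a real implementation also has to handle duplicate keys. The function name is hypothetical.

```python
def sort_merge_join(a_ids, b_ids):
    """Return the ids present in both lists, assuming ids are unique per list."""
    # Step 1: sort both inputs.
    a = sorted(a_ids)
    b = sorted(b_ids)

    # Step 2: walk both sorted lists with one cursor each.
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1          # a[i] is too small to match anything left in b
        elif a[i] > b[j]:
            j += 1          # b[j] is too small to match anything left in a
        else:
            matches.append(a[i])
            i += 1
            j += 1
    return matches

merged = sort_merge_join([3, 4, 2, 5, 1], [7, 3, 5, 4, 6])
print(merged)  # [3, 4, 5]
```

Each cursor only moves forward, so the merge step visits every record at most once, which is where the A + B cost estimate comes from. The sort itself is extra work on top of that unless the inputs are already ordered, for example by an index.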
3. Nested loop join
Table A id    Table B id
3             7
4             3
2             5
5             4
1             6
Each id in table A is compared against all ids in table B: 3 vs (7,3,5,4,6) matches 3; 4 vs (7,3,5,4,6) matches 4; 2 vs (7,3,5,4,6) no match; and so on
Cost calculation: cost = number of records in table A * number of records in table B
Applicable to small tables, or to joins where a materialized view exists or the table has an index on the join column
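The nested loop and its cost can be sketched in Python as follows. This is a minimal illustration of the plain, index-free case; the function name and the comparison counter are hypothetical and only there to make the A * B cost visible.

```python
def nested_loop_join(a_ids, b_ids):
    """Compare every id of A with every id of B; return matches and comparison count."""
    matches = []
    comparisons = 0
    for a in a_ids:          # outer loop: one pass over table A
        for b in b_ids:      # inner loop: full scan of table B per outer row
            comparisons += 1
            if a == b:
                matches.append(a)
    return matches, comparisons

nl_matches, nl_comparisons = nested_loop_join([3, 4, 2, 5, 1], [7, 3, 5, 4, 6])
print(nl_matches)      # [3, 4, 5]
print(nl_comparisons)  # 25
```

With 5 rows on each side the loop performs 5 * 5 = 25 comparisons. An index on the inner table's join column replaces the inner full scan with a lookup, which is why this method suits small tables or indexed joins.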