Home > Back-end >  PostgreSQL join and where clause execution order
PostgreSQL join and where clause execution order

Time:01-11

I have two tables that are bound with foreign key on PostgreSQL like below;

tableA (id, status) 
   - indexes;
      - (id)
      - (status)
tableB (id, tableAId, result)
   - indexes;
      - (id, tableAId)
      - (tableAId)

tableB is bound to tableA with foreign key by using tableAId field.

I want to join both tables by specific conditions (all filters based on indexes). There are two ways to apply filters, but I'd like to learn which way is more performant. Here is the ways;

- select * from tableA a join tableB b on a.id = b.tableAId where a.status = 'X' and b.id = 123
- select * from tableA a join tableB b on a.id = b.tableAId and a.status = 'X' and b.id = 123

As you can see, the first query contains filters in where section, but the second one contains them in on section of the join. As far as I know, join operations are done before where clause, so I expect the second query will have better performance compared to the first one. Am I correct?

CodePudding user response:

You can put an EXPLAIN before to see what the execution plan for a particular query is, which will explain how postgres will attempt to execute the query. You could even prepend the query with EXPLAIN ANALYZE to check the performance when the query is executed.

However, it's important to understand that the execution plan of the query is set up with the idea that the query should be correctly executed as quickly as possible. Accordingly there is no set in stone order in which the statements are applied, whether it will do the where or join first only depends on which of the two postgres thinks will be execute faster based on the statistics that it has. In fact, it might even mix it the two, first doing one of the where's then a join and then another where.

Lastly, the two queries you show are similar enough that a human can easily tell they are identical, so it shouldn't be a surprise that postgres can tell they are identical as well. Since the queries are identical postgres will internally rewrite the queries to the same query and then accordingly the make an identical execution plan for them.

  • Related