Home > other >  Huge divergence in query estimate vs actual time in postgres
Huge divergence in query estimate vs actual time in postgres

Time:01-06

Between two different environments with identical databases (local machine, and production on heroku) we are seeing a large difference in execution time for the same, fairly simple, query.

The query is:

SELECT "property_tax_bills".* FROM "property_tax_bills" INNER JOIN "property_tax_bill_parsed_addresses" ON "property_tax_bills"."id" = "property_tax_bill_parsed_addresses"."property_tax_bill_id" WHERE "property_tax_bill_parsed_addresses"."parsed_address_id" = 2 AND "property_tax_bills"."statement_date" = '2021-11-20';

property_tax_bills is a large-ish table (44 million records), and the join table property_tax_bill_parsed_addresses is equally large.

The EXPLAIN ANALYZE command run via psql locally returns the following:

 Gather  (cost=1105.30..64845.64 rows=219 width=147) (actual time=15.054..18.941 rows=101 loops=1)
   Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=25539
   ->  Nested Loop  (cost=105.30..63823.74 rows=91 width=147) (actual time=12.147..12.291 rows=34 loops=3)
         Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
         Inner Unique: true
         Buffers: shared hit=25539
         Worker 0:  actual time=10.752..10.898 rows=36 loops=1
           Buffers: shared hit=7647
         Worker 1:  actual time=11.014..11.162 rows=28 loops=1
           Buffers: shared hit=6483
         ->  Parallel Bitmap Heap Scan on public.property_tax_bill_parsed_addresses  (cost=104.73..32879.67 rows=3672 width=8) (actual time=0.634..3.758 rows=1442 loops=3)
               Output: property_tax_bill_parsed_addresses.id, property_tax_bill_parsed_addresses.property_tax_bill_id, property_tax_bill_parsed_addresses.parsed_address_id, property_tax_bill_parsed_addresses.raw_address, property_tax_bill_parsed_addresses.processed_address, property_tax_bill_parsed_addresses.created_at, property_tax_bill_parsed_addresses.updated_at, property_tax_bill_parsed_addresses.is_verified, property_tax_bill_parsed_addresses.error_message
               Recheck Cond: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
               Heap Blocks: exact=1745
               Buffers: shared hit=3912
               Worker 0:  actual time=0.041..2.924 rows=1295 loops=1
                 Buffers: shared hit=1171
               Worker 1:  actual time=0.045..3.356 rows=1099 loops=1
                 Buffers: shared hit=987
               ->  Bitmap Index Scan on part_i_ptbpa_o_p_a_id_where_not_null  (cost=0.00..102.53 rows=8812 width=0) (actual time=1.049..1.049 rows=4325 loops=1)
                     Index Cond: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
                     Buffers: shared hit=9
         ->  Index Scan using property_tax_bills_pkey on public.property_tax_bills  (cost=0.56..8.43 rows=1 width=147) (actual time=0.006..0.006 rows=0 loops=4325)
               Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
               Index Cond: (property_tax_bills.id = property_tax_bill_parsed_addresses.property_tax_bill_id)
               Filter: (property_tax_bills.statement_date = '2021-11-20'::date)
               Rows Removed by Filter: 1
               Buffers: shared hit=21627
               Worker 0:  actual time=0.006..0.006 rows=0 loops=1295
                 Buffers: shared hit=6476
               Worker 1:  actual time=0.007..0.007 rows=0 loops=1099
                 Buffers: shared hit=5496
 Planning:
   Buffers: shared hit=498
 Planning Time: 1.750 ms
 Execution Time: 19.000 ms

While on Production, the same command run via heroku pg:psql returns this:

 Nested Loop  (cost=0.20..7.50 rows=1 width=147) (actual time=15.895..2152.165 rows=101 loops=1)
   Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
   Inner Unique: true
   Buffers: shared hit=5581088
   ->  Index Scan using index_property_tax_bills_on_statement_date on public.property_tax_bills  (cost=0.09..3.38 rows=1 width=147) (actual time=0.034..208.051 rows=1110860 loops=1)
         Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
         Index Cond: (property_tax_bills.statement_date = '2021-11-20'::date)
         Buffers: shared hit=26788
   ->  Index Scan using i_pr_tax_bill_p_a_o_p_r_b_id on public.property_tax_bill_parsed_addresses  (cost=0.11..4.12 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=1110860)
         Output: property_tax_bill_parsed_addresses.id, property_tax_bill_parsed_addresses.property_tax_bill_id, property_tax_bill_parsed_addresses.parsed_address_id, property_tax_bill_parsed_addresses.raw_address, property_tax_bill_parsed_addresses.processed_address, property_tax_bill_parsed_addresses.created_at, property_tax_bill_parsed_addresses.updated_at, property_tax_bill_parsed_addresses.is_verified, property_tax_bill_parsed_addresses.error_message
         Index Cond: (property_tax_bill_parsed_addresses.property_tax_bill_id = property_tax_bills.id)
         Filter: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
         Rows Removed by Filter: 1
         Buffers: shared hit=5554300
 Planning Time: 0.225 ms
 Execution Time: 2152.224 ms

As you can see, the query plan locally is far more complex and estimates out to be much longer. However on prod, the query plan is straightforward and intuitive, estimates quickly, but somehow actually executes over 2 orders of magnitude slower than locally. The app is running live, but with very little traffic (4-5 users, max 15 requests/minute).

Some additional information on specs of each machine:

Local has 64GB of RAM and 12 cores running on a NVME roughly capable of 5 GB/s of read/write - Postgres version 13.4

Production is a standard-9 764GB of RAM postgres instance on Heroku. - Postgres client version 13.4

CodePudding user response:

This is your problem:

Index Scan using i_pr_tax_bill_p_a_o_p_r_b_id on public.property_tax_bill_parsed_addresses (cost=0.11..4.12 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=1110860) ... Index Cond: (property_tax_bill_parsed_addresses.property_tax_bill_id = property_tax_bills.id) Filter: (property_tax_bill_parsed_addresses.parsed_address_id = 2)

Rows Removed by Filter: 1

It's doing 1110860 index scans and after successfully finding the data, removing most of it.

Add the parsed_address_id to this index, to avoid the filtering afterwards.

CREATE INDEX idx_name_of_your_index ON property_tax_bill_parsed_addresses (property_tax_bill_id,parsed_address_id);

Does the query plan change when you have this index in place?

CodePudding user response:

Thanks to Frank Heikens we discovered that a more efficient index was not being used. We had both a unique index on property_tax_bill_id AND a unique index on (property_tax_bill_id, parsed_address_id).

By deleting the redundant unique index on property_tax_bill_id and then re-adding it, it nudged the query planner into using a faster plan - here is the new resulting plan:

 Nested Loop  (cost=0.20..652.83 rows=3 width=147) (actual time=26.039..26.505 rows=101 loops=1)
   Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
   Inner Unique: true
   Buffers: shared hit=25524
   ->  Index Scan using part_i_ptbpa_o_p_a_id_where_not_null on public.property_tax_bill_parsed_addresses  (cost=0.09..208.09 rows=108 width=8) (actual time=0.026..4.929 rows=4325 loops=1)
         Output: property_tax_bill_parsed_addresses.id, property_tax_bill_parsed_addresses.property_tax_bill_id, property_tax_bill_parsed_addresses.parsed_address_id, property_tax_bill_parsed_addresses.raw_address, property_tax_bill_parsed_addresses.processed_address, property_tax_bill_parsed_addresses.created_at, property_tax_bill_parsed_addresses.updated_at, property_tax_bill_parsed_addresses.is_verified, property_tax_bill_parsed_addresses.error_message
         Index Cond: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
         Buffers: shared hit=3899
   ->  Index Scan using property_tax_bills_pkey on public.property_tax_bills  (cost=0.11..4.12 rows=1 width=147) (actual time=0.005..0.005 rows=0 loops=4325)
         Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
         Index Cond: (property_tax_bills.id = property_tax_bill_parsed_addresses.property_tax_bill_id)
         Filter: (property_tax_bills.statement_date = '2021-11-20'::date)
         Rows Removed by Filter: 1
         Buffers: shared hit=21625
 Planning:
   Buffers: shared hit=20
 Planning Time: 0.593 ms
 Execution Time: 26.554 ms

It is now using other partial indices which seem to speed things up. While still not using the fastest index (the multi-column across parsed_address_id and property_tax_bill_id), the performance increase is sufficient for our needs.

This is still a bit of a mystery, as I'm not sure how the deletion and re-addition so drastically changed the query plan, but I don't want to look a gift horse in the mouth.

  •  Tags:  
  • Related