Postgres Optimiser using parallel sequential scan instead of index scan-CodePudding

I have two tables master and a child table. So for the question I will refer master as m and child table as c.

Master table properties:

Records: 50 million
primary key (index): m_id 
btree index: modified_id

Child table properties (1-n relationship):

Records: 400 million
primary key (index): c_id
foreign key (btree index): m_id

Query plan

Gather  (cost=9159.80..6768939.59 rows=18940 width=107) (actual time=137160.885..297009.782 rows=25 loops=1)
  Output: m.<date_column>, m.m_id, m.other_unique_id_1, m.modified_id, c.<date_column>, c.c_id, c.m_id, c.<other_column_1>, c.<other_column_2>, c.<other_column_3>, c.<other_column_4>, c.<other_column_5>, c.<other_column_6>, c.<other_column_7>, c.<other_column_8>
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=11266 read=4680890
  ->  Hash Join  (cost=8159.80..6766045.59 rows=7892 width=107) (actual time=87154.268..297003.331 rows=8 loops=3)
        Output: c.<date_column>, c.m_id, c.other_unique_id_1, c.modified_id, c.<date_column>, c.c_id, c.m_id, c.<other_column_1>, c.<other_column_2>, c.<other_column_3>, c.<other_column_4>, c.<other_column_5>, c.<other_column_6>, c.<other_column_7>, c.<other_column_8>
        Inner Unique: true
        Hash Cond: (c.m_id = c.m_id)
        Buffers: shared hit=11266 read=4680890
        Worker 0: actual time=82162.730..297002.014 rows=13 loops=1
          Buffers: shared hit=3789 read=1555017
        Worker 1: actual time=42139.478..297002.514 rows=8 loops=1
          Buffers: shared hit=3634 read=1569261
        ->  Parallel Seq Scan on child c  (cost=0.00..6328357.20 rows=163629920 width=87) (actual time=0.638..279084.058 rows=130858840 loops=3)
              Output: c.<date_column>, c.c_id, c.m_id, c.<other_column_1>, c.<other_column_2>, c.<other_column_3>, c.<other_column_4>, c.<other_column_5>, c.<other_column_6>, c.<other_column_7>, c.<other_column_8>
              Buffers: shared hit=11171 read=4680887
              Worker 0: actual time=0.814..279170.920 rows=130423418 loops=1
                Buffers: shared hit=3746 read=1555017
              Worker 1: actual time=1.095..278955.836 rows=131603000 loops=1
                Buffers: shared hit=3592 read=1569260
        ->  Hash  (cost=8132.89..8132.89 rows=2153 width=20) (actual time=0.046..0.047 rows=8 loops=3)
              Output: c.<date_column>, c.m_id, c.other_unique_id_1, c.modified_id
              Buckets: 4096  Batches: 1  Memory Usage: 33kB
              Buffers: shared hit=35 read=3
              Worker 0: actual time=0.026..0.027 rows=8 loops=1
                Buffers: shared hit=13
              Worker 1: actual time=0.059..0.060 rows=8 loops=1
                Buffers: shared hit=12 read=1
              ->  Bitmap Heap Scan on master c  (cost=41.25..8132.89 rows=2153 width=20) (actual time=0.031..0.041 rows=8 loops=3)
                    Output: c.<date_column>, c.m_id, c.other_unique_id_1, c.modified_id
                    Recheck Cond: (c.modified_id = 561869)
                    Heap Blocks: exact=8
                    Buffers: shared hit=35 read=3
                    Worker 0: actual time=0.018..0.024 rows=8 loops=1
                      Buffers: shared hit=13
                    Worker 1: actual time=0.043..0.055 rows=8 loops=1
                      Buffers: shared hit=12 read=1
                    ->  Bitmap Index Scan on ix_master_modified_id  (cost=0.00..40.71 rows=2153 width=0) (actual time=0.027..0.027 rows=8 loops=3)
                          Index Cond: (c.modified_id = 561869)
                          Buffers: shared hit=11 read=3
                          Worker 0: actual time=0.015..0.015 rows=8 loops=1
                            Buffers: shared hit=5
                          Worker 1: actual time=0.038..0.038 rows=8 loops=1
                            Buffers: shared hit=4 read=1
Planning time: 0.354 ms
Execution time: 297009.825 ms

Query

select
    *
from
    master m
inner join child c on
    m.m_id = c.m_id
where
    m.modified_id = <xyz>

DDL statement for Master table

CREATE TABLE master (
    <date_column> timestamp NULL DEFAULT CURRENT_DATE,
    m_id serial4 NOT NULL,
    <other_unique_id_1> int4 NULL,
    modified_id int4 NULL,
    CONSTRAINT master_pkey PRIMARY KEY (m_id),
    CONSTRAINT <foreign key> FOREIGN KEY (<other_unique_id_1>) REFERENCES <other table>(<other_unique_id_1>)
);
CREATE INDEX ix_master_modified_id ON master USING btree (modified_id);
CREATE UNIQUE INDEX ix_master_other_unique_id_1 ON master USING btree (other_unique_id_1);

DDL for child table

CREATE TABLE child (
    <date column> timestamp NULL DEFAULT now(),
    c_id serial4 NOT NULL,
    m_id int4 NULL,
    <other_column_1> varchar(50) NULL,
    <other_column_2> varchar(50) NULL,
    <other_column_3> bool NULL,
    <other_column_4> varchar(50) NULL,
    <other_column_5> varchar(2) NULL,
    <other_column_6> varchar(50) NULL,
    <other_column_7> varchar(10) NULL,
    <other_column_8> text NULL,
    CONSTRAINT child_pkey PRIMARY KEY (c_id),
    CONSTRAINT child_master_fkey FOREIGN KEY (m_id) REFERENCES master(m_id)
);
CREATE INDEX ix_child_m_id ON child USING btree (m_id);

CodePudding user response：

I think the root of the problem is this gross misestimate.

->  Bitmap Index Scan on ix_master_modified_id  (cost=0.00..40.71 rows=2153 width=0) (actual time=0.027..0.027 rows=8 loops=3)

You said you already did VACUUM ANALYZE the table. In that case you might need to increase the stats size, and then ANALYZE again, to get a better estimate. You could either change the default_statistics_target globally, or target this one column with:

alter table master alter modified_id set statistics 10000;
analyze master;

You might not need to increase it all the way to 10000 (the max allowed), but if you only do it for one column I see no reason trying to fine tune it. Just break out the big guns right away. If it works, then you can worry about fine tuning it.