Home > Enterprise >  Avoid scan on attach partition with check constraint
Avoid scan on attach partition with check constraint

Time:07-01

I am recreating an existing table as a partitioned table in PostgreSQL 11.

After some research, I am approaching it using the following procedure so this can be done online while writes are still happening on the table:

  1. add a check constraint on the existing table, first as not valid and then validating
  2. drop the existing primary key
  3. rename the existing table
  4. create the partitioned table under the prior table name
  5. attach the existing table as a partition to the new partitioned table

My expectation was that the last step would be relatively fast, but I don't really have a number for this. In my testing, it's taking about 30s. I wonder if my expectations are incorrect or if I'm doing something wrong with the constraint or anything else.

Here's a simplified version of the DDL.

First, the inserted_at column is declared like this:

inserted_at timestamp without time zone not null

I want to have an index on the ID even after I drop the PK for existing queries and writes, so I create an index:

create unique index concurrently my_events_temp_id_index on my_events (id);

The check constraint is created in one transaction:

alter table my_events add constraint my_events_2022_07_events_check
check (inserted_at >= '2018-01-01' and inserted_at < '2022-08-01')
not valid;

In the next transaction, it's validated (and the validation is successful):

alter table my_events validate constraint my_events_2022_07_events_check;

Then before creating the partitioned table, I drop the primary key of the existing table:

alter table my_events drop constraint my_events_pkey cascade;

Finally, in its own transaction, the partitioned table is created:

alter table my_events rename to my_events_2022_07;

create table my_events (
  id uuid not null,
  ... other columns,
  inserted_at timestamp without time zone not null,
  primary key (id, inserted_at)
) partition by range (inserted_at);

alter table my_events attach partition my_events_2022_07
for values from ('2018-01-01') to ('2022-08-01');

That last transaction blocks inserts and takes about 30s for the 12M rows in my test database.

Edit

I wanted to add that in response to the attach I see this:

INFO: partition constraint for table "my_events_2022_07" is implied by existing constraints

That makes me think I'm doing this right.

CodePudding user response:

The problem is not the check constraint, it is the primary key.

If you make the original unique index include both columns:

create unique index concurrently my_events_temp_id_index on my_events (id,inserted_at);

And if you make the new table have a unique index rather than a primary key on those two columns, then the attach is nearly instantaneous.

These seem to me like unneeded restrictions in PostgreSQL, both that the unique index on one column can't be used to imply uniqueness on the both columns, and that the unique index on both columns cannot be used to imply the primary key (nor even a unique constraint--but only a unique index).

  • Related