Consider the following situation:
- I have a large PostgreSQL table with a primary key of type UUID. The UUIDs are generated randomly and spread uniformly across the UUID space.
- I partition the table on this UUID column into 256 ranges (e.g. based on the first 8 bits of the UUID).
- All partitions are stored on the same physical disk.
Basically this means all 256 partitions will be used equally (unlike with time-based partitioning, where the most recent partition would normally be hotter than the others).
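As a concrete illustration, such a layout could be declared with declarative range partitioning on the UUID column; the table and column names here (events, uuid_key) are hypothetical, and only two of the 256 partitions are spelled out:

```sql
-- Hypothetical parent table, range-partitioned on the UUID primary key.
CREATE TABLE events (
    uuid_key uuid NOT NULL,
    payload  text,
    PRIMARY KEY (uuid_key)
) PARTITION BY RANGE (uuid_key);

-- One partition per leading byte of the UUID; two of the 256 shown.
CREATE TABLE events_p00 PARTITION OF events
    FOR VALUES FROM ('00000000-0000-0000-0000-000000000000')
                 TO ('01000000-0000-0000-0000-000000000000');
CREATE TABLE events_p01 PARTITION OF events
    FOR VALUES FROM ('01000000-0000-0000-0000-000000000000')
                 TO ('02000000-0000-0000-0000-000000000000');
-- ... and so on up to events_pff, whose upper bound is MAXVALUE.
```

Since PostgreSQL 11 you could alternatively write PARTITION BY HASH (uuid_key) with MODULUS 256 partitions, which gives the same uniform spread without spelling out the range bounds.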
Will I see any performance improvement at all by doing this type of partitioning:
- For queries based on the UUID, returning a single row (WHERE uuid_key = :id)?
- For other queries that must search all partitions?
CodePudding user response:
Will I see any performance improvement at all by doing this type of partitioning:
For queries based on the UUID, returning a single row (WHERE uuid_key = :id)?
Yes: PostgreSQL will search only the right partition. You can also gain performance on inserts and updates by reducing page contention.
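You can verify this with EXPLAIN: with partition pruning (on by default), the plan for a single-UUID lookup should touch only one partition. A sketch, assuming a hypothetical partitioned table events with primary key uuid_key and a made-up UUID literal:

```sql
-- The plan should show a scan of exactly one partition,
-- not an Append over all 256.
EXPLAIN
SELECT *
FROM events
WHERE uuid_key = '3f2b8c9e-1111-4a2b-8c3d-0123456789ab'::uuid;
```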
- For other queries that must search all partitions?
Not really, but index design can minimize the problem.
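One such mitigation: since PostgreSQL 11, creating an index on the partitioned parent automatically creates a matching index on every partition, so a query that cannot be pruned at least gets an index scan in each partition instead of 256 sequential scans. A sketch with hypothetical names (events table, created_at column):

```sql
-- Cascades to all partitions, existing and future (PG 11+).
CREATE INDEX events_created_at_idx ON events (created_at);
```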
CodePudding user response:
Most queries will become slower. For example, if you search by uuid_key, the optimizer has to determine which partition to search, and that work grows with the number of partitions. The index scan itself will not be notably faster on a small table than on a big one.
You could benefit if you have several tables partitioned alike and you join them on the partitioning key, so that you get a partitionwise join (but remember to set enable_partitionwise_join = on). There are similar speed gains for partitionwise aggregates.
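A sketch of that setup, assuming two hypothetical tables events and event_tags that are both partitioned identically on uuid_key:

```sql
-- Both settings are off by default; enable per session or in
-- postgresql.conf. The planner can then join and aggregate
-- matching partition pairs individually.
SET enable_partitionwise_join = on;
SET enable_partitionwise_aggregate = on;

SELECT e.uuid_key, count(*)
FROM events e
JOIN event_tags t ON t.uuid_key = e.uuid_key
GROUP BY e.uuid_key;
```

Note that the partitioning scheme of the two tables must match exactly for the planner to use a partitionwise strategy.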
Even though you cannot expect a performance gain for your query, partitioning may still have its use, for example if you need several autovacuum workers to process a single table.