I have a Postgres table with a boolean column archived (true/false). I archive data every month, so by now about 90% of the rows have archived = true.
When I query the table, 99% of the time I only want rows with archived = false, but these queries are slow because the index is not used when I join with another table or add multiple filters. Even if I force Postgres to use this index, I still cannot make use of indexes on other columns when filtering.
I am struggling to understand how to optimize such a design so that the query is always fast.
CodePudding user response:
PostgreSQL has some interesting index features, one of them being partial indexes. A partial index is a good way to save disk space and improve record lookup performance. The common rule is "use it when you can afford it": if all of your queries involve the same filter, it is usually a good idea to include that filter in the index definition to reduce the index size and improve its performance. Here is an example of how to create a partial index:
create index non_archive_order_date_idx
    on orders (order_date)
    where archived = false;
If the query we are optimizing includes the same archived = false filter, this index will be considered by the query planner. And since the filter targets only a small fraction of the rows in the table, we should see a significant performance improvement for the corresponding index scans.
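To confirm that the planner actually picks the partial index, you can run the query through EXPLAIN and look for an index scan on non_archive_order_date_idx. A minimal sketch, assuming the orders table from the example above (the order_date range is just an illustrative extra filter):
-- check which plan the query actually gets
explain (analyze, buffers)
select *
from orders
where archived = false
  and order_date >= date '2024-01-01';
Note that the query's WHERE clause has to match (or clearly imply) the index predicate in a form the planner can recognize, so keeping the condition written exactly as archived = false is the safest way to ensure the partial index is used.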