I have a Postgres table with a boolean column archived (true/false). I archive data every month, so by now about 90% of the rows have archived = true.
When I query the table, 99% of the time I only want rows with archived = false, but these queries are slow because the index is not used when I join with another table or add multiple filters. Even if I force Postgres to use this index, I still cannot make use of indexes on other columns when filtering.
I am struggling to understand how to optimize such a design so that the query is always fast.
CodePudding user response:
PostgreSQL has some interesting index features, one of them being partial indexes. A partial index is a good way to save disk space and improve record lookup performance. The common rule is "use it when you can afford it": if all of your queries involve the same filter, it is usually a good idea to include that filter in the index definition to reduce the index size and improve its performance. Here is an example of how to create a partial index:
create index non_archive_order_date_idx
    on orders (order_date)
    where archived = false;
If the query we are optimizing includes the same archived = false filter, this index will be considered by the query planner. And since the filter targets only a small fraction of the rows in the table, we should see a significant performance improvement for the corresponding index scans.
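To confirm that the planner actually picks the partial index, you can run the query through EXPLAIN and look for an index scan on non_archive_order_date_idx. A minimal sketch, assuming the orders table from the example above (the order_date range is just an illustrative extra filter):
-- check which plan the query actually gets
explain (analyze, buffers)
select *
from orders
where archived = false
  and order_date >= date '2024-01-01';
Note that the query's WHERE clause has to match (or clearly imply) the index predicate in a form the planner can recognize, so keeping the condition written exactly as archived = false is the safest way to ensure the partial index is used.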