Optimize a postgres query where data is limited by a column


I have a Postgres table with a boolean column archived (true/false). I archive the data every month, so by now about 90% of the rows have archived=true.

When I query the table, I only want rows with archived=false 99% of the time, but the query is slow because the planner does not use the index once I join with another table or add multiple filters. Even if I force Postgres to use this index, I still cannot make use of indexes on the other columns I am filtering on.

I am struggling to understand how to optimize such a design so that queries stay fast.
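For illustration, this is the kind of query I mean (the table and column names other than archived are made up):

select o.id, o.order_date, c.name
from orders o
join customers c on c.id = o.customer_id
where o.archived = false
  and o.order_date >= '2024-01-01'
  and o.status = 'OPEN';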

CodePudding user response:

PostgreSQL has some interesting index methods, one of which is the partial index. A partial index is a good way to save disk space and improve record lookup performance. The common rule is "use it when you can afford it": if all your queries involve the same filter, it is usually a good idea to include that filter in the index definition, which reduces the index's size and improves its performance. Here is an example of how to create a partial index:

create index non_archived_order_date_idx
on orders (order_date)
where archived = false;
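If your slow queries also filter on other columns, those columns can be added to the key of the same partial index so that a single index scan covers all the predicates. A sketch, using the hypothetical status column from the example query above:

create index non_archived_status_date_idx
on orders (status, order_date)
where archived = false;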

If the query we're optimizing includes the same filter on the archived column, this index will be considered by the query planner. And since the filter targets quite a small part of the overall row volume in the table, we should see a significant performance improvement for the corresponding index scans.
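To verify that the planner actually picks the partial index, you can run EXPLAIN on a query that repeats the filter verbatim (the date literal is just an example):

explain analyze
select *
from orders
where archived = false
  and order_date >= '2024-01-01';

The plan should show an Index Scan (or Bitmap Index Scan) on non_archived_order_date_idx instead of a sequential scan over the whole table.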
