How do you create index that supports Or operations in Postgres-CodePudding

I have no experience at all with sql/postgres indexing so I'm even unsure how to formulate my question correctly. But here goes...

In one of our JpaRepositories we have a query that looks like this:

    @Query(nativeQuery = true, value =
            "SELECT * FROM instrument i "  
                    "left join instrument_profile ip ON ip.instrument_id = i.id "  
                    "WHERE (ip.profile IS NULL or upper(ip.profile) = :assetMgr)"  
                    " and i.transaction_type = :transactionType " 
                    "  and i.customer_id = :customerId "  
                    "  and ( i.bloomberg_id ilike %:key% OR "  
                    "       i.instrument_short ilike %:key% or "  
                    "       i.instrument_name ilike %:key% OR "  
                    "       i.cusip ilike %:key% OR "  
                    "       i.sedol ilike %:key%) "  
                    "limit :limit")

The query works find but we are now looking for ways to optimize performance overall and table indexing is one of them. The thing is.. I have no clue how to index a table for this kind of query. Partly because it contains a join table, and also we are searching for a "search-key" value in multiple fields. This particular table is expected to contain a lot of records so if we cant create an index that supports the above, we could be forced to change the entire query.

Any suggestions or pointers on how to approach this would be very much appreciated.

/Kat

CodePudding user response：

The Join columns should be indexed to get better performance. You can try creating index on instrument_id in instrument_profile table and instrument table id column like below

CREATE INDEX instrument_profile_instrument_id_idx ON instrument_profile (instrument_id);
CREATE INDEX instrument_instrument_id_idx ON instrument (id);

CodePudding user response：

I am thinking of concatenating the search fields into a separate column that I can index with pg_trgm and use in query. At this point it seems like a way out

You could do that, but you could also just make a pg_trgm index on the 5 separate columns. I would tend to think this is better than an index on a concatenated field. You can write the queries in a more natural way, if you sometimes want you omit one of the columns you can do. It also saves space over duplicating the data in the table.

create index on instrument using gin (bloomberg_id gin_trgm_ops, instrument_short gin_trgm_ops, instrument_name gin_trgm_ops, cusip gin_trgm_ops, sedol gin_trgm_ops);

I would also create the regular (btree) index on the other 2 columns:

create index on instrument (transaction_type , customer_id);

Depending on the row estimates, it might decide to combine these two indexes with a BitmapAnd, or it might decide to just use one or the other and then filter on whatever criteria it choose not to use the index for.