Postgres Gin Index read performance

Time:09-08

The official doc states that:

One advantage of the separate-column approach over an expression index is that ... Another advantage is that searches will be faster, since it will not be necessary to redo the to_tsvector calls to verify index matches.
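
For concreteness, here is a minimal sketch of the two approaches the quoted passage compares. The table, column, and index names are hypothetical, and the generated column assumes PostgreSQL 12 or later.

```sql
-- Approach 1: expression index. The tsvector is not stored in the table;
-- it is recomputed from body whenever a row has to be verified.
CREATE TABLE docs_expr (
    id   bigserial PRIMARY KEY,
    body text
);
CREATE INDEX docs_expr_fts_idx
    ON docs_expr USING gin (to_tsvector('english', body));

-- Approach 2: separate column. The tsvector is computed once at write time
-- and stored, so a verification step can read it directly.
CREATE TABLE docs_col (
    id       bigserial PRIMARY KEY,
    body     text,
    body_tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED
);
CREATE INDEX docs_col_fts_idx
    ON docs_col USING gin (body_tsv);

-- The query must repeat the indexed expression exactly to use docs_expr_fts_idx.
SELECT id FROM docs_expr
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'index & scan');

-- With the stored column, the query refers to the column instead.
SELECT id FROM docs_col
WHERE body_tsv @@ to_tsquery('english', 'index & scan');
```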

Why does a GIN expression index on to_tsvector('english', body) have to "verify index matches"? Indexes are updated automatically on every insert and update, and every index has that same maintenance cost, so that does not seem to be the concern here.

CodePudding user response:

I think this refers to the "recheck", which is necessary because a GIN index scan is potentially lossy: it returns rows that contain all the elements of the tsquery you search for, and those rows are then rechecked to see whether they really match the tsquery. That means that the to_tsvector function is evaluated for every row the index scan returns.
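
One way to look for a recheck is to read the plan from EXPLAIN (ANALYZE). The following is a sketch on a throwaway temporary table. The table name and sample data are made up for illustration, and enable_seqscan is turned off only so that the small table still uses the index.

```sql
-- Hypothetical demo table with an expression index, as in the question.
CREATE TEMP TABLE fts_recheck_demo (
    id   serial PRIMARY KEY,
    body text
);

INSERT INTO fts_recheck_demo (body)
SELECT CASE WHEN g % 2 = 0 THEN 'an index scan example'
            ELSE 'a sequential read example' END
FROM generate_series(1, 10000) AS g;

CREATE INDEX fts_recheck_demo_idx
    ON fts_recheck_demo USING gin (to_tsvector('english', body));
ANALYZE fts_recheck_demo;

-- Discourage a sequential scan so the planner picks the GIN index.
SET enable_seqscan = off;

EXPLAIN (ANALYZE, COSTS OFF)
SELECT id FROM fts_recheck_demo
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'index & scan');

RESET enable_seqscan;
```

In the Bitmap Heap Scan node, a "Recheck Cond" line is always printed, whether or not any row was actually rechecked. The lines to look for are "Rows Removed by Index Recheck" and a "lossy" count under "Heap Blocks", which only appear when they are nonzero.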

CodePudding user response:

As the docs say, that is more important for GiST than for GIN.

GIN index scans can still need a recheck if work_mem is too small to hold the entire bitmap, so that the bitmap goes lossy. They also need a recheck if the query uses position operators such as <-> or <2>.
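
Both cases can be tried out with a sketch like the one below. The table name, data, and row count are made up for illustration, and whether the bitmap actually goes lossy depends on the table size and the PostgreSQL version, so treat the comments as things to look for rather than guaranteed output.

```sql
-- Hypothetical demo table, large enough that a tiny work_mem may matter.
CREATE TEMP TABLE fts_lossy_demo (
    id   serial PRIMARY KEY,
    body text
);

INSERT INTO fts_lossy_demo (body)
SELECT 'row ' || g ||
       CASE g % 3
           WHEN 0 THEN ' index scan'          -- lexemes adjacent
           WHEN 1 THEN ' scan of the index'   -- both lexemes, not adjacent
           ELSE        ' sequential read'     -- neither lexeme
       END
FROM generate_series(1, 200000) AS g;

CREATE INDEX fts_lossy_demo_idx
    ON fts_lossy_demo USING gin (to_tsvector('english', body));
ANALYZE fts_lossy_demo;

SET enable_seqscan = off;

-- Case 1: a small work_mem (64kB is the minimum setting) may force a lossy
-- bitmap. Look for "Heap Blocks: exact=... lossy=..." in the plan.
SET work_mem = '64kB';
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM fts_lossy_demo
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'scan');
RESET work_mem;

-- Case 2: a phrase query. The index can find rows containing both lexemes,
-- but adjacency has to be verified against the recomputed tsvector.
-- Look for "Rows Removed by Index Recheck" in the plan.
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM fts_lossy_demo
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'index <-> scan');

RESET enable_seqscan;
```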

It might also need rechecking if you have many tokens &ed together and it just decides to recheck the more common of them rather than bothering with all the bitmaps for them. I'm not sure whether it actually does this here: I've never witnessed it for @@, but without having inspected the entire code I can't rule out the possibility. The same may be true if you have complicated boolean tsquery expressions.
