Home > other >  Postgres full text search dictionary strip special charachters
Postgres full text search dictionary strip special charachters

Time:02-26

I'm using postgres full text search for (amoung other things) to provide autocomplete functionality for usernames and tags. However, I'd like autocomplete to match the column value 'dashed-tag-example' against a ts_query like 'dashedtag:*'.

My understanding is that, to do this without duplicating the column in my table I need to create a dictionary along the lines of the simple dictionary that strips charachters like '-'. Is it possible to create such a dictionary using SQL (i.e. something I could put in a rails migration)?

It seems like it should somehow be possible to define a dictionary (or do I need a parser?) that uses postgres's regexp substition functions but I can't seem to find any examples online of how to create a dictionary (parser?) like that. Is this possible? How?

CodePudding user response:

The dictionary is too late; you would need a different parser, which would require writing C code.

The simple and pragmatic solution is to use replace() to strip the - when you construct the tsvector.

You don't need to create a new column for that, simply search like this:

SELECT ... FROM ...
WHERE to_tsvector('english', replace(col, '-', ''))
      @@ to_tsquery('english', replace('search-string', '-' ''));
  • Related