I'm using a PostgreSQL full-text search tsvector column, but I found a problem:
When I search for "calça", the results contain the following:
1- calça red
2- calça blue
3- calçado red
Why is "calçado" being returned when I search for "calça"? Is there any configuration that would solve this? Thanks.
CodePudding user response:
It isn't just a matter of one string containing the other. The Portuguese stemmer treats "calçado" as "calça" plus a suffix, so both words are reduced to the same lexeme and therefore match each other. If you turn the longer word into 'calçadot', for example, it is no longer stemmed that way, because (presumably) 'adot' is not recognized as a Portuguese suffix that ought to be removed the way 'ado' is.
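To see what the stemmer does on your own server, something like the following should work. It is only a sketch that assumes the built-in `portuguese` configuration and its `portuguese_stem` dictionary are the ones in use; compare the lexemes it prints for each word.

```sql
-- Lexemes produced by the built-in 'portuguese' configuration.
SELECT to_tsvector('portuguese', 'calça')    AS short_word,
       to_tsvector('portuguese', 'calçado')  AS long_word,
       to_tsvector('portuguese', 'calçadot') AS made_up_word;

-- Ask the Snowball stemmer dictionary directly.
SELECT ts_lexize('portuguese_stem', 'calça'),
       ts_lexize('portuguese_stem', 'calçado');

-- The match from the question, reduced to a single expression.
SELECT to_tsvector('portuguese', 'calçado red')
       @@ to_tsquery('portuguese', 'calça');
```

If the first two columns show the same lexeme, that is why the search cannot tell the two words apart.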
If you don't want stemming at all, you could change the config to 'simple', which doesn't stem. But at that point, maybe you don't want full-text search at all and could just use LIKE with a pg_trgm index instead.
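A rough sketch of both options follows. The table `products` and its column `name` are hypothetical stand-ins for your own schema, and the index name is made up.

```sql
-- Option 1: 'simple' lowercases tokens but does not stem,
-- so the two words stay distinct.
SELECT to_tsvector('simple', 'calçado red') @@ to_tsquery('simple', 'calça');  -- false
SELECT to_tsvector('simple', 'calça red')   @@ to_tsquery('simple', 'calça');  -- true

-- Option 2: no full-text search, just pattern matching backed by a trigram index.
-- "products" and "name" are assumed names, not taken from the question.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX products_name_trgm_idx ON products USING gin (name gin_trgm_ops);

-- A plain substring pattern still matches 'calçado', because it contains 'calça'.
SELECT name FROM products WHERE name ILIKE '%calça%';

-- A word-boundary regex is one way to require the whole word. Whether 'ç'
-- counts as a word character depends on the database locale, so test this.
SELECT name FROM products WHERE name ~* '\mcalça\M';
```

Note that switching to 'simple' also gives up stop-word removal and plural matching, and an existing tsvector column has to be rebuilt with the new configuration before it reflects the change.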
If it is only this particular word that you don't want stemmed, I think you can set up a synonym dictionary that maps calçado to itself and place it ahead of the stemmer, which will bypass stemming for that word.
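A minimal sketch of that setup is below. The names `pt_keep` and `pt_custom` are made up for illustration, and the synonym file has to be created by hand on the server (it needs filesystem access, so this may not be possible on a managed service).

```sql
-- 1. Create the file $SHAREDIR/tsearch_data/pt_keep.syn (UTF-8) with one line:
--        calçado calçado

-- 2. Register a synonym dictionary that reads that file.
CREATE TEXT SEARCH DICTIONARY pt_keep (
    TEMPLATE = synonym,
    SYNONYMS = pt_keep
);

-- 3. Copy the stock configuration and consult the synonym dictionary first.
--    A word it recognizes never reaches the stemmer.
CREATE TEXT SEARCH CONFIGURATION pt_custom (COPY = portuguese);
ALTER TEXT SEARCH CONFIGURATION pt_custom
    ALTER MAPPING FOR asciiword, word
    WITH pt_keep, portuguese_stem;

-- 4. Check the result; the same configuration must be used for
--    both the document and the query.
SELECT to_tsvector('pt_custom', 'calçado red')
       @@ to_tsquery('pt_custom', 'calça');
```

Only the exact forms listed in the file are protected, so other forms of the word still go through the stemmer unless you add them too. The tsvector column also has to be regenerated with `pt_custom` for existing rows to pick up the change.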