This question concerns how to use FTS5's trigram tokenizer with Peewee.
The official SQLite FTS5 documentation describes support for trigram tokenization:
> The experimental trigram tokenizer extends FTS5 to support substring matching in general, instead of the usual token matching. When using the trigram tokenizer, a query or phrase token may match any sequence of characters within a row, not just a complete token.
>
>     CREATE VIRTUAL TABLE tri USING fts5(a, tokenize="trigram");
>     INSERT INTO tri VALUES('abcdefghij KLMNOPQRST uvwxyz');
I've tried setting up an FTS-based model class with Peewee, and changed its options to use the trigram tokenizer:

    class Meta:
        db_table = 'fts_test_db'
        database = test_db
        options = {'tokenize': 'trigram', 'content': PrecedentPW}
When I attempt to create the table with those options, I get this error:

    _db.create_tables([_fts])
    >> peewee.OperationalError: no such tokenizer: trigram
But if I change the tokenizer options to use something else (e.g. 'porter'), no errors are raised.
How can I use the trigram tokenizer with Peewee?
Answer:
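You may need to compile SQLite with the tokenizer yourself, or make sure the SQLite library your Python is linked against is new enough. The trigram tokenizer was not included until SQLite 3.34.0, so older builds fail with "no such tokenizer: trigram": https://www.sqlite.org/releaselog/3_34_0.html. You can see which version Python is using via `sqlite3.sqlite_version`.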
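To check this before involving Peewee, here is a minimal sketch using only the standard library. It assumes Peewee is talking to SQLite through Python's built-in `sqlite3` module (it may pick `pysqlite3` instead if that package is installed). The helper names `version_supports_trigram` and `trigram_available` are made up for illustration and are not part of Peewee or SQLite.

```python
import sqlite3

# First SQLite release that ships the FTS5 trigram tokenizer.
MIN_TRIGRAM_VERSION = (3, 34, 0)


def version_supports_trigram(version_info):
    """Return True if the given SQLite version tuple is 3.34.0 or newer."""
    return tuple(version_info) >= MIN_TRIGRAM_VERSION


def trigram_available():
    """Probe the linked SQLite library by creating a throwaway FTS5 table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute('CREATE VIRTUAL TABLE probe USING fts5(a, tokenize="trigram")')
        return True
    except sqlite3.OperationalError:
        # Raised when FTS5 is not compiled in or the tokenizer is unknown.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    print("Version is new enough:", version_supports_trigram(sqlite3.sqlite_version_info))
    print("Trigram tokenizer usable:", trigram_available())
```

If the probe reports that the tokenizer is not usable, the Peewee model will fail in the same way, and the fix is to upgrade the SQLite library (or the Python build that bundles it) rather than to change the model options.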