I have an articles
table with these (relevant) fields
CREATE TABLE
IF NOT EXISTS articles (
...
...
title TEXT NOT NULL,
content JSONB NOT NULL,
search_index TSVECTOR NOT NULL,
...
...
);
with an index on search_index
CREATE INDEX articles_search_index ON articles USING gin(search_index);
And I want to be able to search both title
and content
without any particular priority.
I want to treat entire content
data as a blob of text without actually changing its structure.
What I'd hope would work:
CREATE TRIGGER articles_search_index_update BEFORE INSERT
OR
UPDATE
ON articles FOR EACH ROW EXECUTE FUNCTION
tsvector_update_trigger
(search_index,
'pg_catalog.english',
title,
content
);
Error - column \"content\" is not of a character type
which is understandable since tsvector_update_trigger
expects text columns.
Is it possible to have jsonb
content in that function?
CodePudding user response:
You could cast the entire json to a string by using content::text
instead of content
in the update trigger, but then all the brackets, keys and commas of the json would also get thrown in which could be problematic. If content always contains the same simple properties it will be much cleaner to manually extract the (relevant) properties, cast them to text where necessary and concatenating them together instead.
CodePudding user response:
You can use a generated column with jsonb_to_tsvector()
. As you also want to index the title, you will need to concatenate the content
column with a JSON value containing the title.
Something along the lines:
CREATE TABLE IF NOT EXISTS articles
(
...
title TEXT NOT NULL,
content JSONB NOT NULL,
search_index TSVECTOR generated always as
(jsonb_to_tsvector('english', content||jsonb_object(array['title', title]), '"all"')) stored
);
Note that you can not use jsonb_build_object()
as that is unfortunately not marked as "immutable", but jsonb_object()
is.
If the JSON stored in the content
column can contain a top-level key title
, you will need to use a different key, maybe jsonb_object(array['__title', title])
.
One advantage I can see (apart from not having to manage the trigger) is, that this only indexes the actual JSON values not the keys, so I would expect this to result in more accurate search results.