Home > Back-end >  How to index a jsonb array field to perform range query on each record in the array in postgres
How to index a jsonb array field to perform range query on each record in the array in postgres

Time:08-13

I have a table reports with a jsonb field items. For each item in the report, it contains a field called score (other fields are unrelated to this question).

reports
id, items
1, [{"score": 1.0}, {"score": 2.0}]
2, [{"score": 3.0}, {"score": 9.0}]

Is it possible to add index so that it's fast for the query "give me all the reports with at least one item.score > 5"?:

SELECT id FROM reports where items @? '$[*] ? (@.score > 5.0)'; 
// will return id = 2

I have tried gin index on items but it only works for exact matches such as @.score == 5.0

CodePudding user response:

https://www.postgresql.org/docs/current/indexes-index-only-scans.html
GIN INDEX cannot index-only-scan.

https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING

The default GIN operator class for jsonb supports queries with the key-exists operators ?, ?| and ?&, the containment operator @>, and the jsonpath match operators @? and @@.

So @? is supported.
demo

CREATE INDEX jidx ON test1 USING gin (data jsonb_path_ops);

SET enable_seqscan = OFF;

EXPLAIN (
    COSTS OFF
)
SELECT
    *
FROM
    test1
WHERE
    data @ ? '$[*] ? (@.score > 5.0)';

Further tutorial: https://pganalyze.com/blog/gin-index

CodePudding user response:

You can define a function which returns the max score, then index that function and rewrite the query to use it.

create function max_score(jsonb) returns float language sql as $$ 
  select max((value->'score')::float) from jsonb_array_elements($1); 
$$ immutable;

create index on reports (max_score(items));

SELECT id FROM reports where max_score(items) > 5;

Obviously the max score being >5 is identical to at least one score being >5. This type of reduction won't be possible in all cases, so you should still consider a better data model.

  • Related