Home > Net >  How to speed up SELECT for a JSONB column in Postgres when the first level key is unknown?
How to speed up SELECT for a JSONB column in Postgres when the first level key is unknown?

Time:01-05

I have a table with a JSONB column called "attributes" that contains a JSON object with various keys and values. The keys are dynamic and I do not know their names until the time of the query. I have over 20 million rows in this table and the queries on this column are currently very slow. Is there a way to improve the search performance in this scenario without using dynamically generated indexes?

How my data stored:

attributes
JSONB

JSON looks like this:

{
  dynamicName1: 'value',
  dynamicName2: 'value',
  dynamicName3: 'value',
  ...
}

Example of query:

SELECT * FROM table WHERE "attributes" ->> 'dynamicName1' = 'SomeValue'
SELECT * FROM table WHERE "attributes" ->> 'abcdefg' = 'SomeValue'
SELECT * FROM table WHERE "attributes" ->> 'anyPossibleName' = 'SomeValue'

Create table:

CREATE TABLE "table" ("id" SERIAL NOT NULL, "attributes" JSONB)

Explain:

Gather  (cost=1000.00..3460271.08 rows=91075 width=1178)
  Workers Planned: 2
"  ->  Parallel Seq Scan on ""table""  (cost=0.00..3450163.58 rows=37948 width=1178)"
"        Filter: ((""attributes"" ->> 'Beak'::text) = 'Yellow'::text)"

I have attempted to research the use of indexes to improve search performance on JSONB columns, but have been unable to find any information that specifically addresses my scenario where the keys in the JSON object are dynamic and unknown until the time of the query.

CodePudding user response:

You don't need to specify the keys within the jsonb object to build a useful index on its column.

create index on "table" using gin("attributes" jsonb_path_ops);

and then use @@jsonpath or @>jsonb operators that are supported by GIN. You can omit the jsonb_path_ops operator class if you'll need to use other operators with this index.

select * from "table" where "attributes" @@ '$.dynamicName1 == "SomeValue"';
select * from "table" where "attributes" @> '{"dynamicName1":"SomeValue"}'::jsonb;

Online demo where this speeds things up about three orders of magnitude on 400k random records.

  • Related