Home > other >  Querying JSONB array
Querying JSONB array

Time:12-21

Below is my sample JSONB data with few elements. I have a gin index on this column.

  {
  "customer_data": {
    "name": "abc",
    "country": [
      "xyz",
      "abc",
      "def"
    ],
    "account_details": [
      {
        "account_id": 1016084,
        "account_branch": "xyz",
        "account_balance": 2000,
        "transaction_dates": [
          20180125,
          20190125,
          20200125,
          20200525
        ]
      },
      {
        "account_id": 1016087,
        "account_branch": "abc",
        "account_balance": 12010,
        "transaction_dates": [
          20180125,
          20190125,
          20200125
        ]
      }
    ],
    "incorporated_year": 2020
  }
}

The JSONB country attribute is an array. In the application this column is multi select. I need to pick the data if any of the value in country array in the jsonb matches with any of the input country value (like IN condition in a SQL query). I get input from application as comma delimited which is split using string_to_array and based on this match against the jsonb column. Apart from the country other filter are also passed. All these i combine and dynamically form the total filter criteria. I want to return the row if it matches the passed filter.

So i tried matching for country array attribute within the JSONB against the passed in input.

Below query works. But i feel it will be slow as i need to add other filters dynamically and think doing it as a single expression will be faster

select * from customer_data_ms.test_customer WHERE customer_Details->'customer_data'->'country' 
?| array['xyz','gkl','jkl']

I want any row which has value of 'xyz' or 'gkl' or 'jkl' to be selected. I want to include it as part of the total expression as i will have other filter criteria's as well and tried below way.

select * from customer_data_ms.test_customer WHERE 
customer_Details @? '$.customer_data.country ?| array[''xyz'',''gkl'',''jkl'']'

select * from customer_data_ms.test_customer WHERE 
customer_Details @? '$.customer_data.country ?| (array[''xyz'',''gkl'',''jkl''])'

ERROR:  syntax error, unexpected $undefined, expecting '(' at or near "|" of jsonpath input
LINE 2: customer_Details @? '$.customer_data.country ?| array[''xyz'...

I am struggling on what i should do to compare the array with input array. Any guidance will be of great help.

CodePudding user response:

jsonpath is its own language with its own operators. ?| is an SQL operator, not a jsonpath one. As far as I know, you would have to write it as a series of expressions assembled with || if you want to write it in jsonpath.

CodePudding user response:

(a) I guess your first query is the only one which can work with postgres because the ?| operator is not implemented in the jsonpath language, see the manual. So the two last queries should fail.

(b) As stated in the manual :

The default GIN operator class for jsonb supports queries with top-level key-exists operators ?, ?& and ?| operators and path/value-exists operator @>.

So the basic gin index applied to the column customer_Details can't be used in your first query because the ?| operator tests the 3rd nested level elements, not the top-level elements of the customer_Details column. An other index that could be used in your first query is :

CREATE INDEX new_index ON customer_data_ms.test_customer USING GIN ((customer_Details->'customer_data'->'country'));

(c) Finally, if you add some other filter conditions in the WHERE clause of your first query, you can't determine in advance what strategy the postgres planner will use to execute the query, either using the gin index above, or using an other existing index, or scanning the full table. EXPLAIN ANALYSE will help you to understand the execution plan of your query. If it doesn't correspond to your expectation, then you can force somehow the planner to select the best index by modifying the query.

  • Related