Home > front end >  Filtering out objects from multiple arrays in a JSONB column
Filtering out objects from multiple arrays in a JSONB column

Time:12-18

I have a JSON structure with two arrays saved in a JSONB column. A bit simplified it looks like this

{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    },
    {
      "itemId": "124",
      "price": "9.00"
    },
    {
      "itemId": "125",
      "price": "8.00"
    }
  ],
  "groups": [
    {
      "groupId": "A",
      "discount": "20",
      "discountId": "1"
    },
    {
      "groupId": "B",
      "discount": "30",
      "discountId": "2"
    },
    {
      "groupId": "B",
      "discount": "20",
      "discountId": "3"
    },
    {
      "groupId": "C",
      "discount": "40",
      "discountId": "4"
    }
  ]
}

Schema:

CREATE TABLE campaign
  (
     id       TEXT PRIMARY KEY,
     data     JSONB
  );

Since each row (data column) can be fairly large, I'm trying to filter out matching item objects and group objects from the items and groups arrays.

My current query is this

SELECT * FROM campaign
WHERE 
(data -> 'items' @> '[{"productId": "123"}]') OR
(data -> 'groups' @> '[{"groupId": "B"}]')

which returns rows containing either the matching group or the matching item. However, depending on the row, the data column can be a fairly large JSON object (there may be hundreds of objects in items and tens in groups and I've omitted several keys/properties for brevity in this example) which is affecting query performance (I've added GIN indexes on the items and groups arrays, so missing indices is not why it's slow).

How can I filter out the items and groups arrays to only contain matching elements?

Given this matching row

{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    },
    {
      "itemId": "124",
      "price": "9.00"
    },
    {
      "itemId": "125",
      "price": "8.00"
    }
  ],
  "groups": [
    {
      "groupId": "A",
      "discount": "20",
      "discountId": "1"
    },
    {
      "groupId": "B",
      "discount": "30",
      "discountId": "2"
    },
    {
      "groupId": "B",
      "discount": "20",
      "discountId": "3"
    },
    {
      "groupId": "C",
      "discount": "40",
      "discountId": "4"
    }
  ]
}

I'd like the result to be something like this (the matching item/group could be in different columns from the rest of the data column - doesn't have to be returned in a single JSON object with two arrays like this, but I would prefer it if doesn't affect performance or lead to a really hairy query):

{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    }
  ],
  "groups": [
    {
      "groupId": "B"
      "discount": "20",
      "discountId": "3"
    }
  ]
}

What I've managed to do so far is unwrap and match an object in the items array using this query, which removes the 'items' array from the data column and filters out the matching item object to a separate column, but I'm struggling to join this with matches in the groups array.

SELECT data - 'items', o.obj
FROM campaign c
CROSS JOIN LATERAL jsonb_array_elements(c.data #> '{items}') o(obj)
WHERE o.obj ->> 'productId' = '124'

How can I filter both arrays in one query?

Bonus question: For the groups array I also want to return the object with the lowest discount value if possible. Or else the result would need to be an array of matching group objects instead of a single matching group.

Related questions: How to filter jsonb array elements and How to join jsonb array elements in Postgres?

CodePudding user response:

If your postgres version is 12 or more, you can use the jsonpath language and functions. The query below returns the expected result with the subset of items and groups which match the given criteria. Then you can adapt this query within a sql function so that the search criteria is an input parameter.

SELECT jsonb_set(jsonb_set( data
                          , '{items}'
                          , jsonb_path_query_array(data, '$.items[*] ? (@.itemId == "123" && @.price == "10.00")'))
                , '{groups}'
                , jsonb_path_query_array(data, '$.groups[*] ? (@.groupId == "B" && @.discount == "20" && @.discountId == "3")'))
  FROM (SELECT
'{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    },
    {
      "itemId": "124",
      "price": "9.00"
    },
    {
      "itemId": "125",
      "price": "8.00"
    }
  ],
  "groups": [
    {
      "groupId": "A",
      "discount": "20",
      "discountId": "1"
    },
    {
      "groupId": "B",
      "discount": "30",
      "discountId": "2"
    },
    {
      "groupId": "B",
      "discount": "20",
      "discountId": "3"
    },
    {
      "groupId": "C",
      "discount": "40",
      "discountId": "4"
    }
  ]
}' :: jsonb) AS d(data)
WHERE jsonb_path_exists(data, '$.items[*] ? (@.itemId == "123" && @.price == "10.00")')
  AND jsonb_path_exists(data, '$.groups[*] ? (@.groupId == "B" && @.discount == "20" && @.discountId == "3")')
  • Related