Home > Enterprise >  Transform JSON array to boolean columns in PostgreSQL
Transform JSON array to boolean columns in PostgreSQL

Time:02-22

I have a column that contains a JSON array of strings, which I would like to transform into boolean columns. These columns are true if the value was present in the array.

Let's say I have the following columns in Postgres.

|"countries"|
---------------
["NL", "BE"]
["UK"]

I would like to transform this into boolean columns per market. e.g.

|"BE"|"NL"|"UK"|
--------------------
|True|True|False|
|False|False|True|

I know I can manually expand it using case statements for each country code, but there are 200 countries.

Is there are more elegant solution?

CodePudding user response:

Displaying a various list of columns whose labels are known only at the runtime is not so obvious with postgres. You need some dynamic sql code.

Here is a full dynamic solution whose result is close from your expected result and which relies on the creation of a user-defined composite type and on the standard functions jsonb_populate_record and jsonb_object_agg :

First you create the list of countries as a new composite type :

CREATE TYPE country_list AS () ;
CREATE OR REPLACE PROCEDURE country_list () LANGUAGE plpgsql AS
$$
DECLARE country_list text ;
BEGIN
  SELECT string_agg(DISTINCT c.country || ' text', ',')
    INTO country_list
    FROM your_table 
   CROSS JOIN LATERAL jsonb_array_elements_text(countries) AS c(country) ;

  EXECUTE 'DROP TYPE IF EXISTS country_list' ;
  EXECUTE 'CREATE TYPE country_list AS (' || country_list || ')' ;
END ;
$$ ;

Then you can call the procedure country_list () just before executing the final query :

CALL country_list () ;

or even better call the procedure country_list () by trigger when the list of countries is supposed to be modified :

CREATE OR REPLACE FUNCTION your_table_insert_update()
RETURNS trigger LANGUAGE plpgsql VOLATILE AS
$$
BEGIN
  IF EXISTS ( SELECT 1
                FROM (SELECT jsonb_object_keys(to_jsonb(a.*)) FROM (SELECT(null :: country_list).*) AS a) AS b(key)
               RIGHT JOIN jsonb_array_elements_text(NEW.countries) AS c(country)
                  ON c.country = b.key
               WHERE b.key IS NULL
            )
  THEN CALL country_list () ;
  END IF ;
  RETURN NEW ;
END ;
$$ ;

CREATE OR REPLACE TRIGGER your_table_insert_update AFTER INSERT OR UPDATE OF countries ON your_table 
FOR EACH ROW EXECUTE FUNCTION your_table_insert_update() ;

CREATE OR REPLACE FUNCTION your_table_delete()
RETURNS trigger LANGUAGE plpgsql VOLATILE AS
$$
BEGIN
  CALL country_list () ;
  RETURN OLD ;
END ;
$$ ;

CREATE OR REPLACE TRIGGER your_table_delete AFTER DELETE ON your_table 
FOR EACH ROW EXECUTE FUNCTION your_table_delete() ;

Finally, you should get the expected result with the following query, except that the column label are lower case, and NULL is replacing false in the result :

SELECT (jsonb_populate_record(NULL :: country_list, jsonb_object_agg(lower(c.country), true))).*
  FROM your_table AS t
 CROSS JOIN LATERAL jsonb_array_elements_text(t.countries) AS c(country)
 GROUP BY t

full test result in dbfiddle.

  • Related