Home > Enterprise >  Redshift how to split a stringified array into separate parts
Redshift how to split a stringified array into separate parts

Time:09-13

Say I have a varchar column let's say religions that looks like this: ["Christianity", "Buddhism", "Judaism"] (yes it has a bracket in the string) and I want the string (not array) split into multiple rows like "Christianity", "Buddhism", "Judaism" so it can be used in a WHERE clause.

Eventually I want to use the results of the query in a where clause like this:

SELECT ...
FROM religions
WHERE name in 
    (
        <this subquery>
    )

How can one do this?

CodePudding user response:

You can use the function JSON_PARSE to convert the varchar string into an array. Then you can use the strategy described in Convert varchar array to rows in redshift - Stack Overflow to convert the array to separate rows.

CodePudding user response:

You can do the following.

  1. Create a temporary table with sequence of numbers
  2. Using the sequence and split_part function available in redshift, you can split the values based on the numbers generated in the temporary table by doing a cross join.
  3. To replace the double quote and square brackets, you can use the regexp_replace function in Redshift.
create temp table seq as
with recursive numbers(NUMBER) as
(
select 1 UNION ALL
select NUMBER   1 from numbers where NUMBER < 28
)
select * from numbers;

select regexp_replace(split_part(val,',',seq.number),'[]["]','') as value
from
(select '["christianity","Buddhism","Judaism"]' as val) -- You can select the actual column from the table here.
cross join
seq
where seq.number <= regexp_count(val,'[,]') 1;
  • Related