Home > Back-end >  How to remove apostrophe from JSON array value in Redshift table?
How to remove apostrophe from JSON array value in Redshift table?

Time:03-02

I have a JSON string in a column, for e.g. [{'id': 2746, 'name': "HaZore'a", 'lat': Decimal('32.644516'), 'lon': Decimal('35.11918')}] of my RS table. My requirement is to remove this single quote from the name value (HaZore'a) and then replace all the single quotes with double quotes and also remove the square brackets since JSON only recognizes double quotes. I tried doing this

replace(replace(left(right((replace(replace(city, '\'', '"'), 'None', '"None"')),
                                              len((replace(replace(city, '\'', '"'), 'None', '"None"'))) - 1),
                                        len((replace(replace(city, '\'', '"'), 'None', '"None"'))) - 2), '\'',
                                   '"'), 'Decimal(', ''), ')', '')

However it leaves me with a double quote in the name value like this {"id": 2746, "name": "HaZore"a", "lat": "32.644516", "lon": "35.11918"}.

How should I update my above statement to eliminate that apostrophe from the name value in the column. I have several rows where the name value has apostrophes.

Please help.

Thanks Ajinkya

CodePudding user response:

I think you will need to shift to regexp_replace() to pattern match this more correctly. See: https://docs.aws.amazon.com/redshift/latest/dg/REGEXP_REPLACE.html

Since this text is json-like your double quoted text with a single quote in it will only have alpha chars in it (besides the single quote). You can use this to pattern match.

I got a postgres fiddle up to demonstrate (see comments). Now Redshift and postgres escape single quotes in text differently so there will be a little shift needed but I'll get to that in a sec.

Data setup:

create table test as 
select '[{''id'': 2746, ''name'': "HaZore''a", ''lat'': Decimal(''32.644516''), ''lon'': Decimal(''35.11918'')}]'''::text 
as txt;

SQL query:

select replace(regexp_replace(txt, '("[[:alpha:]]*)''([[:alpha:]]*")', '\1-\2'), '''', '"') 
from test;

I changed the apostrophe to a dash to make things clearer. Feel free to change this or eliminate it as you need.

Now to escape the single quotes in Redshift I believe you need to backslash them. So the query changes to:

select replace(regexp_replace(txt, '("[[:alpha:]]*)\'([[:alpha:]]*")', '\1-\2'), '\'', '"') 
from test;

As I haven't fired up a Redshift to test in that environment please forgive any errors.

  • Related