I want to replace all the quoted strings in a blankets into double quoted strings in the PostgreSQL.
This is my current scripts and sample input.
select regexp_replace(
'input string',
'\[(("([^"] )"|''([^''''] )'')(, )?) ]',
'["\3\4"]',
'g'
)
["The D'Ascoyne Family: The Duke", 'The Banker', 'The Parson', 'The General', 'The Admiral', 'Young Ascoyne', 'Young Henry', 'Lady Agatha']
['Fabian', '...']
['Peter Griffin', 'Stewie Griffin', 'Brian Griffin', 'Carter Pewterschmidt']
However, it replaces only last quoted string and removes everything else.
["Lady Agatha"]
["..."]
["Carter Pewterschmidt"]
The desired outputs are like the following
["The D'Ascoyne Family: The Duke", "The Banker", "The Parson", "The General", "The Admiral", "Young Ascoyne", "Young Henry", "Lady Agatha"]
["Fabian", "..."]
["Peter Griffin", "Stewie Griffin", "Brian Griffin", "Carter Pewterschmidt"]
https://regex101.com/r/dvAIBt/1
How can I overcome this issue?
CodePudding user response:
In case your strings do not contain """
substrings, you can use
REPLACE(REGEXP_REPLACE(col, '("[^"\\]*(?:\\.[^"\\]*)*")|''([^''\\]*(?:\\.[^''\\]*)*)''', '\1"\2"'), '"""', '"')
The ("[^"\\]*(?:\\.[^"\\]*)*")|'([^'\\]*(?:\\.[^'\\]*)*)'
regex matches
(
- start of Group 1:"
- a"
char[^"\\]*
- zero or more chars other than"
and\
(?:\\.[^"\\]*)*
- zero or more sequences of a\
followed with any single char and then zero or more chars other than"
and\
"
- a"
char)
- end of Group 1|
- or'
- a'
char([^'\\]*(?:\\.[^'\\]*)*)
- Group 2: zero or more chars other than'
and\
and then zero or more sequences of a\
followed with any single char and then zero or more chars other than'
and\
'
- a'
char.
Since the "..."
string literals do not need changing, they will be replaced with themselves ""
, so they will have """
on the right, and thus, all you need is to add a REPLACE
that will replace all """
s with "
s.