Home > Enterprise >  How to replace all occurrences of target in matched group?
How to replace all occurrences of target in matched group?

Time:02-15

I want to replace all the quoted strings in a blankets into double quoted strings in the PostgreSQL.

This is my current scripts and sample input.

select regexp_replace(
    'input string',
    '\[(("([^"] )"|''([^''''] )'')(, )?) ]',
    '["\3\4"]',
    'g'
)
["The D'Ascoyne Family: The Duke", 'The Banker', 'The Parson', 'The General', 'The Admiral', 'Young Ascoyne', 'Young Henry', 'Lady Agatha']
['Fabian', '...']
['Peter Griffin', 'Stewie Griffin', 'Brian Griffin', 'Carter Pewterschmidt']

However, it replaces only last quoted string and removes everything else.

["Lady Agatha"]
["..."]
["Carter Pewterschmidt"]

The desired outputs are like the following

["The D'Ascoyne Family: The Duke", "The Banker", "The Parson", "The General", "The Admiral", "Young Ascoyne", "Young Henry", "Lady Agatha"]
["Fabian", "..."]
["Peter Griffin", "Stewie Griffin", "Brian Griffin", "Carter Pewterschmidt"]

https://regex101.com/r/dvAIBt/1

How can I overcome this issue?

CodePudding user response:

In case your strings do not contain """ substrings, you can use

REPLACE(REGEXP_REPLACE(col, '("[^"\\]*(?:\\.[^"\\]*)*")|''([^''\\]*(?:\\.[^''\\]*)*)''', '\1"\2"'), '"""', '"')

The ("[^"\\]*(?:\\.[^"\\]*)*")|'([^'\\]*(?:\\.[^'\\]*)*)' regex matches

  • ( - start of Group 1:
  • " - a " char
  • [^"\\]* - zero or more chars other than " and \
  • (?:\\.[^"\\]*)* - zero or more sequences of a \ followed with any single char and then zero or more chars other than " and \
  • " - a " char
  • ) - end of Group 1
  • | - or
  • ' - a ' char
  • ([^'\\]*(?:\\.[^'\\]*)*) - Group 2: zero or more chars other than ' and \ and then zero or more sequences of a \ followed with any single char and then zero or more chars other than ' and \
  • ' - a ' char.

Since the "..." string literals do not need changing, they will be replaced with themselves "", so they will have """ on the right, and thus, all you need is to add a REPLACE that will replace all """s with "s.

  • Related