Home > front end >  Can't fit regexp for a substring
Can't fit regexp for a substring

Time:10-25

I have to find and remove a substring from the text using regexp in PostgreSQL. The substring corresponds to the condition: <any text between double-quotes containing for|while inside>

Example

Text:

PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", "   Script for data loading: ", false, v_sql, 0);

So, my purpose is to find and remove the substring "Script for data loading: ". When I tried to use the script below:

SELECT regexp_replace(
   'PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", "> Table for loading: "||cc.source_table_name , false, null::text, 0);'
 , '(\")(.*(for|while)(\s).*)(\")'
 , '');

I have all the texts inside double-quotes replaced. The result looks like:

PERFORM xxkkcsort.f_write_log(||cc.source_table_name , false, null::text, 0);

What's a proper regular expression to solve the issue?

CodePudding user response:

any text between double-quotes containing for|while inside

SELECT regexp_replace(string, '"[^"]*\m(?:for|while)\M[^"]*"', '');

" ... literal " (no special meaning here, so no need to escape it)

[^"]* ... character class including all characters except ", 0-n times

\m ... beginning of a word

(?:for|while) ... two branches in non-capturing parentheses
(regexp_replace() works with simple capturing parentheses, too, but it's cheaper this way since you don't use the captured substring. But try either with the replacement '\1', where it makes a difference ...)

\M ... end of a word

[^"]* ... like above

" ... like above

I dropped \s from your expression, as the task description does not strictly require a white-space character (end of string or punctuation delimiting the word ...).

Related:

CodePudding user response:

You canuse

SELECT regexp_replace(
   'PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", "> Table for loading: "||cc.source_table_name , false, null::text, 0);',
   '"[^"]*(for|while)\s[^"]*"',
   '') AS Result;

Output:

PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", ||cc.source_table_name , false, null::text, 0);

See the regex demo and the DB fiddle. Details:

  • " - a " char
  • [^"]* - zero or more chars other than "
  • (for|while) - for or while
  • \s - a whitespace
  • [^"]*" - zero or more chars other than " and then a " char.
  • Related