I have to find and remove a substring from the text using regexp in PostgreSQL. The substring corresponds to the condition: <any text between double-quotes containing for|while inside>
Example
Text:
PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", " Script for data loading: ", false, v_sql, 0);
So, my purpose is to find and remove the substring "Script for data loading: ". When I tried to use the script below:
SELECT regexp_replace(
'PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", "> Table for loading: "||cc.source_table_name , false, null::text, 0);'
, '(\")(.*(for|while)(\s).*)(\")'
, '');
I have all the texts inside double-quotes replaced. The result looks like:
PERFORM xxkkcsort.f_write_log(||cc.source_table_name , false, null::text, 0);
What's a proper regular expression to solve the issue?
CodePudding user response:
any text between double-quotes containing for|while inside
SELECT regexp_replace(string, '"[^"]*\m(?:for|while)\M[^"]*"', '');
"
... literal "
(no special meaning here, so no need to escape it)
[^"]*
... character class including all characters except "
, 0-n times
\m
... beginning of a word
(?:for|while)
... two branches in non-capturing parentheses
(regexp_replace()
works with simple capturing parentheses, too, but it's cheaper this way since you don't use the captured substring. But try either with the replacement '\1'
, where it makes a difference ...)
\M
... end of a word
[^"]*
... like above
"
... like above
I dropped \s
from your expression, as the task description does not strictly require a white-space character (end of string or punctuation delimiting the word ...).
Related:
CodePudding user response:
You canuse
SELECT regexp_replace(
'PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", "> Table for loading: "||cc.source_table_name , false, null::text, 0);',
'"[^"]*(for|while)\s[^"]*"',
'') AS Result;
Output:
PERFORM xxkkcsort.f_write_log("INFO", "xxkkcsort.f_load_tables__outof_order_system", ||cc.source_table_name , false, null::text, 0);
See the regex demo and the DB fiddle. Details:
"
- a"
char[^"]*
- zero or more chars other than"
(for|while)
-for
orwhile
\s
- a whitespace[^"]*"
- zero or more chars other than"
and then a"
char.