Home > Net >  Remove template text on regexp_replace in Oracle's SQL
Remove template text on regexp_replace in Oracle's SQL

Time:09-15

I am trying to remove template text like &#x; or &#xx; or &#xxx; from long string

Note: x / xx / xxx - is number, The length of the number is unknown, The cell type is CLOB

for example:

SELECT 'H'ello wor±ld' FROM dual

A desirable result:

Hello world

I know that regexp_replace should be used, But how do you use this function to remove this text?

CodePudding user response:

You can use

SELECT REGEXP_REPLACE(col,'&&#\d ;')       
  FROM t

where

  • & is put twice to provide escaping for the substitution character
  • \d represents digits and the following provides the multiple occurrences of them
  • ending the pattern with ;

or just use a single ampersand ('&#\d ;') for the pattern as in the case of Demo , since an ampersand has a special meaning for Oracle, a usage is a bit problematic.

CodePudding user response:

In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:

UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() ) 

In other words, the original 'H'ello wor±ld' should be passed to XMLQUERY as '"H'ello wor±ld"'.

And the result will be 'H'ello wo±ld'

  • Related