I am trying to remove template text like &#x; or &#xx; or &#xxx; from long string
Note: x / xx / xxx - is number, The length of the number is unknown, The cell type is CLOB
for example:
SELECT 'H'ello wor±ld' FROM dual
A desirable result:
Hello world
I know that regexp_replace should be used, But how do you use this function to remove this text?
CodePudding user response:
You can use
SELECT REGEXP_REPLACE(col,'&&#\d ;')
FROM t
where
&
is put twice to provide escaping for the substitution character\d
represents digits and the following- ending the pattern with
;
or just use a single ampersand ('&#\d ;'
) for the pattern as in the case of Demo , since an ampersand has a special meaning for Oracle, a usage is a bit problematic.
CodePudding user response:
In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:
UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() )
In other words, the original 'H'ello wor±ld'
should be passed to XMLQUERY as '"H'ello wor±ld"'
.
And the result will be 'H'ello wo±ld'