From a space delimited string, i would like to remove all words that are long from 1 to 3 characters.
For example: this string
LCCPIT A2 LCCMAD B JBPM_JIT CCC
should become
LCCPIT LCCMAD JBPM_JIT
So, A2, B and CCC words are removed (since they are long 2, 1 and 3 characters). Is there a way to do it? I think i could use REGEXP_REPLACE
, but i didn't find the correct regular expression to have this result.
CodePudding user response:
Split string to words and aggregate back only these substrings whose length is greater than 3.
Sample data:
SQL> with test (col) as
2 (select 'LCCPIT A2 LCCMAD B JBPM_JIT CCC' from dual)
Query begins here:
3 select listagg(val, ' ') within group (order by lvl) result
4 from (select regexp_substr(col, '[^ ] ', 1, level) val,
5 level lvl
6 from test
7 connect by level <= regexp_count(col, ' ') 1
8 )
9 where length(val) > 3;
RESULT
--------------------------------------------------------------------------------
LCCPIT LCCMAD JBPM_JIT
SQL>
CodePudding user response:
I prefer a regex replacement trick:
SELECT TRIM(REGEXP_REPLACE(val, '(^|\s )\w{1,3}(\s |$)', ' '))
FROM dual;
-- output is 'LCCPIT LCCMAD JBPM_JIT'
Demo
The strategy above is match any 1, 2, or 3 letter word, along with any surrounding whitespace, and to replace with just a single space. The outer call to TRIM()
is necessary to remove dangling spaces which might arise from the first or last word being removed.