Home > Back-end >  String split without removing the delimiters
String split without removing the delimiters

Time:07-15

I'm trying to write a query that will split my string but won't remove delimiters.

Input:

D*(A B)-(C A)*(C A)

Output:

D
*
(
A
 
B
)
-
(
C
 
A
)
*
(
C
 
A
)

My delimiters:

 -/*)(

I tried :

select regexp_substr('D*(A B)-(C A)*(C A)', '[^ ] ', 1, level) from dual
  connect by regexp_substr('D*(A B)-(C A)*(C A)', '[^ ] ', 1, level)is not null 

But that removes delimiters.

CodePudding user response:

Assuming you have multi-character terms between the delimiters and can't just split on every character, you could wrap every delimiter in another single character - anything that can't appear in your original string, so possibly ^ (though you might use that to indicate an exponent) or as here |:

regexp_replace('D*(A B)-(C A)*(C A)', '([ */()-])', '|\1|')

which for your example gives:

D|*||(|A| |B|)||-||(|C| |A|)||*||(|C| |A|)|

... and then split on that new character instead:

select regexp_substr(
  regexp_replace('D*(A B)-(C A)*(C A)', '([ */()-])', '|\1|'),
  '[^|] ', 1, level)
from dual
connect by regexp_substr(
  regexp_replace('D*(A B)-(C A)*(C A)', '([ */()-])', '|\1|'),
  '[^|] ', 1, level) is not null
D
*
(
A
 
B
)
-
(
C
 
A
)
*
(
C
 
A
)

db<>fiddle

Normally this kind of simple split would be a slight concern as it misses null elements; but because the replace creates those where there are adjacent delimiters (so you get |*||(| etc.) that's OK here.

Possibly some edge cases it won't like though; such as a negative number.

CodePudding user response:

So, you want to get every character into its own row? If so, you don't need regular expressions, substr can handle that.

SQL> with test (col) as
  2    (select 'D*(A B)-(C A)*(C A)' from dual)
  3  select substr(col, level, 1) val
  4  from test
  5  connect by level <= length(col);

V
-
D
*
(
A
 
B
)
-
(
C
 
A
)
*
(
C
 
A
)

19 rows selected.

SQL>
  • Related