Home > other >  regexp_substr Pattern for line splitting where separator is line break. Oracle 19c
regexp_substr Pattern for line splitting where separator is line break. Oracle 19c

Time:09-24

How to correctly compose a mask for regexp_substr in order to get data of the form where each value is in a new line at the output? I do this, but this query does not return results, in regexp_substr I use the mask '[' || chr (10) || ']'.

select ft.field_id , regexp_substr(ft.validation_data,'['||chr(10)||']', 1, rownum) str 
            from mytable ft
            where ft.validation_data is not null
connect by rownum <= regexp_count(ft.validation_data,'['||chr(10)||']');

In the validation_data field of the mytable, the data each line looks like this:

-
ПДКП
ДКП
ДКП(Знач1/Знач2)
ПДУПА
ДУПА

Or

Знач1/Знач2
Знач3/Знач4
Знач5/Знач6

Each value is separated by a line break. At the beginning of each line there can be a character - or one or more spaces. I need to get this view:

field_id str
1 -
2 ПДКП
3 ДКП
4 ДКП(Знач1/Знач2)
5 ДУПА
6 ДКП(Знач1/Знач2)
7 Знач1/Знач2

etc

CodePudding user response:

If your values do not contain whitespaces inside you can simply use negated whitespace class

select rownum, regexp_substr(tt, '\S ', 1, rownum)
from (
select '-
ПДКП
ДКП
ДКП(Знач1/Знач2)
ПДУПА
ДУПА' tt from dual
)
connect by level <= regexp_count(tt, '\S ');

https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=bd6c0270ce753da4033e79d4bd5b18c3

If can - use '[^'||chr(13)||chr(10)||'] '

CodePudding user response:

Or this:

with mytable as (
select 1 field_id,
'1asdasdad5
 2asdasd
3ФЫВФЫВ' validation_data
from dual
)
select ft.field_id, regexp_substr(ft.validation_data, '[^'||chr(10)||'] ',1,level) str
from mytable ft
where ft.validation_data is not null
connect by rownum <= regexp_count(ft.validation_data, chr(10)) 1

CodePudding user response:

Depending on the title and the CHR(10) character used, the expected result set seems wrong. Since you just want to split the strings by line feed character, no need duplicates, no need to extract the substring inside parentheses...

I presume that there is an id column coming from the table, and field_id column which is intended to be generated sequentially through use of rownum. But, the query will fail to yield decent results if rownum pseudocolumn is used to replace the value of field_id whenever the table has more than one rows. Then replace it with level keyword.

Considering all those facts, you can convert your query to the following one

 SELECT id, level AS field_id,
        REGEXP_SUBSTR(validation_data, '[^'||CHR(10)||'] ', 1, level) AS str
   FROM mytable 
CONNECT BY level <= REGEXP_COUNT(validation_data, CHR(10)) 1
    AND PRIOR SYS_GUID() IS NOT NULL
    AND PRIOR id = id        

or alternatively

 SELECT id, column_value AS field_id,
        REGEXP_SUBSTR(validation_data, '[^'||CHR(10)||'] ', 1, column_value) AS str
   FROM mytable,
        TABLE(CAST(MULTISET(SELECT level
                              FROM dual 
                           CONNECT BY level <= 
                     REGEXP_COUNT(validation_data,CHR(10)) 1) AS sys.odcivarchar2list ))

Demo

  • Related