I am currently working on an Oracle SQL script that needs to split the ADDRESS field of the table below up to 5 separate rows. The said field is delimited by a record separator character (CHAR(30). I would like to ask for any recommended approach how to come up with the said data?
Please see sample data below.
| ID | ADDRESS |
|;--------|;-------------------------------------------------------|
| 1000000 | Xxxxx XxxxxXxxxx XxxxXxxxxx xx Xxxxxx |
| 1000001 | 61 Xxxxxxx XxxxXxxxxxxXxxx |
| 1000002 | 36 Xxxxx XxxXxxxxxxxxXxxxxxxxxxxxxxXxxxxxxxxxxxxxxx |
The expected output of the sample above is as follows.
| ID | ADDRESS1 | ADDRESS2 | ADDRESS3 | ADDRESS4 | ADDRESS5|
|;--------|;----------------|;-----------|;-----------------|;-----------------|;--------|
| 1000000 | Xxxxx Xxxxx | Xxxxx Xxxx | Xxxxxx xx Xxxxxx | | |
| 1000001 | 61 Xxxxxxx Xxxx | Xxxxxxx | Xxxx | | |
| 1000002 | 36 Xxxxx Xxx | Xxxxxxxxx | Xxxxxxxxxxxxxx | Xxxxxxxxxxxxxxxx | |
CodePudding user response:
Here's one option:
SQL> with test (id, address) as
2 (select 1000000, 'Xxxxx Xxxxx^^Xxxxx Xxxx^^Xxxxxx xx Xxxxxx' from dual union all
3 select 1000002, '36 Xxxxx Xxx^^Xxxxxxxxx^^Xxxxxxxxxxxxxx^^Xxxxxxxxxxxxxxxx' from dual
4 )
5 select id, regexp_substr(address, '[^' ||chr(30) ||'] ', 1, 1) val1,
6 regexp_substr(address, '[^' ||chr(30) ||'] ', 1, 2) val2,
7 regexp_substr(address, '[^' ||chr(30) ||'] ', 1, 3) val3,
8 regexp_substr(address, '[^' ||chr(30) ||'] ', 1, 4) val4,
9 regexp_substr(address, '[^' ||chr(30) ||'] ', 1, 5) val5
10 from test;
ID VAL1 VAL2 VAL3 VAL4 VAL5
---------- -------------------- -------------------- -------------------- -------------------- --------------------
1000000 Xxxxx Xxxxx Xxxxx Xxxx Xxxxxx xx Xxxxxx
1000002 36 Xxxxx Xxx Xxxxxxxxx Xxxxxxxxxxxxxx Xxxxxxxxxxxxxxxx
SQL>
CodePudding user response:
You can use simple string functions (which are an order of magnitude faster than regular expressions):
SELECT id,
CASE sep1
WHEN 0 THEN address
ELSE SUBSTR(address, 1, sep1 - 1)
END AS address1,
CASE
WHEN sep1 = 0 THEN NULL
WHEN sep2 = 0 THEN SUBSTR(address, sep1 1)
ELSE SUBSTR(address, sep1 1, sep2 - sep1 - 1)
END AS address2,
CASE
WHEN sep2 = 0 THEN NULL
WHEN sep3 = 0 THEN SUBSTR(address, sep2 1)
ELSE SUBSTR(address, sep2 1, sep3 - sep2 - 1)
END AS address3,
CASE
WHEN sep3 = 0 THEN NULL
WHEN sep4 = 0 THEN SUBSTR(address, sep3 1)
ELSE SUBSTR(address, sep3 1, sep4 - sep3 - 1)
END AS address4,
CASE
WHEN sep4 = 0 THEN NULL
ELSE SUBSTR(address, sep4 1)
END AS address5
FROM (
SELECT id,
address,
INSTR(address, CHR(30), 1, 1) AS sep1,
INSTR(address, CHR(30), 1, 2) AS sep2,
INSTR(address, CHR(30), 1, 3) AS sep3,
INSTR(address, CHR(30), 1, 4) AS sep4
FROM table_name
)
Which, for the sample data:
CREATE TABLE table_name (ID, ADDRESS) AS
SELECT 1000000, 'Xxxxx Xxxxx'||CHR(30)||'Xxxxx Xxxx'||CHR(30)||'Xxxxxx xx Xxxxxx' FROM DUAL UNION ALL
SELECT 1000001, '61 Xxxxxxx Xxxx'||CHR(30)||'Xxxxxxx'||CHR(30)||'Xxxx' FROM DUAL UNION ALL
SELECT 1000002, '36 Xxxxx Xxx'||CHR(30)||'Xxxxxxxxx'||CHR(30)||'Xxxxxxxxxxxxxx'||CHR(30)||'Xxxxxxxxxxxxxxxx' FROM DUAL UNION ALL
SELECT 1000003, 'ABC'||CHR(30)||CHR(30)||CHR(30)||'DEF'||CHR(30)||'HIJ' FROM DUAL;
Outputs:
ID ADDRESS1 ADDRESS2 ADDRESS3 ADDRESS4 ADDRESS5 1000000 Xxxxx Xxxxx Xxxxx Xxxx Xxxxxx xx Xxxxxx null null 1000001 61 Xxxxxxx Xxxx Xxxxxxx Xxxx null null 1000002 36 Xxxxx Xxx Xxxxxxxxx Xxxxxxxxxxxxxx Xxxxxxxxxxxxxxxx null 1000003 ABC null null DEF HIJ
If you did want to use (slower) regular expressions then:
SELECT id,
REGEXP_SUBSTR(address, '(.*?)('|| CHR(30) || '|$)', 1, 1, NULL, 1 ) AS address1,
REGEXP_SUBSTR(address, '(.*?)('|| CHR(30) || '|$)', 1, 2, NULL, 1 ) AS address2,
REGEXP_SUBSTR(address, '(.*?)('|| CHR(30) || '|$)', 1, 3, NULL, 1 ) AS address3,
REGEXP_SUBSTR(address, '(.*?)('|| CHR(30) || '|$)', 1, 4, NULL, 1 ) AS address4,
REGEXP_SUBSTR(address, '(.*?)('|| CHR(30) || '|$)', 1, 5, NULL, 1 ) AS address5
FROM table_name
Which outputs the same.
db<>fiddle here