I have a table:
ID | Text |
---|---|
1 | A1 Text - words |
2 | A1 Text - wordsa |
3 | B1 Sentence - sentence |
4 | B1 Sentence - sentence b |
What I'm trying to achieve:
ID | Text |
---|---|
1 | A1 Text - words, A1 Text - wordsa |
2 | B1 Sentence - sentence, B1 Sentence - sentence b |
I'm not sure how to approach this or is there a function that would deal with this? I need at least a suggestion to get started.
EDIT - I updated the table slighty. As ofcourse I don't want to aggregate the entire table into 1 row, hopefully the tables show what I'm trying to say more clearly.
CodePudding user response:
Option 1:
If you want to find chains of successive rows with where the start of the string matches the next string then you can use a hierarchical function and collate the rows using SYS_CONNECT_BY_PATH
:
SELECT CONNECT_BY_ROOT id AS id,
SUBSTR(
SYS_CONNECT_BY_PATH(Text, ', '),
3
) AS Text
FROM table_name
WHERE CONNECT_BY_ISLEAF = 1
START WITH
id IN (SELECT id
FROM table_name
WHERE CONNECT_BY_ISLEAF = 1
AND LEVEL = 1
CONNECT BY NOCYCLE
PRIOR text LIKE text || '%'
)
CONNECT BY NOCYCLE
text LIKE PRIOR text || '%';
Which, for the sample data:
CREATE TABLE table_name (ID, Text) AS
SELECT 1, 'A1 Text - words' FROM DUAL UNION ALL
SELECT 2, 'A1 Text - wordsa' FROM DUAL UNION ALL
SELECT 3, 'B1 Sentence - sentence' FROM DUAL UNION ALL
SELECT 4, 'B1 Sentence - sentence b' FROM DUAL;
Outputs:
ID TEXT 1 A1 Text - words, A1 Text - wordsa 3 B1 Sentence - sentence, B1 Sentence - sentence b
Option 2:
From Oracle 12, if you want to find the first string and then all the strings that also start with that first string then you can use MATCH_RECOGNIZE
:
SELECT MIN(first_id) AS id,
LISTAGG(text, ',') WITHIN GROUP (ORDER BY text) AS text
FROM table_name
MATCH_RECOGNIZE(
ORDER BY text
MEASURES
FIRST(id) AS first_id,
MATCH_NUMBER() AS mno
ALL ROWS PER MATCH
PATTERN (matches )
DEFINE matches AS (text LIKE FIRST(text) || '%')
)
GROUP BY mno
Which, for the sample data, gives the same output (but may give a different output if there was more data as Option 1 is finding individual paths through the hierarchy and Option 2 is finding all children that match an ancestor).
In earlier versions, you should get the equivalent result using:
SELECT root_id AS id,
LISTAGG(text, ', ') WITHIN GROUP (ORDER BY ROWNUM) AS text
FROM (
SELECT DISTINCT
CONNECT_BY_ROOT id AS root_id,
id,
text
FROM table_name
START WITH
id IN (SELECT id
FROM table_name
WHERE CONNECT_BY_ISLEAF = 1
AND LEVEL = 1
CONNECT BY NOCYCLE
PRIOR text LIKE text || '%'
)
CONNECT BY NOCYCLE
text LIKE PRIOR text || '%'
ORDER SIBLINGS BY text
)
GROUP BY root_id
db<>fiddle here