I have a text like that.
<div>
blabla <b> BOLD </b> babla <b> BOLD2</b> blabla
<table width="100%">
<tr>
<td valign="top" style="width:30px">-</td>
<td>blabla <b>BOLD3</b></td>
</tr>
</div>
I want to extract the word between the b tag. But there is several path.
SELECT t1,t2
FROM mytext,
XMLTABLE (
'/div'
PASSING xmltype (raw_text.inhalt)
COLUMNS t2 VARCHAR2 (1000) PATH './b', --
t1 VARCHAR2 (1000) PATH './table/tr/td/b'))
[Error] Execution (11: 1): ORA-19279: XPTY0004 - XQuery dynamic type mismatch: expected singleton sequence - got multi-item sequence
CodePudding user response:
Assuming that your HTML is valid XML (which is not always the case as it needs a single root node and all opening tags to be closed) then given the sample data:
CREATE TABLE mytext (rawtext CLOB);
INSERT INTO mytext (rawtext) VALUES (
'<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/TR/xhtml1/strict" >
<div>
blabla <b> BOLD </b> babla <b> BOLD2</b> blabla
<table width="100%">
<tr>
<td valign="top" style="width:30px">-</td>
<td>blabla <b>BOLD3</b></td>
</tr>
</table>
</div>
</html>'
);
Then you can use:
SELECT bold
FROM mytext m
CROSS JOIN XMLTABLE (
XMLNAMESPACES(DEFAULT 'http://www.w3.org/TR/xhtml1/strict'),
'//b'
PASSING XMLTYPE(m.rawtext)
COLUMNS bold VARCHAR2(1000) PATH './text()'
)
Which outputs:
BOLD BOLD
BOLD2
BOLD3
db<>fiddle here
CodePudding user response:
This version works too. And you don't have to add more html markup.
You only have to remove XMLNAMESPACES(DEFAULT 'http://www.w3.org/TR/xhtml1/strict')
CREATE TABLE mytext (
rawtext CLOB
);
INSERT INTO mytext (rawtext) VALUES (
'<div>
blabla <b> BOLD </b> babla <b> BOLD2</b> blabla
<table width="100%">
<tr>
<td valign="top" style="width:30px">-</td>
<td>blabla <b>BOLD3</b></td>
</tr>
</table>
</div>'
);
SELECT bold
FROM mytext m
CROSS JOIN XMLTABLE (
'//b'
PASSING XMLTYPE(m.rawtext)
COLUMNS bold VARCHAR2(1000) PATH './text()'
)
https://dbfiddle.uk/?rdbms=oracle_21&fiddle=0889004081eb0c86bf3b869d953cf04e