How to escape a sup tag in xpath selector

Time:11-19

I want to extract text from the HTML below, but the <sup> tag is preventing me from getting the text I want.

The text I want is simply (4:6, 6:7). How can I extract it while skipping the <sup> tag?

I tried "//p/text()", but I only get the part before the <sup> tag: (4:6, 6

My HTML:

<p ><span >Final result </span><strong>0:2</strong> (4:6, 6<sup>5</sup>:7)</p>

CodePudding user response:

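The string you want is the only text that sits directly inside <p>; everything else is inside a child tag. Because <sup> splits it into two text nodes, " (4:6, 6" and ":7)", select them all with getall() and join them.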

scrapy shell file:///path/to/file.html

In [1]: ''.join(response.xpath('//p[@]/text()').getall())
Out[1]: ' (4:6, 6:7)'
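
The same idea, keeping only the text nodes that are direct children of <p>, can be checked without Scrapy. This is a minimal sketch that swaps in the standard library's xml.etree.ElementTree for Scrapy's selectors, and it assumes the snippet is well-formed enough to parse as XML. The helper name direct_text is hypothetical.

```python
import xml.etree.ElementTree as ET

# Markup from the question, parsed here as XML (an assumption that holds
# for this snippet but not for arbitrary HTML).
HTML = '<p ><span >Final result </span><strong>0:2</strong> (4:6, 6<sup>5</sup>:7)</p>'


def direct_text(element):
    """Join the text nodes that are direct children of `element`."""
    # element.text is the text before the first child; each child's .tail
    # is the text that follows that child inside the parent.
    parts = [element.text or ""]
    parts.extend(child.tail or "" for child in element)
    return "".join(parts)


p = ET.fromstring(HTML)
result = direct_text(p)
print(repr(result))  # ' (4:6, 6:7)'
```

The text inside <span>, <strong> and <sup> is skipped because it belongs to those child elements, which is what //p/text() does in XPath as well.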

CodePudding user response:

Try:

response.xpath('//*[@]//following-sibling::sup/./..//text()').getall()