Using import.io, given the following snippet, after successfully extracting the name and time columns, how might one extract the nearest preceding .heading element as a third column using XPath?
...
<div>
<div class="heading">HBO</div>
</div>
<div>
<div class="name">Silicon Valley</div>
<div class="time">9pm</div>
</div>
<div>
<div class="name">The Wire</div>
<div class="time">10pm</div>
</div>
...
<hr>
<div>
<div class="heading">ABC</div>
</div>
<div>
<div class="name">Lost</div>
<div class="time">9pm</div>
</div>
<div>
<div class="name">Heroes</div>
<div class="time">10pm</div>
</div>
...
<hr>
...
CodePudding user response:
You want the nearest element with a class of "heading" that comes before the matched data.
The nearest preceding element relative to a given element can be found with the preceding axis in XPath. Suppose we have the expression div/div[@class='name'][. = 'Heroes'], which selects the last name in your example. The nearest preceding heading would then be:
./preceding::div[@class = 'heading'][1]
where . is either a genuine context node, in which case you can remove the leading ./, or a placeholder that should be replaced with the rest of the expression that you already have.
Since preceding is a reverse axis, positions are counted backwards from the current node, so [1] picks the nearest match. Note that the preceding axis does not select ancestors or the current node itself.
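
To check the expression outside import.io, here is a minimal sketch in Python using the third-party lxml library. It assumes the inner elements carry the classes heading, name and time, as the question implies. The <root> wrapper, the self-closing <hr/> and the variable names are made up for illustration, and import.io itself may evaluate XPath somewhat differently.

```python
from lxml import etree  # third-party package (pip install lxml)

# Sample markup modelled on the question. The <root> wrapper and the
# class attribute values are assumptions made for this illustration.
SAMPLE = """
<root>
  <div><div class="heading">HBO</div></div>
  <div><div class="name">Silicon Valley</div><div class="time">9pm</div></div>
  <div><div class="name">The Wire</div><div class="time">10pm</div></div>
  <hr/>
  <div><div class="heading">ABC</div></div>
  <div><div class="name">Lost</div><div class="time">9pm</div></div>
  <div><div class="name">Heroes</div><div class="time">10pm</div></div>
  <hr/>
</root>
""".strip()

root = etree.fromstring(SAMPLE)

rows = []
for name in root.xpath("//div[@class='name']"):
    # The time is the next sibling with class "time" in the same row.
    time = name.xpath("string(following-sibling::div[@class='time'][1])")
    # preceding is a reverse axis, so [1] is the closest heading before `name`.
    heading = name.xpath("string(preceding::div[@class='heading'][1])")
    rows.append((name.text, time, heading))

for row in rows:
    print(row)

# Pitfall: wrapping the step in parentheses turns it into a plain node-set,
# which is indexed in document order, so [1] becomes the FIRST heading.
heroes = root.xpath("//div[@class='name'][. = 'Heroes']")[0]
first_heading = heroes.xpath("string((preceding::div[@class='heading'])[1])")
```

The loop evaluates the relative expression once per name element, which corresponds to the "genuine context node" case described above. The last two lines show why the predicate should stay attached to the axis step rather than to a parenthesized expression.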