I want to get all the text inside a div with XPath.
Here is the HTML code:
<div >
<div >
<div >
<div >
<div contenteditable="false" spellcheck="false" style="outline:none;user-select:text;-webkit-user-select:text;white-space:pre-wrap;word-wrap:break-word">
<div data-contents="true">
#Here is all the text
<div data-block="true" data-editor="d54la" data-offset-key="bhkoa-0-0">
<div data-offset-key="bhkoa-0-0" >
<span data-offset-key="bhkoa-0-0" style="font-weight:bold">
<span data-text="true">Job Description:</span>
</span>
</div>
</div>
<div data-block="true" data-editor="d54la" data-offset-key="51e5u-0-0">
<div data-offset-key="51e5u-0-0" >
<span data-offset-key="51e5u-0-0">
<span data-text="true">· Identify & developed application base on predefined business requirements.</span>
</span>
</div>
</div>
...
#there's more, I'm just showing you a few
</div>
</div>
</div>
</div>
</div>
</div>
This is my XPath code:
dom_job.xpath('//*[@]//text()')
I need all the text inside the parent div using XPath. Is that possible?
CodePudding user response:
I'm assuming the Python module which provides your XPath interpreter supports XPath version 1. Your XPath expression below returns the set of all text nodes which are descendants of the div element:
//*[@]//text()
You should be able to iterate over that collection of text nodes in Python and concatenate them into a single string.
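A minimal sketch of that concatenation step, assuming the query returns a list of plain strings (as lxml's xpath() method does for text() steps; other modules may return node objects instead). The helper name join_text_nodes and the sample list are hypothetical, standing in for the real query result:

```python
def join_text_nodes(nodes, sep="\n"):
    """Join text-node strings, skipping whitespace-only nodes."""
    # Whitespace-only nodes typically come from indentation between tags.
    cleaned = (node.strip() for node in nodes)
    return sep.join(text for text in cleaned if text)


# Hypothetical stand-in for the list returned by the //text() query.
text_nodes = [
    "\n",
    "Job Description:",
    "\n",
    "· Identify & developed application base on predefined business requirements.",
    "\n",
]

job_text = join_text_nodes(text_nodes)
print(job_text)
```

Stripping each node before joining is a design choice: it drops the layout whitespace of the page, but it would also remove leading or trailing spaces that are meaningful inside a sentence split across several spans.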
But if you want the concatenated value of the text nodes within a particular div, it's simpler to just apply the XPath string() function to the div, e.g.:
string(//*[@])
See https://www.w3.org/TR/1999/REC-xpath-19991116/#function-string
Note that, in XPath 1, if you apply the string() function to a larger set of nodes (such as the set of text nodes returned by your first query), the function will return the string value of just the first node in document order.
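The difference can be seen in a small sketch, assuming the third-party lxml package is the XPath 1 interpreter in use. The markup is a hypothetical, shortened version of the question's HTML, and selecting on the data-contents attribute is only an illustrative choice, not necessarily the predicate used in the original query:

```python
from lxml import html  # third-party package, assumed to be installed

# Hypothetical, shortened markup modelled on the question's HTML.
SAMPLE = """
<div data-contents="true">
  <div><span data-text="true">Job Description:</span></div>
  <div><span data-text="true">Identify requirements.</span></div>
</div>
"""

doc = html.fromstring(SAMPLE)

# string() on a single element: concatenated text of all its descendants
# (including the whitespace between the tags).
whole = doc.xpath('string(//div[@data-contents="true"])')

# string() on a set of text nodes: only the first node is used.
first_only = doc.xpath('string(//span[@data-text="true"]/text())')

# Joining the text nodes in Python keeps every node.
joined = " ".join(doc.xpath('//span[@data-text="true"]/text()'))

print(repr(whole))
print(repr(first_only))
print(repr(joined))
```

Here first_only holds only the text of the first span, while whole and joined both contain the text of every span.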