I'm struggling to scrape a few pages. The problem occurs when the page structure contains a lot of nested divs. Here is the page source:
<div>
<section role="tab" id="ui-id-1" aria-controls="ui-id-2" aria-selected="false" aria-expanded="false" tabindex="0"><span ></span>
<div >
<div >Me <span >NAME </span></div>
<div >Avocat postulant au Tribunal Judiciaire</div>
</div>
<div >Plus d'informations</div>
</section>
<div style="display: none;" id="ui-id-2" aria-labelledby="ui-id-1" role="tabpanel" aria-hidden="true">
<div >
<div >
<div >
<span>Structure :</span>
<div>
<p>Cabinet individuel NAME</p>
</div>
</div>
</div>
<div >
<div >
<span>Adresse :</span>
<div>
<p>21 rue Belle Isle 57000 VILLE</p>
</div>
</div>
</div>
<div >
<div >
<span>Mail :</span>
<div>
<p>[email protected]</p>
</div>
</div>
</div>
<div >
<div >
<span>Tél :</span>
<div>
<p>Telnum</p>
</div>
</div>
</div>
<div >
<div >
<span>Fax :</span>
<div>
<p> </p>
</div>
</div>
</div>
<div > <a href="mailto:[email protected]">Contacter</a> </div>
</div>
</div>
</div>
And here is my Python code:
divtel = self.driver.find_elements(by=By.XPATH,
value=f'//div[@]/div/p')#div[@]')
for p in divtel:
    print(p.text)
It doesn't print anything. On other, similar pages it prints the text, but in this case it doesn't, although there is text in the nested span and div/p elements. Do you know why?
How can I resolve this problem? Thank you.
CodePudding user response:
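The .text property returns text only when the web element containing it is visible on the page. In your page the panel with id="ui-id-2" has style="display: none;", so everything inside it is hidden and .text comes back empty. For a hidden element you have to use .get_attribute('innerText'), .get_attribute('textContent') or .get_attribute('innerHTML') instead (textContent gives the raw text of the element and its descendants, while innerHTML also includes the markup). For example, change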
print(p.text)
to
print(p.get_attribute('innerText'))
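If the same scraper runs against pages where the panel is sometimes visible and sometimes hidden, a small helper can try .text first and fall back to textContent. This is only a sketch: element_text and FakeElement are hypothetical names that are not part of Selenium, and FakeElement exists only so the snippet runs without a browser. With a real driver you would pass in the elements returned by find_elements.

```python
def element_text(element):
    """Return the element's text even when the element is hidden."""
    text = element.text  # empty string for elements that are not displayed
    if text.strip():
        return text.strip()
    # textContent is read straight from the DOM, so visibility does not matter.
    return (element.get_attribute("textContent") or "").strip()


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (assumption, demo only)."""

    def __init__(self, text, text_content):
        self.text = text
        self._text_content = text_content

    def get_attribute(self, name):
        # Only textContent is modelled here; anything else returns None.
        return self._text_content if name == "textContent" else None


# Simulates a <p> inside a display: none panel: .text is empty, textContent is not.
hidden = FakeElement(text="", text_content="  Telnum  ")
print(element_text(hidden))
```

In your loop this would become print(element_text(p)) in place of print(p.text).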