How can I select the second a tag inside this div? Everything I try selects the first a among the div's children. How can I select the second one so that I can get the email address?
<div >
Address:
<div style="padding-left: 1em">
Box 460
<br />
<a href="/canada/Clinton-Village.html"
>100 Mile House, British Columbia V0K 2E0</a
>
</div>
<br /><b>Enrollment:</b> 310<br />
<b>Grade span:</b> K-7<br />
<br /><b>School Type:</b> Standard School<br />
<b>School Category:</b> Public School<br />
<br /><b>Principal:</b> Mrs Donna Rodger<br />
<b>Phone (verify before using):</b> (250) 395-2258<br />
<b>Fax (verify before using):</b> (250) 395-3621<br />
<b>E-mail:</b>
<a href="mailto:[email protected]">[email protected]</a>
<br />
</div>
I tried using XPath:
emailElement = email_driver.find_element(By.XPATH, '//*[@id="main_body"]/div[3]/div[1]/div[1]/div[1]/div[1]')
result_email = emailElement.find_element(By.TAG_NAME, "a")
print(result_email.text)
Output:
100 Mile House, British Columbia V0K 2E0
It keeps returning the first a tag, but I want to select the second one.
Expected output:
I want to get this element:
<a href="mailto:[email protected]">[email protected]</a>
CodePudding user response:
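Try a CSS selector or an XPath instead of the tag name.
The locators below use the Java binding names; in Python the equivalents are By.CSS_SELECTOR and By.XPATH.
cssSelector : By.cssSelector("a[href*='mailto:']")
or
xpath : By.xpath("//div[@class='col-md-4']/a[contains(@href,'mailto')]")
The XPath assumes the enclosing div has the class col-md-4, which is not visible in the HTML you posted.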
CodePudding user response:
Instead of
emailElement = email_driver.find_element(By.XPATH, '//*[@id="main_body"]/div[3]/div[1]/div[1]/div[1]/div[1]')
result_email = emailElement.find_element(By.TAG_NAME, "a")
print(result_email.text)
Try this:
emailElement = email_driver.find_element(By.XPATH, '//*[@id="main_body"]/div[3]/div[1]/div[1]/div[1]/div[1]')
result_email = emailElement.find_element(By.XPATH, ".//a[contains(@href,'mailto')]")
print(result_email.text)
You should also improve the '//*[@id="main_body"]/div[3]/div[1]/div[1]/div[1]/div[1]'
XPath expression, since a long positional path like this breaks as soon as the page layout changes. I can't suggest a better one because you didn't share the surrounding HTML.
You may also need WebDriverWait with expected conditions
to wait until the element is present or visible before reading it.
CodePudding user response:
There are several ways to locate the element.
Option 1: Find the b tag whose text is E-mail: and take its first following sibling anchor tag
print(email_driver.find_element(By.XPATH, "//b[text()='E-mail:']/following-sibling::a[1]").text)
Option 2: Find the b tag whose text is E-mail: and take the first anchor tag that follows it in the document
print(email_driver.find_element(By.XPATH, "//b[text()='E-mail:']/following::a[1]").text)
Option 3: Find the anchor tag whose href starts with mailto, using starts-with()
print(email_driver.find_element(By.XPATH, "//a[starts-with(@href,'mailto')]").text)
Option 4: Find the anchor tag whose href starts with mailto, using ^= in a CSS selector
print(email_driver.find_element(By.CSS_SELECTOR, "a[href^='mailto']").text)
print(email_driver.find_element(By.CSS_SELECTOR, "a[href^='mailto']").text)
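The reason the original attempt printed the address is that find_element returns the first match in document order, and the address link comes before the e-mail link. If you want to check that logic without starting a browser, here is a small sketch that uses Python's built-in html.parser instead of Selenium. The markup is a trimmed copy of the question's HTML, principal@example.com is a placeholder I made up because the real address is hidden in the post, and AnchorCollector and collect_anchors are names invented for this example.

```python
from html.parser import HTMLParser

# Trimmed copy of the markup from the question. The address
# "principal@example.com" is a placeholder, not the real one.
SAMPLE_HTML = """
<div>
  Address:
  <div style="padding-left: 1em">
    Box 460<br />
    <a href="/canada/Clinton-Village.html">100 Mile House, British Columbia V0K 2E0</a>
  </div>
  <b>E-mail:</b>
  <a href="mailto:principal@example.com">principal@example.com</a>
  <br />
</div>
"""


class AnchorCollector(HTMLParser):
    """Collect (href, text) for every <a> element, in document order."""

    def __init__(self):
        super().__init__()
        self.anchors = []   # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_anchors(html):
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.anchors


anchors = collect_anchors(SAMPLE_HTML)
# What "first <a> in the div" gives you: the address link.
first_text = anchors[0][1]
# Filtering on the href, like a[href^='mailto'], gives the e-mail link.
email_text = next(text for href, text in anchors if href.startswith("mailto:"))
print(first_text)
print(email_text)
```

Selecting by the href rather than by position also keeps working if more links are added above the e-mail line.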