Find specific link on a website with Selenium (in Python) - CodePudding

I'm trying to scrape specific links on a website. I'm using Python and Selenium 4.8. The HTML looks like this, with multiple list items, each containing a link:

<li>
  <div>
    <div>
      <h4>
        <a href="https://www.example_link1.com">
        </a>
      </h4>
    </div>
  </div>
</li>

<li>...</li>
<li>...</li>

So each <li> contains a link.

Ideally, I would like a Python list with all the hrefs, which I can then iterate through to get additional output.

Thank you for your help!

CodePudding user response:

You can try something like the following (untested, as you didn't provide the URL):

[...]
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
[...]

wait = WebDriverWait(driver, 25)
[...]
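# Wait until at least one matching link is present, then read each href.
# Add attribute predicates (for example on class) to the XPath to narrow the match.
links = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//li//h4/a[@href]')))
wanted_elements = [x.get_attribute('href') for x in links]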

See the Selenium documentation for details.

CodePudding user response:

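from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://www.example.com")
# find_elements_by_xpath was removed in Selenium 4.3; use find_elements(By.XPATH, ...)
links = driver.find_elements(By.XPATH, '//li//a[@href]')
# Read the href of each matched <a> element
hrefs = [link.get_attribute('href') for link in links]
driver.quit()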

This gives you a list, hrefs, containing the href of every link found inside an <li> element on the page. You can then iterate through this list and use the links for further processing.
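As a rough sketch of that further processing, the snippet below first cleans the collected list and then visits each link. The helper names clean_hrefs and visit_each are hypothetical, and recording the page title is only an assumed stand-in for the "additional output" the question mentions. The sketch relies on two things: get_attribute('href') returns None when an element has no href, and the hrefs are plain strings, so they remain usable after the driver navigates away, whereas the element objects themselves would go stale.

```python
from typing import Dict, Iterable, List, Optional


def clean_hrefs(raw: Iterable[Optional[str]]) -> List[str]:
    """Drop missing or empty values and duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for href in raw:
        if not href or href in seen:
            continue
        seen.add(href)
        cleaned.append(href)
    return cleaned


def visit_each(driver, hrefs: Iterable[str]) -> Dict[str, str]:
    """Open every URL in turn and record its page title.

    `driver` is expected to be a Selenium WebDriver; only driver.get()
    and driver.title are used. Collecting the title is an assumption
    made for illustration, replace it with whatever output you need.
    """
    titles = {}
    for url in hrefs:
        driver.get(url)
        titles[url] = driver.title
    return titles
```

Call it as titles = visit_each(driver, clean_hrefs(hrefs)), and do so before driver.quit(), since the driver cannot load pages once the session is closed.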